【Kaggle】ISIC2020 11th Solution Explained
This time, I'll explain the 11th-place solution of the ISIC 2020 competition.
There is a lot to learn from this solution, so let's take a look!
0. Summary
・Ensemble of 58 models
・Segmentation used to create focused (cropped) training images
・Malignant upsampling
・Pretraining with a surrogate label
・10-20 rounds of TTA
1. Data
Validation:
・5-fold cross-validation
・2020 data only for validation (a minimal split sketch follows)
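The write-up doesn't spell out the split itself, but a minimal sketch of this setup, assuming a plain stratified 5-fold on the 2020 metadata (the file name `train_2020.csv` and the `target` column are placeholders), could look like this:

```python
import pandas as pd
from sklearn.model_selection import StratifiedKFold

# Hypothetical file/column names for the 2020 competition metadata.
df_2020 = pd.read_csv("train_2020.csv")

# Assumption: a plain stratified 5-fold on the malignant/benign target.
# (Grouping by patient_id is another common choice; the write-up does not say.)
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
df_2020["fold"] = -1
for fold, (_, val_idx) in enumerate(skf.split(df_2020, df_2020["target"])):
    df_2020.loc[val_idx, "fold"] = fold

# Older data (2017/18/19) is only ever added to the training side,
# so validation scores always come from 2020 images.
```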
Training:
・different combinations of 2017/18/19/20 data for training
・applied image segmentation to detect and crop the lesions and built a cropped data set; several models in the ensemble were trained on the cropped images
・the malignant images in the training folds were upsampled for some of the models (both steps are sketched after this list)
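A minimal sketch of the two ideas above: the lesion mask is assumed to come from some segmentation model (the write-up doesn't name one), and the upsampling factor is an arbitrary placeholder.

```python
import numpy as np
import pandas as pd

def crop_to_lesion(image: np.ndarray, mask: np.ndarray, margin: int = 32) -> np.ndarray:
    """Crop an image to the bounding box of a binary lesion mask (plus a margin).
    The mask is assumed to come from some lesion-segmentation model."""
    ys, xs = np.where(mask > 0)
    if len(ys) == 0:  # no lesion detected -> keep the full image
        return image
    y0, y1 = max(ys.min() - margin, 0), min(ys.max() + margin, image.shape[0])
    x0, x1 = max(xs.min() - margin, 0), min(xs.max() + margin, image.shape[1])
    return image[y0:y1, x0:x1]

def upsample_malignant(train_fold: pd.DataFrame, extra_copies: int = 4) -> pd.DataFrame:
    """Repeat the malignant rows of a training fold a few extra times
    (the exact factor is not given in the write-up)."""
    malignant = train_fold[train_fold["target"] == 1]
    return pd.concat([train_fold] + [malignant] * extra_copies, ignore_index=True)
```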
2. Preprocessing
A wide range of augmentations was used across the different models:
・horizontal/vertical flips
・rotation
・circular crop (a.k.a. microscope augmentation; sketched after this list)
・dropout
・zoom/brightness adjustment
・color normalization
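The write-up doesn't name the augmentation library, so here is just a sketch of such a pipeline using albumentations, plus a hand-rolled circular (microscope) crop:

```python
import random
import cv2
import numpy as np
import albumentations as A

def microscope_augment(image: np.ndarray, p: float = 0.5) -> np.ndarray:
    """Circular crop: black out the corners so the image looks like it was
    taken through a dermatoscope eyepiece."""
    if random.random() > p:
        return image
    h, w = image.shape[:2]
    mask = np.zeros((h, w), dtype=np.uint8)
    radius = int(min(h, w) * random.uniform(0.45, 0.55))
    cv2.circle(mask, (w // 2, h // 2), radius, 255, -1)
    return cv2.bitwise_and(image, image, mask=mask)

# Assumption: albumentations is used here only for illustration.
train_transforms = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.VerticalFlip(p=0.5),
    A.ShiftScaleRotate(shift_limit=0.1, scale_limit=0.2, rotate_limit=180, p=0.7),  # rotation + zoom
    A.RandomBrightnessContrast(p=0.5),                                              # brightness adjustment
    A.CoarseDropout(max_holes=8, max_height=32, max_width=32, p=0.3),               # dropout
    A.Normalize(),                                                                   # color normalization
])
```

The circular crop would be applied before the rest of the pipeline, e.g. `train_transforms(image=microscope_augment(img))["image"]`.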
At the TTA stage, they used the same augmentations as during training, usually with 10 to 20 augmented copies per image.
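A minimal TTA sketch in PyTorch, assuming the same random pipeline as above and a model that outputs a single malignant logit:

```python
import torch

@torch.no_grad()
def predict_with_tta(model, image, transforms, n_tta: int = 15, device: str = "cuda") -> float:
    """Average sigmoid outputs over n_tta randomly augmented copies of one
    image (numpy HWC). `transforms` is the same random pipeline used in training."""
    model.eval()
    probs = []
    for _ in range(n_tta):
        aug = transforms(image=image)["image"]
        x = torch.from_numpy(aug).permute(2, 0, 1).unsqueeze(0).float().to(device)
        probs.append(torch.sigmoid(model(x)).item())
    return sum(probs) / n_tta
```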
3. Models
・After exploring various models, the final choice was EfficientNet-B5 at 512x512.
・They also explored DenseNet and Inception architectures but observed worse performance.
・Using custom pre-trained weights instead of plain ImageNet weights improved CV: they pre-trained on train + test + external data, predicting anatom_site_general_challenge as a surrogate label (sketched below).
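A rough sketch of that surrogate-label pre-training, assuming a timm EfficientNet backbone (the exact model names and training loop are not given in the write-up):

```python
import timm
import torch.nn as nn

N_SITES = 6  # assumed number of anatom_site_general_challenge categories

# Step 1: pre-train on train + test + external images, predicting the
# anatomical site as a surrogate label (it is known for the test set too,
# so images without a malignancy label can still be used).
backbone = timm.create_model("tf_efficientnet_b5", pretrained=True, num_classes=N_SITES)
# ... train `backbone` with cross-entropy on the site labels ...

# Step 2: swap the classifier head for the malignant/benign target and
# fine-tune on the competition task, starting from the surrogate weights.
backbone.classifier = nn.Linear(backbone.classifier.in_features, 1)
# ... train with BCEWithLogitsLoss on the `target` column ...
```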
The best single model was an EfficientNet-B5 trained at 384x384 with an attention layer and meta-features, which achieved a private LB of 0.9380.
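The write-up doesn't describe the attention layer, so the sketch below only shows the meta-feature fusion idea: image features from the backbone concatenated with an MLP over the tabular meta-features (layer sizes are placeholders).

```python
import timm
import torch
import torch.nn as nn

class ImageMetaModel(nn.Module):
    """EfficientNet-B5 image features concatenated with an MLP over the tabular
    meta-features (sex, age, site, ...). Layer sizes are placeholders."""
    def __init__(self, n_meta: int, backbone_name: str = "tf_efficientnet_b5"):
        super().__init__()
        self.backbone = timm.create_model(backbone_name, pretrained=True, num_classes=0)
        self.meta_mlp = nn.Sequential(
            nn.Linear(n_meta, 128), nn.BatchNorm1d(128), nn.ReLU(),
            nn.Linear(128, 32), nn.ReLU(),
        )
        self.head = nn.Linear(self.backbone.num_features + 32, 1)

    def forward(self, image: torch.Tensor, meta: torch.Tensor) -> torch.Tensor:
        feats = torch.cat([self.backbone(image), self.meta_mlp(meta)], dim=1)
        return self.head(feats)
```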
4. Ensembling
In the end they built 91 models and filtered them by two criteria:
- Removing models where the mean correlation of their predictions with the other models showed a large gap between OOF and test predictions
- Removing models that ranked high in the adversarial validation model (a sketch follows at the end of this section)
Explanation of criterion 1:
Imagine two models whose OOF predictions correlate at 0.8, but whose test predictions correlate only at 0.5; the gap is 0.3. A large gap means the predictions behave differently between local validation and the test set, so one of the models may be overfitting the local data.
So models with a big gap were removed, since they carry a risk of overfitting.
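A small sketch of how that gap could be computed, given hypothetical DataFrames `oof_preds` and `test_preds` with one prediction column per model:

```python
import pandas as pd

def correlation_gap(oof_preds: pd.DataFrame, test_preds: pd.DataFrame) -> pd.Series:
    """One column per model. For each model, compute its mean correlation with
    all other models on OOF and on test predictions, and return the absolute gap."""
    n_others = len(oof_preds.columns) - 1
    mean_oof = (oof_preds.corr().sum() - 1.0) / n_others   # drop the self-correlation of 1.0
    mean_test = (test_preds.corr().sum() - 1.0) / n_others
    return (mean_oof - mean_test).abs().sort_values(ascending=False)

# A model with mean correlations of 0.8 (OOF) vs 0.5 (test) gets a gap of 0.3
# and ends up near the top of this ranking, i.e. a candidate for removal.
```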
This reduced the set to the 58 models that were ensembled.
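For the second criterion, the write-up only says that models ranking high in adversarial validation were dropped; one common reading is to use each model's predictions as features for a train-vs-test classifier and rank the models by feature importance, sketched here:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

def adversarial_ranking(oof_preds: pd.DataFrame, test_preds: pd.DataFrame) -> pd.Series:
    """Train a classifier to tell training rows from test rows, using each
    model's predictions as features. Models with high feature importance
    behave differently on the two sets and are candidates for removal."""
    X = pd.concat([oof_preds, test_preds], ignore_index=True)
    y = np.r_[np.zeros(len(oof_preds)), np.ones(len(test_preds))]
    clf = RandomForestClassifier(n_estimators=200, random_state=42).fit(X, y)
    return pd.Series(clf.feature_importances_, index=X.columns).sort_values(ascending=False)
```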
5. Submission
Finally, they made three submissions: the "most robust" model, the "best CV" model, and the "best LB" model. The best-CV submission achieved the best private LB score and finished 11th in the competition.