
【Kaggle Method】ISIC 2024 4th-place solution explained

Published on 2024/09/19

In this article, I explain the 4th-place solution of the ISIC 2024 Kaggle competition.

Original Solution:
https://www.kaggle.com/competitions/isic-2024-challenge/discussion/532760

0. Highlights

Chapter 2.2: The image models predict 5 classes, on the idea that more granular classes create better feature representations. (Even though only the target column is used for the submission, I think the 5-class setup gives the model a better ability to extract and represent features.)
Chapter 2.3: He trained segmentation models to generate masks for the existing datasets, which gave him a new dataset for multi-task training.

1. Overview

He built tabular models and image models. For the final submission, he ensembled two tabular models at a 2:8 ratio: one that uses only metadata, and one that also uses the image models' predictions as input features.

2. Image models

2.1 Dataset

He used 5 datasets.

・ISIC 2024 from this challenge
・ISIC 2020: https://challenge.isic-archive.com/data/#2020
・ISIC 2019: https://challenge.isic-archive.com/data/#2019
・ISIC 2018 for segmentation task: https://challenge.isic-archive.com/data/#2018
・PAD UFES 20: https://data.mendeley.com/datasets/zr7vgbcyr2/1

2.2 Multi-label classification

He trained the image models on 5 class targets (target, MEL, BCC, SCC, NV) because he felt that more granular classes would create better feature representations.

To map the ISIC 2019, 2020, 2024, and PAD UFES labels to the above 5 classes, he used the following rules.
"target" is set to 1 if any of the classes MEL, BCC, or SCC is 1.

2019 MEL, BCC, SCC, NV -> MEL, BCC, SCC, NV
2020 melanoma -> MEL
2020 nevus -> NV
PAD UFES 20 MEL, BCC, SCC, NEV -> MEL, BCC, SCC, NV
2024 Basal cell carcinoma in iddx_full metadata -> BCC
2024 Melanoma in iddx_full metadata -> MEL
2024 Squamous cell carcinoma in iddx_full metadata -> SCC
2024 Nevus in iddx_full metadata -> NV
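The mapping rules above can be sketched in a few lines of Python. This is only an illustration, not the author's code: the dataset keys (`isic2019`, `isic2020`, `pad_ufes_20`, `isic2024`), the function names, and the substring search on `iddx_full` are assumptions made for this example.

```python
from typing import Dict, Optional

CLASSES = ("target", "MEL", "BCC", "SCC", "NV")
MALIGNANT = ("MEL", "BCC", "SCC")

# Per-dataset label names taken from the mapping list above.
# The dictionary keys are hypothetical dataset identifiers.
SOURCE_TO_CLASS = {
    "isic2019": {"MEL": "MEL", "BCC": "BCC", "SCC": "SCC", "NV": "NV"},
    "isic2020": {"melanoma": "MEL", "nevus": "NV"},
    "pad_ufes_20": {"MEL": "MEL", "BCC": "BCC", "SCC": "SCC", "NEV": "NV"},
}

# Assumption: ISIC 2024 diagnoses are found by a case-insensitive
# substring search in the `iddx_full` metadata column.
ISIC2024_KEYWORDS = (
    ("basal cell carcinoma", "BCC"),
    ("squamous cell carcinoma", "SCC"),
    ("melanoma", "MEL"),
    ("nevus", "NV"),
)


def to_class(dataset: str, raw_label: str) -> Optional[str]:
    """Map a dataset-specific label to MEL/BCC/SCC/NV, or None if unmapped."""
    if dataset == "isic2024":
        text = raw_label.lower()
        for keyword, cls in ISIC2024_KEYWORDS:
            if keyword in text:
                return cls
        return None
    return SOURCE_TO_CLASS[dataset].get(raw_label)


def make_targets(cls: Optional[str]) -> Dict[str, int]:
    """Build the 5-class target vector; `target` is 1 for MEL, BCC, or SCC."""
    targets = {name: 0 for name in CLASSES}
    if cls is not None:
        targets[cls] = 1
    targets["target"] = int(any(targets[c] for c in MALIGNANT))
    return targets
```

For example, `make_targets(to_class("isic2020", "melanoma"))` sets both `MEL` and `target` to 1, while a label that maps to none of the four classes yields an all-zero vector.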

2.3 Models

He used ten models of three types.

  1. Four multi-label classification models (5 classes) trained with ISIC 2024+2020+2019 + PAD UFES, validated with ISIC 2024
    ・swin_tiny image size = 224: OOF score = 0.1609
    ・convnextv2_base image size = 128: OOF score = 0.1641
    ・convnextv2_large image size = 64: OOF score = 0.1642
    ・coatnet_rmlp_1 image size = 224: OOF score = 0.161

  2. Three multi-task segmentation + classification models trained with ISIC 2024+2020+2019 + PAD UFES, validated with ISIC 2024.
    For the submission, he used only the predictions from the classification task.
    ・efficientnet-b3 Unet image size = 224: OOF score = 0.1638
    ・mit-b0 FPN image size = 384: OOF score = 0.1671
    ・mit-b5 FPN image size = 224: OOF score = 0.1656
    Preparation:
    To create masks for the segmentation task, he trained 3 models on ISIC 2018 data and used them to predict masks for ISIC 2024+2020+2019 + PAD UFES.
    ・efficientnet-b7 Unet++ image size = 256: IoU = 0.829
    ・efficientnet-b5 Unet++ image size = 512: IoU = 0.827
    ・mit-b5 FPN image size = 512: IoU = 0.843

  3. Three multi-label classification models (5 classes) trained only with ISIC 2024 data
    ・vit_tiny image size = 384: OOF score = 0.1688
    ・swin_tiny image size = 256: OOF score = 0.1655
    ・convnextv2_tiny image size = 288: OOF score = 0.1645

Ensemble of the 10 models: OOF score = 0.17458
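The mask-generating segmentation models in the list above are scored with IoU (intersection over union). As a reminder of what that number measures, here is a minimal pure-Python sketch for a single pair of binary masks. The function name is hypothetical, and how the reported scores were averaged over images is not stated in the solution, so this is not a reproduction of his evaluation.

```python
from typing import Sequence


def binary_iou(pred: Sequence[Sequence[int]], true: Sequence[Sequence[int]]) -> float:
    """IoU of two binary masks given as nested sequences of 0/1 values."""
    intersection = 0
    union = 0
    for pred_row, true_row in zip(pred, true):
        for p, t in zip(pred_row, true_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Convention chosen for this sketch: two empty masks count as a perfect match.
    return intersection / union if union else 1.0
```

A predicted mask that covers the whole lesion plus an equally large area of healthy skin would score 0.5, so values around 0.83 to 0.84 indicate fairly tight masks.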

Secondary Prizes

This competition also has secondary prizes. All notebooks submitted as CPU-only solutions are automatically considered.
・Top-15 Retrieval Sensitivity (winner receives $7,500)
・Model Efficiency (winner receives $7,500)

For the Top-15 retrieval sensitivity prize, he used two models:
・1 multi-label classification model (5 classes) trained with ISIC 2024+2020+2019 + PAD UFES, validated with ISIC 2024
    vit_tiny image size = 224: OOF score = 0.16040
・1 multi-task segmentation + classification model trained with ISIC 2024+2020+2019 + PAD UFES
    mit-b0 FPN image size = 224: OOF score = 0.1660

2.4 Image augmentations

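import albumentations as albu
from albumentations.pytorch import ToTensorV2

# Excerpt from inside a dataset class: `self.image_size` is that class's image size attribute.
transform_train = albu.Compose([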
    albu.Resize(self.image_size, self.image_size),
    albu.ImageCompression(quality_lower=80, quality_upper=100, p=0.25),
    albu.ShiftScaleRotate(shift_limit=0.1, scale_limit=0.1, rotate_limit=15, border_mode=0, p=0.5),
    albu.Flip(p=0.5),
    albu.RandomRotate90(p=0.5),
    albu.OneOf([
        albu.MotionBlur(blur_limit=5),
        albu.MedianBlur(blur_limit=5),
        albu.GaussianBlur(blur_limit=5),
        albu.GaussNoise(var_limit=(5.0, 30.0)),
    ], p=0.5),
    albu.RandomBrightnessContrast(p=0.5),
    albu.CoarseDropout(num_holes_range=(1,1), hole_height_range=(8, 32), hole_width_range=(8, 32), p=0.25),
    albu.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
    ToTensorV2(),
])
transform_val = albu.Compose([
    albu.Resize(self.image_size, self.image_size),
    albu.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
    ToTensorV2(),
])

2.5 Exponential Moving Average

For submission, he used an EMA (exponential moving average) checkpoint at epoch 8->15, trained on the full data with no validation split, and no TTA (test-time augmentation).
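For readers who have not used EMA weights before, the idea is to keep a smoothed copy of the model parameters and save that copy as the checkpoint. Below is a framework-agnostic sketch using plain floats. The class name and the decay value are assumptions for illustration; the solution write-up does not state the decay he used, and real training code would apply the same update to tensors.

```python
from typing import Dict


class ParameterEMA:
    """Keeps an exponential moving average ("shadow" copy) of model parameters."""

    def __init__(self, params: Dict[str, float], decay: float = 0.999) -> None:
        self.decay = decay  # assumed value, not taken from the solution
        self.shadow = dict(params)  # start from the current parameters

    def update(self, params: Dict[str, float]) -> None:
        """Call after every optimizer step with the latest parameters."""
        for name, value in params.items():
            self.shadow[name] = self.decay * self.shadow[name] + (1.0 - self.decay) * value
```

At inference time the `shadow` values are loaded instead of the raw weights. Because they average over many recent steps, they tend to be less noisy than the last-step weights, which is useful when training on the full data without a validation split to pick the best epoch.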

3. Tabular models

He used 3 models: LightGBM, CatBoost, XGBoost. He combined 10 features from the image pipeline and all the features from the amazing tabular notebook.

・Params

lgbm_params = {
    'objective':        'binary',
    'verbosity':        -1,
    'n_estimators':     300,
    'early_stopping_rounds': 50,
    'metric': 'custom',
    'boosting_type':    'gbdt',
    'lambda_l1':        0.08758718919397321, 
    'lambda_l2':        0.0039689175176025465, 
    'learning_rate':    0.03231007103195577, 
    'max_depth':        4, 
    'num_leaves':       128, 
    'colsample_bytree': 0.8329551585827726, 
    'colsample_bynode': 0.4025961355653304, 
    'bagging_fraction': 0.7738954452473223, 
    'bagging_freq':     4, 
    'min_data_in_leaf': 85, 
    'scale_pos_weight': 2.7984184778875543,
    "device": "gpu"
}
cb_params = {
    'loss_function':     'Logloss',
    'iterations':        300,
    'early_stopping_rounds': 50,
    'verbose':           False,
    'max_depth':         7, 
    'learning_rate':     0.06936242010150652, 
    'scale_pos_weight':  2.6149345838209532, 
    'l2_leaf_reg':       6.216113851699493,
    'min_data_in_leaf':  24,
    'cat_features':      cat_cols,
    "task_type": "CPU",
}
xgb_params = {
    'enable_categorical':       True,
    'tree_method':              'hist',
    'disable_default_eval_metric': 1,
    'n_estimators':             300,
    'early_stopping_rounds':    50,
    'learning_rate':            0.08501257473292347, 
    'lambda':                   8.879624125465703, 
    'alpha':                    0.6779926606782505, 
    'max_depth':                6, 
    'subsample':                0.6012681388711075, 
    'colsample_bytree':         0.8437772277074493, 
    'colsample_bylevel':        0.5476090898823716, 
    'colsample_bynode':         0.9928601203635129, 
    'scale_pos_weight':         3.29440313334688,
    "device":                   "cuda",
}

・Score

| Features | LGB pAUC | CB pAUC | XGB pAUC | Gmean pAUC |
|---|---|---|---|---|
| Only meta feat | 0.17806 | 0.17498 | 0.17954 | 0.17879 |
| 10model feat | 0.18650 | 0.18669 | 0.18647 | 0.18703 |
| 2model feat | 0.18406 | 0.18378 | 0.18328 | 0.18438 |
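The scores here are pAUC values: the competition metric is the partial area under the ROC curve above 80% TPR, so a perfect model scores 0.2. The following pure-Python sketch computes that area directly from its definition. It is meant to show what the number represents and is not the official scoring code; the function names are made up for this example.

```python
from typing import List, Sequence, Tuple


def roc_points(labels: Sequence[int], scores: Sequence[float]) -> List[Tuple[float, float]]:
    """Return ROC points (FPR, TPR), sweeping the threshold from high to low."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative sample")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(order):
        j = i
        # Consume all samples sharing this score so ties form a single segment.
        while j < len(order) and scores[order[j]] == scores[order[i]]:
            if labels[order[j]] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / n_neg, tp / n_pos))
        i = j
    return points


def pauc_above_tpr(labels: Sequence[int], scores: Sequence[float], min_tpr: float = 0.8) -> float:
    """Area between the ROC curve and the horizontal line TPR = min_tpr."""
    points = roc_points(labels, scores)
    area = 0.0
    for (f0, t0), (f1, t1) in zip(points, points[1:]):
        if f1 == f0 or t1 <= min_tpr:
            continue  # vertical segment, or segment entirely below the line
        if t0 >= min_tpr:
            # Whole segment is above the line: trapezoid.
            area += (f1 - f0) * ((t0 - min_tpr) + (t1 - min_tpr)) / 2.0
        else:
            # Segment crosses the line: only the triangle after the crossing counts.
            f_cross = f0 + (min_tpr - t0) / (t1 - t0) * (f1 - f0)
            area += (f1 - f_cross) * (t1 - min_tpr) / 2.0
    return area
```

Read against this scale, the 10-model feature set's Gmean pAUC of 0.18703 sits close to the 0.2 ceiling.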

For each model, he tried 40 different seed combinations and blended the top 5 checkpoints with the highest CV score for submission.

4. Final submission

・Final submission

| Submission | Public LB | Private LB | Prize |
|---|---|---|---|
| Sub1-GPU: 0.2(meta only) + 0.8(10model feat) | 0.18229 | 0.17225 | 4th place leaderboard prize |
| Sub2-CPU: 0.2(meta only) + 0.8(2model feat) | 0.18094 | 0.17011 | Top-15 retrieval sensitivity prize |
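The two blending steps can be sketched as follows. This assumes that "Gmean" means the geometric mean of the LightGBM, CatBoost, and XGBoost predictions, and that the 0.2/0.8 weights are applied linearly to predicted probabilities; the write-up does not spell out either detail, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def geometric_mean(preds_per_model: Sequence[Sequence[float]]) -> List[float]:
    """Per-sample geometric mean across models (one inner sequence per model)."""
    n_models = len(preds_per_model)
    return [math.prod(sample) ** (1.0 / n_models) for sample in zip(*preds_per_model)]


def weighted_blend(
    meta_only: Sequence[float],
    with_image_feats: Sequence[float],
    w_meta: float = 0.2,
    w_image: float = 0.8,
) -> List[float]:
    """Linear blend of the metadata-only ensemble and the image-feature ensemble."""
    return [w_meta * m + w_image * i for m, i in zip(meta_only, with_image_feats)]
```

In this reading, each of the two tabular ensembles is first reduced to one prediction per sample with `geometric_mean`, and `weighted_blend` then combines them into the submitted score.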

That concludes his summary. He also had a submission that reached private LB = 0.1732 (a 1st-place score) by changing the selection of image models, but it did not have his best CV score.

5. Code

https://github.com/dungnb1333/ISIC-2024

6. Summary

In this article, I explained the 4th-place solution of ISIC 2024 on Kaggle, a strong solution built around image models.
