
【Library Method】Useful libraries for image competitions on Kaggle

Published on 2024/09/05

In this post, I'll introduce useful libraries and tools for image competitions on Kaggle.
It is mainly a summary of the slide deck listed under Reference [1].

1. Libraries and Tools for image tasks

1.1 timm

With timm, you can easily download and use over 500 implemented and pretrained models.

It is almost essential for image competitions. I think most people use this library together with PyTorch.

It is typically used as the backbone of 2D models.

1.2 mmdetection

MMDetection is an open-source object detection toolbox based on PyTorch. It provides a wide range of tools and models for tasks like object detection and instance segmentation. The framework is highly modular, allowing for flexible customization of components like backbones (feature extractors), necks (feature pyramids), and heads (classification and regression layers).

It is used for object detection (predicting bounding boxes for objects in images or videos) and for instance segmentation (predicting both a bounding box and a mask for each object).
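To make the backbone / neck / head modularity more concrete, here is a heavily abbreviated sketch written in the Python-dict config style that MMDetection uses. The field names follow the Faster R-CNN configs as I remember them, so please treat them as assumptions and check the official configs. The heads, losses, and training settings are omitted, and `with_backbone_depth` is a hypothetical helper, not part of MMDetection.

```python
import copy

# Abbreviated model config: only the backbone and neck are shown.
# A real config also defines rpn_head, roi_head, train_cfg, test_cfg, etc.
base_model = dict(
    type="FasterRCNN",
    backbone=dict(
        type="ResNet",
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),  # feature maps passed on to the neck
    ),
    neck=dict(
        type="FPN",
        in_channels=[256, 512, 1024, 2048],  # channels of the four backbone stages
        out_channels=256,
        num_outs=5,
    ),
)


def with_backbone_depth(model_cfg, depth):
    """Return a copy of the config with a different ResNet depth (hypothetical helper)."""
    new_cfg = copy.deepcopy(model_cfg)
    new_cfg["backbone"]["depth"] = depth
    return new_cfg


# ResNet-101 has the same stage output channels as ResNet-50,
# so the neck settings can stay unchanged.
deeper_model = with_backbone_depth(base_model, 101)
print(deeper_model["backbone"])
```

The point is that swapping a component is mostly a matter of editing one entry in the config, not rewriting model code.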

1.3 YOLOv5/8

YOLOv5 and YOLOv8 are powerful real-time object detection models. With a user-friendly interface and high performance, they are suitable for a wide range of detection tasks.
YOLOv8 is the newer of the two and improves on YOLOv5 in both efficiency and accuracy.

1.4 segmentation_models_pytorch

segmentation_models_pytorch (SMP) is a popular open-source Python library for image segmentation tasks built on top of PyTorch. It provides pre-implemented segmentation models like U-Net, FPN, DeepLabV3, and more, with a variety of encoders (e.g., ResNet, EfficientNet) that can be pretrained on popular datasets like ImageNet.

It is mainly used for the following:
・Image Segmentation
Perform pixel-wise classification to segment objects within an image, making it ideal for medical imaging, satellite imagery, and other use cases.
・Binary and Multiclass Segmentation
Handle both binary segmentation tasks (e.g., foreground vs. background) and multiclass segmentation (e.g., segmenting multiple object classes).
・Transfer Learning
Utilize pretrained encoders to fine-tune models on custom segmentation datasets, speeding up the training process and improving results.

1.5 mmaction2

MMAction2 is a comprehensive toolbox for video action recognition tasks.
It provides a variety of state-of-the-art models for video understanding, including action recognition, temporal action detection, and spatio-temporal action detection. MMAction2 is highly modular, allowing for easy customization and integration of various components like backbones, heads, and datasets.

1.6 MONAI

MONAI (Medical Open Network for AI) is an open-source, PyTorch-based framework specifically designed for deep learning in healthcare imaging. It offers a set of tools, models, and utilities tailored for medical image analysis tasks such as segmentation, classification, and detection. MONAI is developed with support from medical institutions and is designed to streamline the development of AI models in medical applications, including radiology, pathology, and more.

It is used for the following:
・Medical Image Segmentation
Segment organs, tumors, or other anatomical structures from medical scans (e.g., MRI, CT).
・Classification and Detection
Classify diseases or detect abnormalities from medical images, including support for multi-modality data.
・3D Medical Imaging
Handle 3D medical data and large volumes with specialized preprocessing and efficient memory management tools.

2. Others

2.1 Albumentations

Albumentations is an image augmentation library that provides a wide variety of augmentations.
Most augmentations take very little code to write, so the library is used very frequently, and augmentation is effective for many tasks.

2.2 wandb

wandb is a tool for managing machine learning experiments. After a simple setup, you can specify the data you want to track, and it will automatically record and visualize the data.

Training a machine learning model involves many experiments, so a tool that can efficiently manage experiments is very useful.

2.3 pytorch-lightning

pytorch-lightning makes it easier to write PyTorch training loops and inference code. Plain PyTorch requires you to write the training loop in detail, whereas pytorch-lightning handles that boilerplate for you, which makes coding more efficient and shortens development time. The time saved can go into more experiments, which tends to lead to a better model.

Many companies already use it in practice, so it is a useful tool to know outside competitions as well.

3. Summary

In this post, I introduced libraries and tools that are useful for building machine learning models for image competitions. Using them will make your development much more efficient, so please give them a try.

Reference

[1] 競技としてのKaggle、役に立つKaggle (Kaggle as a Competition, Kaggle That Is Useful)
