
【CV】What is an object detection model? Part 2

Published on 2024/10/12

This is Part 2 of an explanation of object detection.

2. Object detection tasks and approaches

BBox detection & classification

There are two main families of methods for bounding-box (BBox) detection and classification: two-stage and one-stage. One-stage methods are further divided into anchor-based and anchor-free.

Two-stage (R-CNN, Fast R-CNN, Faster R-CNN, etc.): First detect candidate object regions, then infer the class of the object within each region

One-stage (anchor-based) (SSD, RetinaNet): Estimate object position and class in a single pass, using predefined anchor boxes placed at each cell of a grid laid over the image as reference shapes

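One-stage (anchor-free) (the original YOLO, and later variants such as YOLOX and YOLOv8): Estimate object position and class directly, without using anchor boxes. Note that some YOLO versions, such as YOLOv2 and YOLOv3, do use anchor boxes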
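
To make the anchor-based versus anchor-free distinction concrete, here is a minimal Python sketch. The function names, the grid layout, and the two box parameterizations (center shifts scaled by the anchor size with log-scale width and height for the anchor-based case, and distances from a point to the four box sides for the anchor-free case) are assumptions chosen for illustration. They follow common conventions but are not taken from any specific model's code.

```python
import math

def make_anchors(grid_size, stride, sizes):
    """Place one anchor box (cx, cy, w, h) per size at every grid cell center."""
    anchors = []
    for row in range(grid_size):
        for col in range(grid_size):
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for w, h in sizes:
                anchors.append((cx, cy, w, h))
    return anchors

def decode_anchor_based(anchor, offsets):
    """Anchor-based: the network output is an offset relative to an anchor box."""
    acx, acy, aw, ah = anchor
    dx, dy, dw, dh = offsets
    cx = acx + dx * aw
    cy = acy + dy * ah
    w = aw * math.exp(dw)
    h = ah * math.exp(dh)
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)

def decode_anchor_free(point, distances):
    """Anchor-free: the network output is the distance from a point to each box side."""
    px, py = point
    left, top, right, bottom = distances
    return (px - left, py - top, px + right, py + bottom)

# A 2x2 grid with two anchor shapes per cell (hypothetical values).
anchors = make_anchors(grid_size=2, stride=16, sizes=[(16, 16), (32, 16)])

# Zero offsets reproduce the anchor itself as an (x1, y1, x2, y2) box.
box_anchor_based = decode_anchor_based(anchors[0], (0.0, 0.0, 0.0, 0.0))

# The same box described without any anchor: a point plus four distances.
box_anchor_free = decode_anchor_free((8.0, 8.0), (8.0, 8.0, 8.0, 8.0))
```

In both cases the class scores would be predicted alongside the box. The difference is only in what the box regression is measured against.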

Segmentation

Semantic segmentation

The task of classifying every pixel of an image into a class.

Pixels of the same class receive the same label, even when they belong to different objects.

Instance segmentation

Object detection + segmentation.

Predict the position of each object with a bbox, and also predict the area (mask) that the object covers.

Panoptic segmentation

Semantic + instance segmentation.

Assign a class and an object ID to every pixel.
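
The difference between the three segmentation tasks is easiest to see in the label maps they produce. The sketch below uses a tiny hand-written label map containing two cats. The helper names and the use of connected regions as a stand-in for an instance segmentation model are assumptions for illustration only; a real model predicts instances directly and can separate objects that touch.

```python
BACKGROUND, CAT = 0, 1

# Semantic segmentation: both cats share the label CAT.
semantic_map = [
    [0, 1, 1, 0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0, 1, 1, 0],
]

def label_instances(semantic_map, target_class):
    """Give each 4-connected region of target_class its own ID (1, 2, ...)."""
    height, width = len(semantic_map), len(semantic_map[0])
    instance_map = [[0] * width for _ in range(height)]
    next_id = 0
    for r in range(height):
        for c in range(width):
            if semantic_map[r][c] != target_class or instance_map[r][c] != 0:
                continue
            next_id += 1
            instance_map[r][c] = next_id
            stack = [(r, c)]
            while stack:
                y, x = stack.pop()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and semantic_map[ny][nx] == target_class
                            and instance_map[ny][nx] == 0):
                        instance_map[ny][nx] = next_id
                        stack.append((ny, nx))
    return instance_map

def to_panoptic(semantic_map, instance_map):
    """Panoptic segmentation: every pixel carries (class, object ID)."""
    return [
        [(cls, obj) for cls, obj in zip(sem_row, inst_row)]
        for sem_row, inst_row in zip(semantic_map, instance_map)
    ]

# Instance segmentation: the two cats get different IDs.
instance_map = label_instances(semantic_map, CAT)
panoptic_map = to_panoptic(semantic_map, instance_map)
```

Here the left cat's pixels become `(CAT, 1)`, the right cat's pixels become `(CAT, 2)`, and background pixels stay `(BACKGROUND, 0)`.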

Open-Vocabulary Object Detection (2021)

Detect object classes unknown to the detector by specifying them as text. Detection is not limited to a predefined set of object classes.
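
One common way to realize "classes specified in text" is to compare an embedding of each detected region with embeddings of the text prompts and pick the closest one. The following is a minimal sketch of that matching step only. It assumes a shared image-text embedding space; the three-dimensional vectors, the prompts, and the function names are made up for illustration and do not come from a real model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def classify_region(region_embedding, text_embeddings):
    """Return the text prompt most similar to the region, plus all scores."""
    scores = {
        prompt: cosine_similarity(region_embedding, embedding)
        for prompt, embedding in text_embeddings.items()
    }
    best_prompt = max(scores, key=scores.get)
    return best_prompt, scores

# Hypothetical text embeddings; the class list is free-form text,
# so new classes can be added without retraining a fixed classifier head.
text_embeddings = {
    "a photo of a zebra": [0.9, 0.1, 0.0],
    "a photo of a bicycle": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of one detected region.
region_embedding = [0.8, 0.2, 0.1]

best_prompt, scores = classify_region(region_embedding, text_embeddings)
```

Adding a new class then amounts to adding one more prompt to `text_embeddings`.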

Pose estimation

Extract human joints and facial parts from images. Applications include sports commentary and human recognition in VR.

3D object detection

Detect objects from 3D data, such as point clouds, instead of 2D images.

Object tracking

Track objects across the frames of a video, either online or offline.

Online: Future frames cannot be used

Offline: Future frames can be used
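
The online/offline constraint can be illustrated with a small sketch that smooths the x-coordinate of a single, already associated track. This is not a tracker; the function names, window sizes, and coordinates are assumptions chosen only to show which frames each setting is allowed to read.

```python
def smooth_online(positions, window=3):
    """Online: frame t uses only frames t-window+1 .. t (past and present)."""
    smoothed = []
    for t in range(len(positions)):
        start = max(0, t - window + 1)
        chunk = positions[start:t + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

def smooth_offline(positions, half_window=1):
    """Offline: frame t may also use future frames t+1 .. t+half_window."""
    smoothed = []
    for t in range(len(positions)):
        start = max(0, t - half_window)
        end = min(len(positions), t + half_window + 1)
        chunk = positions[start:end]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

# x-coordinate of one object per frame; frame 3 is a noisy detection.
track_x = [0.0, 1.0, 2.0, 9.0, 4.0, 5.0]

online = smooth_online(track_x)
offline = smooth_offline(track_x)

# Recomputing with only the first four frames leaves the online result for
# frame 3 unchanged, while the offline result changes because it relied on
# frame 4.
online_partial = smooth_online(track_x[:4])
offline_partial = smooth_offline(track_x[:4])
```

The online result for a frame is final as soon as that frame arrives, which suits live video. The offline result can be revised once later frames are available, which suits recorded video.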
