【CV】What is the object detection model? Part2
Part 2 of an explanation of object detection.
2. Object detection tasks and approaches
BBox detection & classification
Methods for BBox detection & classification fall mainly into two-stage and one-stage approaches.
Two-stage (R-CNN, Fast R-CNN, Faster R-CNN, etc.): Detect candidate object regions first, then classify the object within each region
One-stage (anchor-based) (SSD, RetinaNet): Estimate object position and class in a single pass, relative to anchor boxes (predefined reference boxes laid out on a grid over the image)
One-stage (anchor-free) (the original YOLO): Estimate object position and class directly, without anchor boxes. Note that some later YOLO versions, such as YOLOv2 and YOLOv3, do use anchor boxes
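As a rough illustration of the anchor-based idea, the sketch below lays square anchor boxes on a grid and picks the anchor that best overlaps a ground-truth box, measured by intersection over union (IoU). The function names, grid size, and anchor scales are assumptions made for this example and are not taken from SSD or RetinaNet.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def make_anchors(grid_size, image_size, scales):
    """Place one square anchor per scale at the center of each grid cell."""
    cell = image_size / grid_size
    anchors = []
    for row in range(grid_size):
        for col in range(grid_size):
            cx, cy = (col + 0.5) * cell, (row + 0.5) * cell
            for s in scales:
                anchors.append((cx - s / 2, cy - s / 2, cx + s / 2, cy + s / 2))
    return anchors


def best_anchor(gt_box, anchors):
    """Return (index, IoU) of the anchor that overlaps gt_box the most."""
    scores = [iou(gt_box, a) for a in anchors]
    idx = max(range(len(anchors)), key=scores.__getitem__)
    return idx, scores[idx]


# Hypothetical setup: 128x128 image, 4x4 grid, two anchor sizes per cell.
anchors = make_anchors(grid_size=4, image_size=128, scales=(32, 64))
idx, score = best_anchor((30, 30, 66, 66), anchors)
print(anchors[idx], round(score, 3))
```

An anchor-based detector would then predict small offsets from the matched anchor to the true box, whereas an anchor-free detector predicts the box coordinates without such reference boxes.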
Segmentation
Semantic segmentation
The task of assigning a class label to every pixel
Objects of the same class share the same label, so separate objects of that class are not distinguished from each other.
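A minimal sketch of the per-pixel classification step, assuming a model has already produced one score map per class (the toy scores and the class meanings below are made up for illustration):

```python
def semantic_map(scores):
    """scores[c][y][x] is the score of class c at pixel (y, x).

    Returns a 2D label map holding the highest-scoring class per pixel.
    """
    num_classes = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    return [
        [max(range(num_classes), key=lambda c: scores[c][y][x]) for x in range(width)]
        for y in range(height)
    ]


# Toy 2x3 image with two classes: 0 = background, 1 = person (assumed labels).
scores = [
    [[0.9, 0.2, 0.8], [0.7, 0.1, 0.3]],  # class 0 scores
    [[0.1, 0.8, 0.2], [0.3, 0.9, 0.7]],  # class 1 scores
]
labels = semantic_map(scores)
print(labels)
```

Every "person" pixel receives the label 1 regardless of which person it belongs to, which is exactly the limitation described above.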
Instance segmentation
Object detection + segmentation
Predict each object's position as a bbox and its area as a pixel mask
Panoptic segmentation
Semantic + instance segmentation
Assign a class to every pixel, plus an object ID that separates individual objects of the same class
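One way to picture how the semantic and instance outputs combine is the sketch below. It assumes a semantic label map and a list of instance masks are already available, and it uses instance ID 0 to mean "no individual object"; both the function name and that convention are assumptions for this example.

```python
def panoptic_merge(semantic, instances):
    """Combine a semantic label map with instance masks.

    semantic:  2D list of class ids.
    instances: list of (class_id, mask), where mask is a 2D list of 0/1.
    Returns a 2D list of (class_id, instance_id) pairs; instance_id 0
    marks pixels that belong to no individual object.
    """
    panoptic = [[(cls, 0) for cls in row] for row in semantic]
    for instance_id, (class_id, mask) in enumerate(instances, start=1):
        for y, row in enumerate(mask):
            for x, inside in enumerate(row):
                # Earlier instances keep pixels they have already claimed.
                if inside and panoptic[y][x][1] == 0:
                    panoptic[y][x] = (class_id, instance_id)
    return panoptic


# Toy 2x4 image: class 0 = background, class 1 = person (assumed labels).
semantic = [[0, 1, 1, 1], [0, 1, 0, 1]]
instances = [
    (1, [[0, 1, 0, 0], [0, 1, 0, 0]]),  # first person
    (1, [[0, 0, 1, 1], [0, 0, 0, 1]]),  # second person
]
result = panoptic_merge(semantic, instances)
print(result)
```

Both people have class 1, but their pixels now carry different object IDs, while background pixels keep only a class.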
Open-Vocabulary Object Detection (2021)
Detect object classes that are specified in text, including classes that were not seen during training. Detection is not limited to a predefined set of object classes.
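A common way to realize this is to compare an embedding of each detected region with embeddings of the class names given as text, and pick the closest text. The sketch below assumes such embeddings already exist; the 3-dimensional vectors are toy values invented for illustration, whereas a real system would obtain them from image and text encoders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def classify_region(region_embedding, text_embeddings):
    """Return the class name whose text embedding is closest to the region."""
    return max(
        text_embeddings,
        key=lambda name: cosine(region_embedding, text_embeddings[name]),
    )


# Hypothetical text embeddings for class names supplied at inference time.
text_embeddings = {
    "zebra": [0.9, 0.1, 0.0],
    "skateboard": [0.0, 0.2, 0.9],
    "teapot": [0.1, 0.9, 0.1],
}
region = [0.1, 0.3, 0.8]  # hypothetical embedding of one detected region
label = classify_region(region, text_embeddings)
print(label)
```

Because the class list is just a dictionary of text embeddings, new classes can be added at inference time without retraining the detector.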
Pose estimation
Extract human joints and facial parts from images as keypoints. Applications include sports commentary and human recognition in VR.
3D object detection
Detect objects in 3D data, such as point clouds, instead of 2D images
Object tracking
Track objects in videos online or offline.
Online: Only the current and past frames are available; future frames cannot be used
Offline: The whole video is available, so future frames can be used
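The difference can be seen in a small sketch that smooths one coordinate of a tracked object. This is only an illustration of which frames each setting may look at, not a full tracker, and the function names and window sizes are assumptions.

```python
def smooth_online(positions, window=3):
    """Average the current frame with up to window - 1 past frames."""
    out = []
    for t in range(len(positions)):
        past = positions[max(0, t - window + 1): t + 1]
        out.append(sum(past) / len(past))
    return out


def smooth_offline(positions, radius=1):
    """Average each frame with its neighbors, including future frames."""
    out = []
    for t in range(len(positions)):
        nearby = positions[max(0, t - radius): t + radius + 1]
        out.append(sum(nearby) / len(nearby))
    return out


# x-coordinate of one tracked object per frame; frame 3 is a noisy detection.
xs = [0.0, 1.0, 2.0, 9.0, 4.0]
online = smooth_online(xs)
offline = smooth_offline(xs)
print(online)
print(offline)
```

The online result for a frame never changes when later frames arrive, which is what makes it usable in real time. The offline version can use frames on both sides of the current one, at the cost of needing the whole video first.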
Discussion