【ML Paper】YOLO: Unified Real-Time Object Detection part8
This time, I'll explain the YOLO object detection model, following the original paper.
This is part 8, and part 9 will be published soon.
Original paper: https://arxiv.org/abs/1506.02640
8. Comparison to Fast R-CNN
8.1 VOC 2007 Error Analysis
The authors compare the errors made by YOLO and Fast R-CNN on VOC 2007.
They use the methodology and tools of Hoiem et al. For each category at test time, they look at the top N predictions for that category. Each prediction is either correct or is classified by its type of error:
• Correct: correct class and IOU > .5
• Localization: correct class, .1 < IOU < .5
• Similar: class is similar, IOU > .1
• Other: class is wrong, IOU > .1
• Background: IOU < .1 for any object
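The categories above can be sketched as a small Python function. This is only an illustration, not the tooling of Hoiem et al.: the names `iou`, `classify_detection`, and `similar_classes` are hypothetical, boxes are assumed to be `(x_min, y_min, x_max, y_max)` tuples, and the order in which the rules are checked, as well as the handling of IOU values exactly equal to .1 or .5, are assumptions that the list does not specify.

```python
from typing import Dict, List, Sequence, Set, Tuple

Box = Sequence[float]  # assumed layout: (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def classify_detection(
    pred_box: Box,
    pred_class: str,
    gt_objects: List[Tuple[Box, str]],
    similar_classes: Dict[str, Set[str]],
) -> str:
    """Assign one of the five labels from the list above to a prediction.

    gt_objects holds (box, class) pairs for the image. similar_classes is a
    hypothetical mapping from a class to the classes treated as similar to it.
    """
    best_same = best_similar = best_other = 0.0
    for gt_box, gt_class in gt_objects:
        overlap = iou(pred_box, gt_box)
        if gt_class == pred_class:
            best_same = max(best_same, overlap)
        elif gt_class in similar_classes.get(pred_class, set()):
            best_similar = max(best_similar, overlap)
        else:
            best_other = max(best_other, overlap)

    # Assumed precedence: check the rules in the order the list gives them.
    if best_same > 0.5:
        return "correct"
    if best_same > 0.1:
        return "localization"
    if best_similar > 0.1:
        return "similar"
    if best_other > 0.1:
        return "other"
    return "background"  # no object overlaps the prediction by more than .1
```

For example, a "cat" prediction that overlaps a ground-truth cat with IOU 0.4 would be labeled `"localization"`, while the same box sitting on an empty region would be labeled `"background"`.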
Results
Figure 4 shows the breakdown of each error type averaged across all 20 classes.
YOLO usually predicts the right class but struggles to localize objects precisely, so localization is its main source of error. Fast R-CNN makes far fewer localization errors but far more background errors: 13.6% of its top detections are false positives that don't contain any objects.
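Fast R-CNN has a higher rate of fully correct detections, but when localization and similar-class errors are counted together with correct detections, YOLO comes out clearly ahead. In other words, YOLO's mistakes are mostly near-misses on real objects rather than detections on empty background.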
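To make the "averaged across all 20 classes" step concrete, here is a minimal sketch of how such a breakdown could be computed from per-class error labels. The function name `error_breakdown` and the input layout are hypothetical, the sketch assumes the average is an unweighted mean of per-class percentages, and the toy labels in the usage note are made up for illustration, not results from the paper.

```python
from collections import Counter
from typing import Dict, List

ERROR_TYPES = ("correct", "localization", "similar", "other", "background")


def error_breakdown(labels_per_class: Dict[str, List[str]]) -> Dict[str, float]:
    """Return the percentage of each label, averaged over classes.

    labels_per_class maps a class name to the labels assigned to its
    top N predictions. Classes with no predictions are skipped (assumption).
    """
    totals = {t: 0.0 for t in ERROR_TYPES}
    n_classes = 0
    for labels in labels_per_class.values():
        if not labels:
            continue
        counts = Counter(labels)
        for t in ERROR_TYPES:
            # Percentage of this class's top predictions with label t.
            totals[t] += 100.0 * counts[t] / len(labels)
        n_classes += 1
    if n_classes == 0:
        return totals
    # Unweighted mean over classes: every class counts equally.
    return {t: totals[t] / n_classes for t in ERROR_TYPES}
```

With the toy input `{"cat": ["correct", "correct", "localization", "background"], "dog": ["correct", "other"]}`, the per-class percentages are averaged so that each class contributes equally, regardless of how many predictions it has.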
Discussion