
Comparing the accuracy and speed of face detection with deepface

Published on 2024/05/01

I compared the accuracy and CPU speed of several face detection algorithms.
deepface makes it easy to switch the detector backend, so I used it for the measurements.

In conclusion, these look like the best choices:
"yolov8" if you want a balance of accuracy and speed
"yunet" if you want speed


Preparation

Sample images

https://www.pakutaso.com/20240410100post-51019.html
https://www.pakutaso.com/20240451093post-50975.html
https://www.pakutaso.com/20240344079post-47440.html
https://www.pakutaso.com/20240301078post-50861.html
https://www.pakutaso.com/20240253033post-50453.html
https://www.pakutaso.com/20240158001post-50035.html
https://www.pakutaso.com/20231216347post-48946.html
https://www.pakutaso.com/20231231347post-48945.html
https://www.pakutaso.com/20231258347post-48944.html
https://www.pakutaso.com/20170628178post-12228.html

Test code

%%capture
!pip install deepface facenet-pytorch mediapipe ultralytics
IMAGES = ['adumaHFKE1621_TP_V4.jpg', 'bizIMG_6054_TP_V4.jpg', 'kagaminIMG_8123_TP_V4.jpg', 'nonoss513dsE049_TP_V4.jpg', 'partyHFKE9431_TP_V4.jpg', 'SAKI037-_MKT59151825_TP_V4.jpg', 'SAKI037-_MKT59171826_TP_V4.jpg', 'SAKI037-_MKT59181827_TP_V4.jpg', 'sakisanHFKE7233_TP_V4.jpg', 'HIROTA17621035_TP_V4.jpg']

import cv2
import time
from deepface import DeepFace

backends = [
  'opencv', 
  'ssd', 
  'dlib', 
  'mtcnn', 
  'fastmtcnn',
  'retinaface', 
  'mediapipe',
  'yolov8',
  'yunet',
  'centerface',
]

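for backend in backends:
  # First pass: detect faces and save each image with the boxes drawn.
  # It also acts as a warm-up, so one-time model loading should not be included in the timing below.
  for image in IMAGES:
    face_objs = DeepFace.extract_faces(img_path = image, detector_backend = backend, enforce_detection = False, align = False)

    img = cv2.cvtColor(cv2.imread(image), cv2.COLOR_BGR2RGB)
    for face_obj in face_objs:
      box = face_obj['facial_area']
      # Skip entries without eye landmarks, such as the whole-image fallback returned when no face is found
      if box.get('left_eye') is not None:
        cv2.rectangle(img, (box['x'], box['y']), (box['x'] + box['w'], box['y'] + box['h']), (255, 0, 0), 5)
    cv2.imwrite(f'{backend}_{image}', cv2.cvtColor(img, cv2.COLOR_RGB2BGR))

  # Second pass: time 10 rounds over all images and report the average per image
  rounds = 10
  s = time.perf_counter()
  for i in range(rounds):
    for image in IMAGES:
      face_objs = DeepFace.extract_faces(img_path = image, detector_backend = backend, enforce_detection = False, align = False)
  print(f'{backend}: {(time.perf_counter()-s)/(rounds * len(IMAGES))} s')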
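
The test code draws boxes but does not print how many faces were found. Below is a minimal sketch of how such a count could be taken, assuming the same rule as the drawing step (a detection counts only when `left_eye` is not None). `count_faces` is a hypothetical helper, not part of deepface, and the sample dictionaries are made up to mimic the shape of the `facial_area` entries.

```python
def count_faces(face_objs):
    """Count detections whose facial_area carries a left_eye landmark."""
    return sum(
        1
        for face_obj in face_objs
        if face_obj.get('facial_area', {}).get('left_eye') is not None
    )


# Made-up results shaped like the output of DeepFace.extract_faces:
# one real detection and one whole-image fallback without landmarks.
sample = [
    {'facial_area': {'x': 10, 'y': 20, 'w': 50, 'h': 50,
                     'left_eye': (40, 35), 'right_eye': (20, 35)}},
    {'facial_area': {'x': 0, 'y': 0, 'w': 640, 'h': 480,
                     'left_eye': None, 'right_eye': None}},
]

print(count_faces(sample))  # 1
```

In the benchmark loop this could be called as `count_faces(face_objs)` for each image and summed per backend.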

Results

Speed

Seconds per image when processing 10 images x 10 rounds (fastest first)
mediapipe(0.012s) > yunet(0.045s) > ssd(0.088s) >> yolov8(0.194s) >> opencv(0.340s) > dlib(0.361s) > fastmtcnn(0.366s) > centerface(0.380s) >> mtcnn(1.617s) >> retinaface(8.862s)
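
To make the gaps easier to read, the per-image times can be turned into approximate throughput (1 / seconds per image). This is a small sketch using the figures above; the variable names are illustrative, and the numbers only reflect the CPU environment used for this measurement.

```python
# Seconds per image, copied from the ranking above
seconds_per_image = {
    'mediapipe': 0.012, 'yunet': 0.045, 'ssd': 0.088, 'yolov8': 0.194,
    'opencv': 0.340, 'dlib': 0.361, 'fastmtcnn': 0.366, 'centerface': 0.380,
    'mtcnn': 1.617, 'retinaface': 8.862,
}

# Throughput in images per second is the reciprocal of the per-image time
fps = {name: 1.0 / sec for name, sec in seconds_per_image.items()}

for name, value in sorted(fps.items(), key=lambda kv: kv[1], reverse=True):
    print(f'{name}: {value:.1f} images/s')
```

By this conversion mediapipe handles roughly 83 images per second and yolov8 roughly 5, while retinaface stays well below 1.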

Accuracy

Number of faces detected
yolov8(37) > retinaface(34) > fastmtcnn(26) > mtcnn(25) >> centerface(12) > dlib(9) = yunet(9) > mediapipe(4) > ssd(2) > opencv(1)
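
One way to read the two rankings together is to look for backends that no other backend beats on both axes. The sketch below treats the detected-face count as a stand-in for accuracy, as this article does, and uses only the figures listed above. `pareto_front` is a hypothetical helper written for this illustration.

```python
# backend: (seconds per image, faces detected), copied from the results above
results = {
    'mediapipe': (0.012, 4), 'yunet': (0.045, 9), 'ssd': (0.088, 2),
    'yolov8': (0.194, 37), 'opencv': (0.340, 1), 'dlib': (0.361, 9),
    'fastmtcnn': (0.366, 26), 'centerface': (0.380, 12),
    'mtcnn': (1.617, 25), 'retinaface': (8.862, 34),
}


def pareto_front(results):
    """Return backends that are not dominated by any other backend.

    A backend is dominated when another one is at least as fast and finds
    at least as many faces, and is strictly better on one of the two.
    """
    front = []
    for name, (sec, faces) in results.items():
        dominated = any(
            o_sec <= sec and o_faces >= faces and (o_sec < sec or o_faces > faces)
            for other, (o_sec, o_faces) in results.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return front


print(pareto_front(results))  # ['mediapipe', 'yunet', 'yolov8']
```

Only mediapipe, yunet and yolov8 remain, which is consistent with picking yolov8 for balance and yunet for speed. mediapipe stays on the list only because it is the fastest, despite finding just 4 faces.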

mediapipe

0.012 s/image

yunet

0.045 s/image

ssd

0.088 s/image

yolov8

0.194 s/image

opencv

0.340 s/image

dlib

0.361 s/image

fastmtcnn

0.366 s/image

centerface

0.380 s/image

mtcnn

1.617 s/image

retinaface

8.862 s/image
