Calling whisper-cpp installed via Homebrew from the command line
Setup
brew install whisper-cpp
To use the GPU, export the ..share/whisper-cpp directory as GGML_METAL_PATH_RESOURCES.
Note: the brew formula copies ggml-metal.metal into that directory, which is why it is the one specified here.
export GGML_METAL_PATH_RESOURCES="$(brew --prefix)/opt/whisper-cpp/share/whisper-cpp"
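If the export goes into a shell profile, it may be worth checking that the shader file is really there first. The following is a minimal sketch: the helper name `set_metal_resources` is hypothetical, and it assumes the same directory layout as the export above.

```shell
# Sketch only: set_metal_resources is a hypothetical helper name.
# $1 is the Homebrew prefix, e.g. the output of `brew --prefix`.
set_metal_resources() {
  local dir="$1/opt/whisper-cpp/share/whisper-cpp"
  if [ -f "$dir/ggml-metal.metal" ]; then
    # The shader source exists, so point the Metal backend at this directory.
    export GGML_METAL_PATH_RESOURCES="$dir"
  else
    echo "ggml-metal.metal not found in $dir" >&2
    return 1
  fi
}

# Usage, e.g. in ~/.zshrc:
#   set_metal_resources "$(brew --prefix)"
```

Failing loudly here makes a missing or relocated formula directory easier to spot than an unexpected Metal initialization error later on.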
For downloading models, see https://github.com/ggerganov/whisper.cpp/tree/master/models, which covers it in detail.
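As a rough sketch of one way to get a model file, the snippet below clones the repository and runs the `download-ggml-model.sh` script from its models directory. The names `REPO_DIR`, `model_path`, and `download_model` are hypothetical, and it is an assumption that the script accepts `medium-q5_0` as a model name in the version you clone; check the page above if it does not.

```shell
# Sketch only: REPO_DIR and both function names are hypothetical.
REPO_DIR="${REPO_DIR:-whisper.cpp}"

# Path where the repository's download script is expected to place a model.
model_path() {
  printf '%s/models/ggml-%s.bin\n' "$REPO_DIR" "$1"
}

# Clone the repository if needed, then fetch the model unless it already exists.
download_model() {
  local name="$1"
  [ -d "$REPO_DIR" ] || git clone --depth 1 https://github.com/ggerganov/whisper.cpp "$REPO_DIR" || return 1
  [ -f "$(model_path "$name")" ] || bash "$REPO_DIR/models/download-ggml-model.sh" "$name"
}

# Usage (needs network access):
#   download_model medium-q5_0
#   ls -lh "$(model_path medium-q5_0)"
```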
Running it
whisper-cpp /opt/homebrew/opt/whisper-cpp/share/whisper-cpp/jfk.wav -m whisper.cpp/models/ggml-medium-q5_0.bin -l ja
...
whisper_model_load: n_audio_layer = 24
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 1024
whisper_model_load: n_text_head = 16
whisper_model_load: n_text_layer = 24
whisper_model_load: n_mels = 80
whisper_model_load: ftype = 8
whisper_model_load: qntvr = 1
whisper_model_load: type = 4 (medium)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: n_langs = 99
whisper_backend_init: using Metal backend
ggml_metal_init: allocating
ggml_metal_init: found device: Apple M1 Max
ggml_metal_init: picking default device: Apple M1 Max
ggml_metal_init: default.metallib not found, loading from source
ggml_metal_init: GGML_METAL_PATH_RESOURCES = /opt/homebrew/opt/whisper-cpp/share/whisper-cpp
ggml_metal_init: loading '/opt/homebrew/opt/whisper-cpp/share/whisper-cpp/ggml-metal.metal'
ggml_metal_init: GPU name: Apple M1 Max
ggml_metal_init: GPU family: MTLGPUFamilyApple7 (1007)
ggml_metal_init: hasUnifiedMemory = true
ggml_metal_init: recommendedMaxWorkingSetSize = 22906.50 MB
ggml_metal_init: maxTransferRate = built-in GPU
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size = 514.02 MiB, ( 515.64 / 21845.34)
whisper_model_load: Metal buffer size = 538.97 MB
whisper_model_load: model size = 538.59 MB
whisper_backend_init: using Metal backend
ggml_metal_init: allocating
ggml_metal_init: found device: Apple M1 Max
ggml_metal_init: picking default device: Apple M1 Max
ggml_metal_init: default.metallib not found, loading from source
ggml_metal_init: GGML_METAL_PATH_RESOURCES = /opt/homebrew/opt/whisper-cpp/share/whisper-cpp
ggml_metal_init: loading '/opt/homebrew/opt/whisper-cpp/share/whisper-cpp/ggml-metal.metal'
ggml_metal_init: GPU name: Apple M1 Max
ggml_metal_init: GPU family: MTLGPUFamilyApple7 (1007)
ggml_metal_init: hasUnifiedMemory = true
ggml_metal_init: recommendedMaxWorkingSetSize = 22906.50 MB
ggml_metal_init: maxTransferRate = built-in GPU
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size = 126.00 MiB, ( 641.64 / 21845.34)
whisper_init_state: kv self size = 132.12 MB
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size = 140.62 MiB, ( 782.27 / 21845.34)
whisper_init_state: kv cross size = 147.46 MB
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size = 0.02 MiB, ( 782.28 / 21845.34)
whisper_init_state: compute buffer (conv) = 25.61 MB
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size = 0.02 MiB, ( 782.30 / 21845.34)
whisper_init_state: compute buffer (encode) = 170.28 MB
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size = 0.02 MiB, ( 782.31 / 21845.34)
whisper_init_state: compute buffer (cross) = 7.85 MB
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size = 0.02 MiB, ( 782.33 / 21845.34)
whisper_init_state: compute buffer (decode) = 98.32 MB
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size = 22.80 MiB, ( 805.11 / 21845.34)
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size = 160.78 MiB, ( 965.88 / 21845.34)
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size = 5.86 MiB, ( 971.72 / 21845.34)
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size = 92.14 MiB, ( 1063.84 / 21845.34)
system_info: n_threads = 4 / 10 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | METAL = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 | CUDA = 0 | COREML = 0 | OPENVINO = 0 |
main: processing '/opt/homebrew/opt/whisper-cpp/share/whisper-cpp/jfk.wav' (176000 samples, 11.0 sec), 4 threads, 1 processors, 5 beams + best of 5, lang = ja, task = transcribe, timestamps = 1 ...
[00:00:00.000 --> 00:00:11.000] そして、私の友人アメリカ人たち、あなたの国が何をするかを問わないでください。あなたの国が何をするかを問わないでください。
whisper_print_timings: load time = 243.62 ms
whisper_print_timings: fallbacks = 1 p / 0 h
whisper_print_timings: mel time = 8.06 ms
whisper_print_timings: sample time = 189.21 ms / 471 runs ( 0.40 ms per run)
whisper_print_timings: encode time = 599.01 ms / 1 runs ( 599.01 ms per run)
whisper_print_timings: decode time = 1768.32 ms / 185 runs ( 9.56 ms per run)
whisper_print_timings: batchd time = 1973.47 ms / 282 runs ( 7.00 ms per run)
whisper_print_timings: prompt time = 0.00 ms / 1 runs ( 0.00 ms per run)
whisper_print_timings: total time = 4786.81 ms
ggml_metal_free: deallocating
ggml_metal_free: deallocating
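Most of the output above is initialization logging, so it can help to pull out just the parts of interest from a saved log. The sketch below assumes the log format shown above; both function names are hypothetical.

```shell
# Sketch only: both function names are hypothetical.
# Each reads a saved whisper-cpp log on stdin.

# Keep only transcript lines, which start with "[hh:mm:ss.mmm --> ".
transcript_only() {
  grep -E '^\[[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3} --> '
}

# Print total processing time divided by audio length
# (a value below 1 means faster than real time).
realtime_factor() {
  awk '
    /^main: processing/ {
      # The audio length appears as "(176000 samples, 11.0 sec),"
      for (i = 1; i <= NF; i++) if ($i ~ /^sec\)/) audio = $(i - 1)
    }
    /^whisper_print_timings: +total +time/ { total_ms = $(NF - 1) }
    END { if (audio > 0) printf "%.2f\n", total_ms / 1000 / audio }
  '
}

# Usage, with the output of a run saved to run.log:
#   transcript_only < run.log
#   realtime_factor < run.log
```

For the run above, 4786.81 ms of total time over 11.0 sec of audio works out to roughly 0.44.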
Other notes
A Ruby binding is published, but I could not get it to work.
There is an open issue about it, but it has not received any response.
That's all.