Platform: Apple Silicon, GGML Metal enabled
Model: mudler/parakeet-cpp-gguf/tdt-0.6b-v3-q8_0.gguf
Crash during transcription:
ggml_metal_op_encode_impl: error: unsupported op 'CONV_2D_DW'
ggml-metal-ops.cpp:203: unsupported op
After locally replacing ggml_conv_2d_dw_direct with ggml_conv_2d_dw in the subsampling path, that crash goes away, but another Metal failure appears:
ggml_metal_library_compile_pipeline: failed to compile pipeline: base = 'kernel_mul_mv_f32_f16_short'
Error: Function kernel_mul_mv_f32_f16_short was not found in the library
I could get the model running locally by:
- using ggml_backend_sched with [Metal, CPU] instead of a single Metal backend, so unsupported ops can fall back to CPU
- avoiding the f32 x f16 short mul_mv case from the lowered depthwise conv path by using F32 im2col
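To illustrate the fallback idea in the first bullet, here is a toy model of priority-ordered op placement: each op goes to the first backend in the list that reports support for it. This is not ggml code and not how ggml_backend_sched is implemented internally (the real scheduler also splits the graph and copies tensors between backends). `Backend`, `assign_ops`, and `example_backends` are hypothetical names, and the support sets are made up except for CONV_2D_DW being missing on Metal, which comes from the log above.

```cpp
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a compute backend: a name plus the set of ops it can run.
struct Backend {
    std::string name;
    std::set<std::string> supported_ops;
};

// Assign each op to the first backend (in priority order) that supports it.
// Returns one backend name per op, or "none" if no backend supports that op.
std::vector<std::string> assign_ops(const std::vector<Backend>& backends,
                                    const std::vector<std::string>& ops) {
    std::vector<std::string> placement;
    placement.reserve(ops.size());
    for (const auto& op : ops) {
        std::string chosen = "none";
        for (const auto& b : backends) {
            if (b.supported_ops.count(op) > 0) {
                chosen = b.name;
                break;  // earlier backends take priority
            }
        }
        placement.push_back(chosen);
    }
    return placement;
}

// Hypothetical support sets. Only the absence of CONV_2D_DW on Metal
// is taken from the crash log; everything else is illustrative.
std::vector<Backend> example_backends() {
    return {
        {"Metal", {"MUL_MAT", "ADD"}},
        {"CPU", {"MUL_MAT", "ADD", "CONV_2D_DW"}},
    };
}
```

With the list ordered [Metal, CPU], calling `assign_ops(example_backends(), {"MUL_MAT", "CONV_2D_DW", "ADD"})` places only the depthwise conv on CPU and keeps the other ops on Metal, which is the behaviour the workaround relies on.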
It would be great if this worked out of the box on macOS. Thanks for your work, this is brilliant. It runs about 2x faster than ONNX on Apple hardware.