Metal backend crashes on macOS arm64 with tdt-0.6b-v3-q8_0.gguf #2

@badlogic

Platform: Apple Silicon, GGML Metal enabled
Model: mudler/parakeet-cpp-gguf/tdt-0.6b-v3-q8_0.gguf

Crash during transcription:

ggml_metal_op_encode_impl: error: unsupported op 'CONV_2D_DW'
ggml-metal-ops.cpp:203: unsupported op

After locally replacing ggml_conv_2d_dw_direct with ggml_conv_2d_dw in the subsampling path, that crash goes away, but another Metal failure appears:

ggml_metal_library_compile_pipeline: failed to compile pipeline: base = 'kernel_mul_mv_f32_f16_short'
Error: Function kernel_mul_mv_f32_f16_short was not found in the library
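For readers unfamiliar with the two code paths, here is a minimal standalone sketch of what "lowering" a depthwise convolution means. It assumes, as the phrase "lowered depthwise conv path" in this report suggests, that the non-direct variant gathers patches with im2col and then multiplies by the kernel. It does not use ggml; `Shape`, `dw_direct`, `im2col_f32`, and `dw_lowered` are hypothetical names for illustration, restricted to stride 1, no padding, and a single image. Everything stays in F32, so no mixed f32 x f16 product is ever needed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy shapes (not ggml types): input [C][H][W], kernel [C][KH][KW].
struct Shape {
    int c, h, w, kh, kw;
    int oh() const { return h - kh + 1; }  // output height (stride 1, no padding)
    int ow() const { return w - kw + 1; }  // output width
};

// Direct depthwise conv: each channel is convolved with its own filter.
std::vector<float> dw_direct(const std::vector<float> &x,
                             const std::vector<float> &k, Shape s) {
    assert(x.size() == (size_t) s.c * s.h * s.w);
    assert(k.size() == (size_t) s.c * s.kh * s.kw);
    std::vector<float> y((size_t) s.c * s.oh() * s.ow(), 0.0f);
    for (int c = 0; c < s.c; ++c)
        for (int oy = 0; oy < s.oh(); ++oy)
            for (int ox = 0; ox < s.ow(); ++ox) {
                float acc = 0.0f;
                for (int ky = 0; ky < s.kh; ++ky)
                    for (int kx = 0; kx < s.kw; ++kx)
                        acc += x[(c * s.h + oy + ky) * s.w + ox + kx] *
                               k[(c * s.kh + ky) * s.kw + kx];
                y[(c * s.oh() + oy) * s.ow() + ox] = acc;
            }
    return y;
}

// im2col in F32: one row of KH*KW values per (channel, output position).
std::vector<float> im2col_f32(const std::vector<float> &x, Shape s) {
    std::vector<float> cols((size_t) s.c * s.oh() * s.ow() * s.kh * s.kw);
    size_t idx = 0;
    for (int c = 0; c < s.c; ++c)
        for (int oy = 0; oy < s.oh(); ++oy)
            for (int ox = 0; ox < s.ow(); ++ox)
                for (int ky = 0; ky < s.kh; ++ky)
                    for (int kx = 0; kx < s.kw; ++kx)
                        cols[idx++] = x[(c * s.h + oy + ky) * s.w + ox + kx];
    return cols;
}

// Lowered depthwise conv: im2col, then a per-channel matrix-vector product.
std::vector<float> dw_lowered(const std::vector<float> &x,
                              const std::vector<float> &k, Shape s) {
    const std::vector<float> cols = im2col_f32(x, s);
    const int patch = s.kh * s.kw;
    const int n_out = s.oh() * s.ow();
    std::vector<float> y((size_t) s.c * n_out, 0.0f);
    for (int c = 0; c < s.c; ++c)
        for (int o = 0; o < n_out; ++o) {
            float acc = 0.0f;
            for (int p = 0; p < patch; ++p)
                acc += cols[((size_t) c * n_out + o) * patch + p] * k[(size_t) c * patch + p];
            y[(size_t) c * n_out + o] = acc;
        }
    return y;
}
```

Both functions accumulate the same products in the same order, so they should agree exactly. The matrix-vector step in the lowered form is the part a backend would run with a mul_mv-style kernel, so the element types of the im2col output and the kernel determine which kernel variant is needed.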

I could get the model running locally by:

  1. using ggml_backend_sched with [Metal, CPU] instead of a single Metal backend, so ops that Metal does not support can fall back to CPU
  2. using F32 im2col in the lowered depthwise conv path, which avoids the f32 x f16 short mul_mv case
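The fallback idea in step 1 can be illustrated with a toy model. This is a sketch of the assignment rule only, not the ggml API: `Backend` and `assign_ops` are hypothetical names, and the op lists are made up for the example. The real ggml_backend_sched does more than this (it also considers where tensors are allocated and splits the graph between backends), so treat this as an intuition aid.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a compute backend: a name plus the ops it can run.
// Hypothetical type for illustration; not part of ggml.
struct Backend {
    std::string name;
    std::set<std::string> supported_ops;
    bool supports(const std::string &op) const { return supported_ops.count(op) > 0; }
};

// Assign each op in the graph to the first backend, in priority order,
// that reports support for it. Returns one backend index per op,
// or -1 when no backend in the list can run that op.
std::vector<int> assign_ops(const std::vector<Backend> &backends,
                            const std::vector<std::string> &graph) {
    assert(!backends.empty());
    std::vector<int> assignment;
    assignment.reserve(graph.size());
    for (const std::string &op : graph) {
        int chosen = -1;
        for (std::size_t i = 0; i < backends.size(); ++i) {
            if (backends[i].supports(op)) {
                chosen = (int) i;
                break;
            }
        }
        assignment.push_back(chosen);
    }
    return assignment;
}
```

With a priority list of [Metal, CPU], where the toy Metal backend lacks CONV_2D_DW and the toy CPU backend has it, that op is assigned to index 1 (CPU) while the other ops stay on index 0 (Metal). With Metal as the only entry, the same op gets -1, which corresponds to the "unsupported op" abort reported above.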

Would be great if this worked out of the box on macOS. Thanks for your work, this is brilliant. Beats ONNX on Apple by 2x.
