TensorDType::from_u8 missing GGML Q4_0/Q5_0/Q5_1 dtype mappings #375

@noahgift

Description

In src/format/v2/tensor_index_impl.rs, TensorDType::from_u8() has missing or incorrect mappings for several GGML quantization types:

  • Q4_0 (GGML type 2): not mapped at all; from_u8() returns None → "invalid dtype" on APR import
  • Q5_0 (GGML type 6): maps to I8 (wrong, should be Q5_0 or similar)
  • Q5_1 (GGML type 7): maps to U8 (wrong)
  • Q8_0 (GGML type 8): maps to Q4 (wrong, should be Q8_0)
  • Q5_K (GGML type 13): not mapped

Impact

  • apr import of Q4_0-quantized GGUF files fails with "invalid dtype"
  • Qwen2.5-Coder-0.5B (Q4_0) cannot be benchmarked through the APR pipeline
  • Other legacy GGML quant types silently get wrong dtype tags

Steps to reproduce

apr import ~/models/qwen2.5-coder-0.5b-instruct-q4_0.gguf -o /tmp/test.apr
apr bench /tmp/test.apr --gpu
# Error: invalid dtype

Expected behavior

All GGML quantization types should have correct mappings in TensorDType::from_u8().
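
As a rough illustration of the expected behavior, here is a minimal sketch of an exhaustive lookup keyed on the GGML type ids. The enum GgmlDType and the function from_ggml_id are hypothetical names made up for this example; they are not the project's actual TensorDType, whose variants and on-disk byte values may differ. The sketch also assumes the byte being decoded is the raw GGML type id, which may not hold for the APR format.

```rust
/// Hypothetical enum for illustration only; NOT the project's TensorDType.
#[allow(non_camel_case_types)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GgmlDType {
    F32,
    F16,
    Q4_0,
    Q4_1,
    Q5_0,
    Q5_1,
    Q8_0,
    Q8_1,
    Q2_K,
    Q3_K,
    Q4_K,
    Q5_K,
    Q6_K,
    Q8_K,
}

impl GgmlDType {
    /// Map a raw GGML type id to a dtype (hypothetical helper).
    /// Unknown ids yield `None` so the caller can report a precise error.
    pub fn from_ggml_id(id: u8) -> Option<Self> {
        Some(match id {
            0 => Self::F32,
            1 => Self::F16,
            2 => Self::Q4_0,
            3 => Self::Q4_1,
            // ids 4 and 5 (Q4_2 / Q4_3) were removed from GGML, so they stay unmapped
            6 => Self::Q5_0,
            7 => Self::Q5_1,
            8 => Self::Q8_0,
            9 => Self::Q8_1,
            10 => Self::Q2_K,
            11 => Self::Q3_K,
            12 => Self::Q4_K,
            13 => Self::Q5_K,
            14 => Self::Q6_K,
            15 => Self::Q8_K,
            _ => return None,
        })
    }
}
```

Under these assumptions, GgmlDType::from_ggml_id(2) yields Some(Q4_0) instead of None, and ids 6, 7, 8 and 13 resolve to their own variants rather than to I8, U8 or Q4. Giving each quantization type a dedicated variant avoids the silent mis-tagging described under Impact.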
