I downloaded the safetensors version and converted it to APR. The conversion succeeds, but chat panics on the first message:
apr convert ~/Downloads/qwen2.5-coder-1.5b.safetensors --output qwen2.5-coder-1.5b-instruct.apr
=== APR Convert ===
Input: /home/alfredo/Downloads/qwen2.5-coder-1.5b.safetensors
Output: qwen2.5-coder-1.5b-instruct.apr
Quantization: None (copy)
Converting...
=== Conversion Report ===
Original size: 5.75 GiB
Converted size: 5.75 GiB
Tensors: 338
Reduction: -0.0% (1.00x)
⚠ Conversion completed (output larger than input)
=== Model Chat (APR Format) ===
Using APR v2 format with mmap (Native Library Mandate)
Model: qwen2.5-coder-1.5b-instruct.apr
Chat Template: ChatML
Temperature: 0.7
Top-P: 0.9
Max Tokens: 512
Commands:
/quit Exit the chat
/clear Clear conversation history
/system Set system prompt
/help Show help
════════════════════════════════════════════════════════════
Loading model...
Loaded SafeTensors format in 0.92s (6174.9 MB)
Loaded tokenizer: tokenizer.json (151936 tokens)
Loaded 219 tensors from SafeTensors
Detected ChatML chat template
You: hey
thread 'main' (3360593) panicked at src/autograd/ops.rs:539:9:
assertion `left == right` failed: matmul dimension mismatch: 896 vs 1536
left: 896
right: 1536
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
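Two things in the log look related: the converter reports 338 tensors while chat loads only 219, and the two sizes in the panic match what I believe are the hidden sizes of the Qwen2.5 0.5B (896) and 1.5B (1536) models. To check which shapes the downloaded checkpoint actually contains, here is a minimal sketch that reads only the safetensors header (an 8-byte little-endian length followed by JSON). The function names `read_safetensors_shapes` and `tensors_with_dim` are made up for this example and are not part of `apr`.

```python
import json
import struct
import sys


def read_safetensors_shapes(path):
    """Return {tensor_name: shape} from a .safetensors header without loading weights."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian u64 giving the length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    # "__metadata__" is an optional free-form entry, not a tensor.
    return {
        name: info["shape"]
        for name, info in header.items()
        if name != "__metadata__"
    }


def tensors_with_dim(shapes, dim):
    """Return the sorted names of tensors that have `dim` as one of their dimensions."""
    return sorted(name for name, shape in shapes.items() if dim in shape)


if __name__ == "__main__" and len(sys.argv) > 1:
    shapes = read_safetensors_shapes(sys.argv[1])
    print(f"{len(shapes)} tensors")
    # The two sizes from the panic message.
    for dim in (896, 1536):
        print(f"tensors with a dimension of {dim}: {len(tensors_with_dim(shapes, dim))}")
```

Saved as, say, `shapes.py` (hypothetical name) and run as `python shapes.py ~/Downloads/qwen2.5-coder-1.5b.safetensors`, it prints the tensor count and how many tensors carry each of the two dimensions. If nothing in the file has a dimension of 896, that value is probably coming from a default or mis-detected config in the loader rather than from the weights, though that is only a guess.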