Name and Version
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 114688 MiB):
Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 114688 MiB
version: 8530 (a970515bd)
built with GNU 13.3.0 for Linux x86_64
Operating systems
Linux
Which llama.cpp modules do you know to be affected?
llama-server
Command line
`llama-server -hf Qwen/Qwen3-Coder-Next-GGUF:Q8_0`
Problem description & steps to reproduce
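For split GGUF models such as Qwen/Qwen3-Coder-Next-GGUF:Q8_0, only the first shard is automatically migrated to the HF cache. The snapshot directory listed under "Relevant log output" contains shard 00001 of 00004 and none of the others.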
First Bad Commit
No response
Relevant log output
> ls ~/.cache/huggingface/hub/models--Qwen--Qwen3-Coder-Next-GGUF/snapshots/b82fb7382639d97b38fa7672e526c760c2fb358e/Qwen3-Coder-Next-Q8_0/
Qwen3-Coder-Next-Q8_0-00001-of-00004.gguf
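To make the missing files easy to confirm, here is a minimal sketch that lists which shards are absent from a directory. It assumes the `-NNNNN-of-NNNNN.gguf` suffix seen in the file name above is the naming scheme for every shard; `missing_shards` is a hypothetical helper, not part of llama.cpp.

```python
import re

# Assumed shard naming, inferred from the single file name in the log above:
#   <stem>-00001-of-00004.gguf
SHARD_RE = re.compile(r"^(?P<stem>.+)-(?P<idx>\d{5})-of-(?P<total>\d{5})\.gguf$")


def missing_shards(present_names):
    """Return the sorted shard file names that are expected but not present.

    present_names: iterable of file names (not full paths) found in the
    snapshot directory. Names that do not match the shard pattern are ignored.
    """
    present = set(present_names)
    missing = set()
    for name in present:
        match = SHARD_RE.match(name)
        if match is None:
            continue
        stem = match.group("stem")
        total = int(match.group("total"))
        # Every index from 1 to total should exist for a complete split model.
        for index in range(1, total + 1):
            expected = f"{stem}-{index:05d}-of-{total:05d}.gguf"
            if expected not in present:
                missing.add(expected)
    return sorted(missing)
```

Calling it with `os.listdir(...)` on the snapshot directory shown above should report shards 00002 through 00004 as missing if the directory really holds only the first shard.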