Misc. bug: HF Cache Migration Only Moves Part 00001 #21015

@Beinsezii

Name and Version

ggml_cuda_init: found 1 ROCm devices (Total VRAM: 114688 MiB):
  Device 0: AMD Radeon Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 114688 MiB
version: 8530 (a970515bd)
built with GNU 13.3.0 for Linux x86_64

Operating systems

Linux

Which llama.cpp modules do you know to be affected?

llama-server

Command line

`llama-server -hf Qwen/Qwen3-Coder-Next-GGUF:Q8_0`

Problem description & steps to reproduce

For split GGUF models such as Qwen/Qwen3-Coder-Next-GGUF:Q8_0, only the first shard is automatically migrated to the HF cache. The remaining shards are not moved.

First Bad Commit

No response

Relevant log output

> ls ~/.cache/huggingface/hub/models--Qwen--Qwen3-Coder-Next-GGUF/snapshots/b82fb7382639d97b38fa7672e526c760c2fb358e/Qwen3-Coder-Next-Q8_0/
Qwen3-Coder-Next-Q8_0-00001-of-00004.gguf
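
To check which shards are absent after migration, something like the following could be used. It is only a minimal sketch, assuming the `-NNNNN-of-MMMMM.gguf` naming seen in the listing above. The helper names `missing_shards` and `missing_shards_in_dir` are hypothetical and not part of llama.cpp.

```python
import re
from pathlib import Path

# Assumed shard naming convention, taken from the listing above:
#   <prefix>-00001-of-00004.gguf
SHARD_RE = re.compile(r"^(?P<prefix>.+)-(?P<index>\d{5})-of-(?P<total>\d{5})\.gguf$")


def missing_shards(filenames):
    """Return the sorted shard file names that are expected but absent."""
    present = set(filenames)
    expected = set()
    for name in present:
        m = SHARD_RE.match(name)
        if m is None:
            continue  # not a split GGUF file, ignore it
        total = int(m.group("total"))
        # Every shard from 1..total should exist next to the one we found.
        for i in range(1, total + 1):
            expected.add(f"{m.group('prefix')}-{i:05d}-of-{total:05d}.gguf")
    return sorted(expected - present)


def missing_shards_in_dir(directory):
    """Same check, applied to the files of a snapshot directory on disk."""
    return missing_shards(p.name for p in Path(directory).iterdir())


if __name__ == "__main__":
    # The directory contents reported in this issue.
    for name in missing_shards(["Qwen3-Coder-Next-Q8_0-00001-of-00004.gguf"]):
        print("missing:", name)
```

With the single file from the listing as input, this reports shards 00002 through 00004 as missing.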
