convert : fix RuntimeError when stripping FP8 KV-cache scales by pich · Pull Request #22818 · ggml-org/llama.cpp

pich · 2026-05-07T19:57:53Z

Bug

convert_hf_to_gguf.py raises RuntimeError: dictionary changed size during iteration for any ModelOpt-quantised NVFP4 checkpoint that also has FP8 KV-cache scales — e.g. mmangkad/Qwen3.6-35B-A3B-NVFP4 (and any hf_quant_config.json with kv_cache_quant_algo: FP8).

File "convert_hf_to_gguf.py", line 721, in _generate_nvfp4_tensors
    for name in self.model_tensors.keys():
RuntimeError: dictionary changed size during iteration

Cause

In ModelBase._generate_nvfp4_tensors the final cleanup loop iterates self.model_tensors.keys() while calling del self.model_tensors[name] on the same dict. As soon as the first .k_scale / .v_scale tensor is deleted, the dict changes size and the next step of the iterator raises.

The earlier loops in the same function avoid this by collecting names into consumed and popping them after iteration; this trailing loop was missed.
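For readers who want to see the failure mode in isolation, here is a minimal sketch. It does not use the converter's code: the dict, the tensor names, and the function names are hypothetical stand-ins for self.model_tensors and the cleanup loop. It shows the failing pattern next to the collect-then-pop pattern the earlier loops are described as using.

```python
# Hypothetical stand-in for self.model_tensors; names and values are made up.
def make_tensors() -> dict[str, int]:
    return {
        "model.layers.0.self_attn.k_proj.weight": 0,
        "model.layers.0.self_attn.k_proj.k_scale": 1,
        "model.layers.0.self_attn.v_proj.v_scale": 2,
        "model.layers.0.mlp.up_proj.weight": 3,
    }


SCALE_SUFFIXES = (".k_scale", ".v_scale")


def strip_scales_buggy(tensors: dict[str, int]) -> None:
    # Deleting from the dict while iterating its keys view: the iterator
    # notices the size change on its next step and raises RuntimeError.
    for name in tensors.keys():
        if name.endswith(SCALE_SUFFIXES):
            del tensors[name]


def strip_scales_collect(tensors: dict[str, int]) -> None:
    # Collect-then-pop: record the names first, mutate only after the loop ends.
    consumed = []
    for name in tensors.keys():
        if name.endswith(SCALE_SUFFIXES):
            consumed.append(name)
    for name in consumed:
        tensors.pop(name)


if __name__ == "__main__":
    try:
        strip_scales_buggy(make_tensors())
    except RuntimeError as exc:
        print("buggy loop failed:", exc)

    cleaned = make_tensors()
    strip_scales_collect(cleaned)
    print("remaining:", list(cleaned))
```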

Fix

Wrap .keys() in list() so the deletions happen against a snapshot. One-line change.
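The shape of the fix can be sketched on a plain dict as follows. This is an illustration under assumptions, not the patched converter code: the function name and the tensor names are hypothetical, and only the list(...keys()) wrapping mirrors the change described above.

```python
def strip_kv_scales(tensors: dict[str, int]) -> None:
    # list(...) copies the key names up front, so the loop walks a snapshot
    # while the deletions are applied to the live dict.
    for name in list(tensors.keys()):
        if name.endswith((".k_scale", ".v_scale")):
            del tensors[name]


if __name__ == "__main__":
    # Hypothetical tensor names, used only to exercise the function.
    tensors = {
        "blk.0.attn_k.weight": 0,
        "blk.0.attn_k.k_scale": 1,
        "blk.0.attn_v.v_scale": 2,
    }
    strip_kv_scales(tensors)
    print(list(tensors))
```

The snapshot holds only the key strings, not the tensor data, so the extra memory is proportional to the number of tensor names.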

Repro

hf download mmangkad/Qwen3.6-35B-A3B-NVFP4 --local-dir model-nvfp4
python convert_hf_to_gguf.py model-nvfp4 --outfile out.gguf --outtype auto

Without the patch: crashes after repacking experts. With the patch: writes out.gguf cleanly (973 tensors, 22 GB). Verified the resulting GGUF loads in llama-bench / llama-perplexity and gives sensible results on RTX PRO 4000 Blackwell (sm_120, native NVFP4).
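To check whether a local checkpoint falls into the affected category before running the repro, something like the sketch below may help. The file name hf_quant_config.json and the key kv_cache_quant_algo come from the description above; where that key sits inside the JSON is an assumption, so the helper searches the whole structure rather than a fixed path. The function names are hypothetical.

```python
import json
from pathlib import Path


def find_key(node, key):
    """Yield every value stored under `key` anywhere in a nested JSON structure."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            yield from find_key(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)


def has_fp8_kv_cache(model_dir: str) -> bool:
    """Return True if hf_quant_config.json declares kv_cache_quant_algo: FP8."""
    config_path = Path(model_dir) / "hf_quant_config.json"
    if not config_path.is_file():
        return False
    with config_path.open(encoding="utf-8") as f:
        config = json.load(f)
    # Assumption: the key may be nested at any depth, so scan recursively.
    return any(value == "FP8" for value in find_key(config, "kv_cache_quant_algo"))


if __name__ == "__main__":
    # "model-nvfp4" is the --local-dir used in the repro commands.
    print(has_fp8_kv_cache("model-nvfp4"))
```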

In ModelBase._generate_nvfp4_tensors the final cleanup loop iterates self.model_tensors.keys() and calls del on the same dict, which raises RuntimeError: dictionary changed size during iteration when a ModelOpt NVFP4 model also has FP8 KV-cache scales (e.g. mmangkad/Qwen3.6-35B-A3B-NVFP4 and any modelopt config with kv_cache_quant_algo: FP8). Wrap the keys view in list() so the deletions happen on a snapshot.

CISC

Oops, my bad, I removed a bunch of pointless lists, and didn't spot that this one was necessary, thanks!

pich

we need second Code Review, i'm not able to approve your change

CISC · 2026-05-07T20:37:20Z

> we need second Code Review, i'm not able to approve your change

Don't worry about it, I'll set it as merge ready once CIs go green.

…rg#22818)

* convert : fix RuntimeError when stripping FP8 KV-cache scales

In ModelBase._generate_nvfp4_tensors the final cleanup loop iterates self.model_tensors.keys() and calls del on the same dict, which raises RuntimeError: dictionary changed size during iteration when a ModelOpt NVFP4 model also has FP8 KV-cache scales (e.g. mmangkad/Qwen3.6-35B-A3B-NVFP4 and any modelopt config with kv_cache_quant_algo: FP8). Wrap the keys view in list() so the deletions happen on a snapshot.

* re-add another accidentally removed list

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

Ports 15 upstream commits (05e141a..5d44db6) that touched the monolithic convert_hf_to_gguf.py into the new conversion/*.py layout introduced by the refactor split.

New text/mmproj architectures registered: GraniteSpeechForConditionalGeneration, MiMoV2ForCausalLM, MiniCPMV4_6ForConditionalGeneration, Sarashina2VisionForCausalLM, SarvamMoEForCausalLM (+ modeling_sarvam_moe.SarvamMoEForCausalLM).

Notable changes:
- filter_tensors classmethod added to ModelBase/TextModel/MmprojModel and wired into index_tensors; many model classes refactored to move tensor-name skip/rename logic out of modify_tensors and into filter_tensors (upstream ggml-org#22597).
- LlamaModel._repack_nvfp4 override (Q/K RoPE permutation, ggml-org#22611).
- MistralModel yarn apply_scale support (ggml-org#22612).
- Gemma4Model._generate_nvfp4_tensors override for 26B NVFP4 (ggml-org#22804).
- LlavaVisionModel image-break token fallback for Mistral params.json -1 placeholders (ggml-org#22914).
- Pixtral 12B --mistral-format conversion fixes (ggml-org#22981).
- FP8 KV-cache scales fix (ggml-org#22818) and uint dtype byteswap disable (ggml-org#18908).

New files: conversion/sarashina2.py (Sarashina2VL text + vision)


pich requested a review from CISC as a code owner May 7, 2026 19:57


CISC approved these changes May 7, 2026


re-add another accidentally removed list

3028001

pich commented May 7, 2026


pich requested a review from CISC May 7, 2026 20:33

github-actions bot added the "python" label (python script changes) May 7, 2026

CISC added the "merge ready" label (a maintainer can use this label to indicate that they consider the changes final and ready to merge) May 7, 2026

ggerganov merged commit 1d72d87 into ggml-org:master May 8, 2026
6 checks passed

ggerganov merged 2 commits into ggml-org:master from pich:fix/nvfp4-convert-dict-iter
