llama-quant : correct n_attention_wv usage #20357
Merged
ggerganov merged 2 commits into ggml-org:master on Mar 10, 2026
Conversation
In ggml-org#19770, I introduced a regression in the way the `quantize_state_impl` counter values were initialized. I was incrementing and using `n_attention_wv` in the same loop, when the value should already have been fixed by the time we're deciding tensor types in `llama_tensor_get_type_impl` (for `use_more_bits`). I never observed a difference in any of [my tests](ggml-org#19770 (comment)) - it was only after @bartowski kindly pointed this out that I realized it was incorrect. (Thanks!)
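To make the fix concrete, here is a minimal, self-contained sketch of the pattern this PR restores. The struct and tensor names are simplified stand-ins rather than the actual llama.cpp code, and the `use_more_bits` below only mirrors the general shape of the real heuristic; the point is that `n_attention_wv` must be fully counted in a separate pass before the type-selection loop consumes it:

```cpp
// Hypothetical, simplified sketch -- not the actual llama.cpp implementation.
#include <cstdio>
#include <string>
#include <vector>

struct quantize_state {      // stand-in for quantize_state_impl
    int n_attention_wv = 0;  // total number of attn_v.weight tensors (fixed before pass 2)
    int i_attention_wv = 0;  // index of the current attn_v.weight tensor during pass 2
};

// Same general shape as the use_more_bits() heuristic: bump precision for the
// first/last layers and for every third layer in between (illustrative only).
static bool use_more_bits(int i_layer, int n_layers) {
    return i_layer < n_layers/8 || i_layer >= 7*n_layers/8 || (i_layer - n_layers/8) % 3 == 2;
}

int main() {
    // Pretend model: 8 layers, each contributing one attn_v.weight tensor.
    std::vector<std::string> tensors;
    for (int il = 0; il < 8; ++il) {
        tensors.push_back("blk." + std::to_string(il) + ".attn_v.weight");
    }

    quantize_state qs;

    // Pass 1: initialize the counters. This must complete before any
    // tensor-type decision is made -- this is what the PR corrects.
    for (const auto & name : tensors) {
        if (name.find("attn_v.weight") != std::string::npos) {
            ++qs.n_attention_wv;
        }
    }

    // Pass 2: decide tensor types; n_attention_wv is now a stable total.
    for (const auto & name : tensors) {
        if (name.find("attn_v.weight") != std::string::npos) {
            const bool more_bits = use_more_bits(qs.i_attention_wv, qs.n_attention_wv);
            std::printf("%s -> %s\n", name.c_str(), more_bits ? "more bits" : "base type");
            ++qs.i_attention_wv;
        }
    }
    return 0;
}
```

With the counting folded into the same loop, early layers would see a too-small running total, so `use_more_bits` could pick the wrong layers for the higher-precision type.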
asyncd1spatch pushed a commit to asyncd1spatch/llama.cpp that referenced this pull request on Mar 10, 2026
* llama-quant : correct `n_attention_wv` usage

  In ggml-org#19770, I introduced a regression in the way the `quantize_state_impl` counter values were initialized. I was incrementing and using `n_attention_wv` in the same loop, when it should have been fixed by the time we're deciding tensor types in `llama_tensor_get_type_impl` (for `use_more_bits`). I never observed a difference in any of [my tests](ggml-org#19770 (comment)) - it was only after @bartowski kindly pointed this out that I realized it was incorrect. (Thanks!)

* simplify