ggml : fix uninitialized is_on_grid in quantize_row_iq3_xxs_impl#15928
Conversation
ggml/src/ggml-quants.c
Outdated
```diff
         }
         float best = 0;
         float scale = max/(2*kMaxQ-1);
+        for (int k = 0; k < 8; ++k) is_on_grid[k] = false;
```
If the goal is only to silence a compiler warning, then I would just zero-initialize the variable where it is declared. If you suspect that this is actually a bug that is leading to wrong results, then I think it needs more explanation.
It's fixing an actual bug, depending on the condition. It will potentially use uninitialized (or initialized from previous loop) data depending on this being done or not:
llama.cpp/ggml/src/ggml-quants.c, lines 3750–3754 (f50c60d)
See the other quants for reference:
- llama.cpp/ggml/src/ggml-quants.c, line 3283 (f50c60d)
- llama.cpp/ggml/src/ggml-quants.c, line 3931 (f50c60d)
- llama.cpp/ggml/src/ggml-quants.c, line 4883 (f50c60d)
To me, it is not obvious that this will change the results, or if it does, that it needs to be initialized to false instead of true (or other values).
It is indeed not obvious; it should perhaps be set to true, or maybe even set to false as quantize_row_iq3_s_impl does:
llama.cpp/ggml/src/ggml-quants.c, line 3968 (f50c60d)
Either way, it is potentially using uninitialized data right now. The safest choice AFAICT is initializing it to ~~false~~ true.
Again, what is it fixing? Is the change meaningful, or is it just adding more noise?
It's fixing uninitialized data, nothing more or less, exactly like in all the other quants.
...and yes, this will only happen when the weights are zero, but this has happened enough times that we have added several checks against it:
- llama.cpp/ggml/src/ggml-quants.c, line 493 (f50c60d)
- llama.cpp/ggml/src/ggml-quants.c, line 569 (f50c60d)
- llama.cpp/ggml/src/ggml-quants.c, line 969 (f50c60d)
It seems that if all weights are zero, `scale` will also be zero, and the branch that uses `is_on_grid` will be ignored.
> It seems that if all weights are zero, `scale` will also be zero, and the branch that uses `is_on_grid` will be ignored.
Not necessarily: that is only true if the whole block is zero, which may not be the case. Also, `scale` is based off the original weights, while the weights in question are the ones after the imatrix is applied (i.e. the imatrix may cause non-zero parts to go to zero).
If the original weights are all zero (or close)
Hi! Any update on this? It's one of the items preventing #15925 from passing CI tests.
I will double check.
Since imatrix weights are unlikely to be only partially zero, the whole block will be on grid in the event of the weights being zeroed by the imatrix, and then
* origin/master: (39 commits)
  * ci : disable AMD workflows + update NVIDIA workflows (ggml-org#16200)
  * ci : enable Vulkan workflow on Mac (ggml-org#16194)
  * ggml-cpu: Respect cpumask settings (ggml-org#16164)
  * ggml : fix uninitialized is_on_grid in quantize_row_iq3_xxs_impl (ggml-org#15928)
  * zdnn: refactor codebase + add docs (ggml-org#16178)
  * codeowners : add @danbev to model-conversion example [no ci] (ggml-org#16190)
  * devops: add s390x containers (ggml-org#15915)
  * ggml-cpu : fix typo in gemm comments [no ci] (ggml-org#16189)
  * feat: Add conversion support in GraniteHybrid for non-hybrid (all attn) (ggml-org#16177)
  * clang-tidy : disable warning about performance enum size (ggml-org#16127)
  * ggml : implement set_rows with i32 index (ggml-org#16159)
  * codeowners : update + cleanup (ggml-org#16174)
  * common : enable `--offline` mode without curl support (ggml-org#16137)
  * webui : fix handling incomplete chunks (ggml-org#16107)
  * embedding : fix typos in README (ggml-org#16171)
  * common : remove unused local variables (ggml-org#16140)
  * ggml : extend ggml_can_fuse to work with non-sequential nodes (ggml-org#16123)
  * ggml : add ggml_op_is_empty (ggml-org#16122)
  * codeowners : update ownership for @ngxson and @allozuar (ggml-org#16128)
  * Vulkan: add conv_transpose_2d operation (ggml-org#16022)
  * ...
…l-org#15928)
* fix uninitialized is_on_grid in quantize_row_iq3_xxs_impl
* change initialization to true
See https://github.com/ggml-org/llama.cpp/actions/runs/17615331967/job/50046823487#step:5:87