imatrix: fix oob writes if src1 is not contiguous by JohannesGaessler · Pull Request #13286 · ggml-org/llama.cpp

JohannesGaessler · 2025-05-03T17:04:21Z

The imatrix code implicitly assumes that src1 is contiguous when copying data from a backend to host memory. As a result, the vector that receives the data can be resized to fewer bytes than ggml_backend_tensor_get writes, resulting in out-of-bounds writes.

This PR sizes the host buffer to exactly the amount of data being copied. Also, if src1 is not contiguous, its strides are now taken into account when calculating the byte addresses of matrix rows.
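The sizing problem can be illustrated without ggml. The sketch below uses a hypothetical `TensorView` struct that only mimics the `ne` (element counts) and `nb` (byte strides) fields of a ggml tensor, restricted to float data. `span_bytes`, `contiguous_bytes`, and `resize_for_copy` are illustrative helpers, not ggml functions. For a tensor with padded rows, the number of bytes the view spans is larger than what a contiguity assumption would allocate.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the ne/nb fields of a ggml tensor (float data only).
struct TensorView {
    int64_t ne[4]; // number of elements per dimension
    size_t  nb[4]; // stride in bytes per dimension
};

// Bytes from the first element to one past the last element the view addresses.
inline size_t span_bytes(const TensorView & t) {
    size_t n = sizeof(float);
    for (int i = 0; i < 4; ++i) {
        n += static_cast<size_t>(t.ne[i] - 1) * t.nb[i];
    }
    return n;
}

// Size that results from assuming the data is densely packed.
inline size_t contiguous_bytes(const TensorView & t) {
    return static_cast<size_t>(t.ne[0] * t.ne[1] * t.ne[2] * t.ne[3]) * sizeof(float);
}

// Resize the host buffer to exactly the number of bytes that will be copied.
inline void resize_for_copy(std::vector<char> & host, const TensorView & t) {
    host.resize(span_bytes(t));
}
```

With a 4 x 3 float matrix whose rows are 32 bytes apart instead of 16, `contiguous_bytes` gives 48 while `span_bytes` gives 80, so a buffer sized from the first value is too small for a copy of the second.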

Unless I'm misunderstanding the code, the cases ne12 > 1 and ne13 > 1 are also going to result in unexpected behavior, but I don't know what the correct fix would be.

CISC · 2025-05-03T20:01:53Z

tools/imatrix/imatrix.cpp

        LOG_DBGV(2, "%s[%d]: %32s, %s, %5d x %5d, %d\n", __func__, m_last_call, wname.c_str(), ggml_op_name(t->op), (int)src1->ne[0], (int)src1->ne[1], (int)src1->type);
        for (int row = 0; row < (int)src1->ne[1]; ++row) {
-            const float * x = data + row * src1->ne[0];
+            const float * x = (const float *) (data + row * src1->nb[1]);


Excuse my ignorance if I'm asking silly questions, but why nb[1] instead of nb[0]?

JohannesGaessler · 2025-05-03

nb[0] is the byte offset when changing dimension 0 by 1, nb[1] is the byte offset when changing dimension 1 by 1.
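To make the distinction concrete, here is a minimal sketch of stride-based addressing for a float matrix. `row_ptr` and `element_at` are hypothetical helpers written for this example. The loop in the diff advances one row per iteration, which is one step along dimension 1, so the row pointer moves by `nb[1]` bytes; `nb[0]` only matters when stepping between elements inside a row.

```cpp
#include <cstddef>
#include <cstdint>

// Address of row `row` in a float matrix whose rows are nb1 bytes apart.
// `data` is a byte pointer, so the offset is applied in bytes before the cast.
inline const float * row_ptr(const char * data, size_t nb1, int64_t row) {
    return reinterpret_cast<const float *>(data + static_cast<size_t>(row) * nb1);
}

// Element (row, col) using both strides: nb0 steps within a row, nb1 between rows.
inline float element_at(const char * data, size_t nb0, size_t nb1, int64_t row, int64_t col) {
    return *reinterpret_cast<const float *>(
        data + static_cast<size_t>(row) * nb1 + static_cast<size_t>(col) * nb0);
}
```

If each row holds 3 floats but is padded to 5, then `nb1` is `5 * sizeof(float)` and `row_ptr(data, nb1, 1)` lands on the first real element of row 1, whereas indexing with `row * ne[0]` floats would land in the padding of row 0.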

CISC · 2025-05-12T12:14:24Z

@JohannesGaessler So that you don't have to look at the PR linked above: the solution is to loop over the attention heads, computing and storing the data for each head consecutively, which means that values/counts must be resized to head_dim * n_head.
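One possible reading of this suggestion is sketched below. It is not the code from either repository. It assumes a hypothetical float activation layout of [head_dim, n_tokens, n_head] with byte strides `nb1` between tokens and `nb2` between heads, and it assumes the statistic is a running sum of squared activations plus a per-column count. `Stats` and `accumulate_per_head` are illustrative names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical accumulator; field names mirror the values/counts mentioned above.
struct Stats {
    std::vector<float> values; // running sum of x*x per column
    std::vector<int>   counts; // number of rows accumulated per column
};

// Assumed layout: data[head][token][i], with byte strides nb2 (head) and nb1 (token).
inline void accumulate_per_head(Stats & s, const char * data,
                                int64_t head_dim, int64_t n_tokens, int64_t n_head,
                                size_t nb1, size_t nb2) {
    const size_t total = static_cast<size_t>(head_dim * n_head);
    if (s.values.size() != total) {
        // One slot per (head, column) pair instead of head_dim slots overall.
        s.values.assign(total, 0.0f);
        s.counts.assign(total, 0);
    }
    for (int64_t h = 0; h < n_head; ++h) {
        for (int64_t t = 0; t < n_tokens; ++t) {
            const float * x = reinterpret_cast<const float *>(
                data + static_cast<size_t>(h) * nb2 + static_cast<size_t>(t) * nb1);
            for (int64_t i = 0; i < head_dim; ++i) {
                // Head h occupies the range [h*head_dim, (h+1)*head_dim).
                const size_t j = static_cast<size_t>(h * head_dim + i);
                s.values[j] += x[i] * x[i];
                s.counts[j] += 1;
            }
        }
    }
}
```

Storing the heads back to back keeps each head's statistics separate, which is why the accumulator needs head_dim * n_head entries.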

imatrix: fix oob writes if src1 is not contiguous

ee690ea

github-actions bot added the examples label May 3, 2025

JohannesGaessler mentioned this pull request May 3, 2025

CUDA: batched+noncont MMQ, refactor bs>1 MoE code #13199

Merged

CISC mentioned this pull request May 3, 2025

Eval bug: b5237 broke Llama Scout #13287

Closed

CISC reviewed May 3, 2025


slaren approved these changes May 3, 2025


JohannesGaessler merged commit 3e959f0 into ggml-org:master May 3, 2025
95 of 96 checks passed

ikawrakow mentioned this pull request May 12, 2025

Fix imatrix calculation for MLA models ikawrakow/ik_llama.cpp#411

Merged

timwu pushed a commit to timwu/llama.cpp that referenced this pull request Dec 20, 2025

imatrix: fix oob writes if src1 is not contiguous (ggml-org#13286)

f87926f

JohannesGaessler merged 1 commit into ggml-org:master from JohannesGaessler:deepseek-fix-imatrix
