ggml-alloc: fix discrepency between measure&eval by lshzh-ww · Pull Request #2639 · ggml-org/llama.cpp

lshzh-ww · 2023-08-17T01:26:42Z

The ggml memory allocator uses a best-fit strategy: it always places a tensor in the smallest free block that can hold it. During the measurement phase, the last block is very large, so it is never the best fit as long as another block can hold the tensor. During the evaluation phase, however, the last block has a limited size and can become the best fit. As a result, a tensor may be allocated in a different region during evaluation than during measurement, which leads to more fragmentation of the scratch buffer in the evaluation phase.

This commit guarantees uniform behavior of the allocator across both the measurement and evaluation phases, eliminating discrepancies between the two.
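To illustrate the mismatch described above, here is a minimal sketch in C. It is not the code from this PR: the `free_block` struct and both function names are hypothetical, and it assumes one way to make block selection phase-independent, namely treating the last block only as a fallback when no other block fits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical free-list entry for illustration; not the actual ggml-alloc struct. */
struct free_block {
    size_t offset;
    size_t size;
};

/* Plain best fit: every block, including the last one, competes.
 * Returns the index of the smallest block that can hold `size`, or -1. */
int best_fit_naive(const struct free_block *blocks, int n, size_t size) {
    assert(blocks != NULL || n == 0);
    int best = -1;
    size_t best_size = SIZE_MAX;
    for (int i = 0; i < n; i++) {
        if (blocks[i].size >= size && blocks[i].size < best_size) {
            best = i;
            best_size = blocks[i].size;
        }
    }
    return best;
}

/* Assumed phase-independent variant: best fit over all blocks except the
 * last, with the last block used only when nothing else can hold `size`. */
int best_fit_last_resort(const struct free_block *blocks, int n, size_t size) {
    assert(blocks != NULL || n == 0);
    int best = -1;
    size_t best_size = SIZE_MAX;
    for (int i = 0; i < n - 1; i++) {
        if (blocks[i].size >= size && blocks[i].size < best_size) {
            best = i;
            best_size = blocks[i].size;
        }
    }
    if (best == -1 && n > 0 && blocks[n - 1].size >= size) {
        best = n - 1; /* fall back to the last block */
    }
    return best;
}
```

With a 256-byte free block followed by a last block, a 100-byte request lands in the 256-byte block under plain best fit when the last block is huge (measurement), but in the last block when that block is only 128 bytes (evaluation). The fallback variant picks the 256-byte block in both cases, because the size of the last block no longer influences the choice while another block fits.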

slaren

Good fix, looks like this could be the cause of the issue with Metal, but I cannot reproduce it with the CPU backend.

lshzh-ww · 2023-08-17T02:06:13Z

In the context of Metal, it only triggered when using a reordered graph and a 70B model, which seems to be a corner case, I suppose.

ggerganov

Thank you for the detailed description!

Confirm the issue is fixed 👍

lshzh-ww requested a review from slaren August 17, 2023 01:26

lshzh-ww mentioned this pull request Aug 17, 2023

metal: enable ggml-alloc #2627

Merged

slaren approved these changes Aug 17, 2023

ggerganov approved these changes Aug 17, 2023

ggerganov merged commit a872a2b into ggml-org:master Aug 17, 2023

slaren mentioned this pull request Jan 8, 2024

Simplify tensor allocation logic. #4829

Closed

ggerganov merged 1 commit into ggml-org:master from lshzh-ww:ggml-alloc-fix