Fix memory management bug in llava and server code #5491
ggerganov merged 2 commits into ggml-org:master
Conversation
Fixes this error:
llama_new_context_with_model: graph splits (measure): 3
Available slots:
-> Slot 0 - max context: 6000
{"timestamp":1707926446,"level":"INFO","function":"main","line":2623,"message":"model loaded"}
all slots are idle and system prompt is empty, clear the KV cache
slot 0 - loaded image
slot 0 is processing [task id: 0]
slot 0 : kv cache rm - [0, end)
slot 0 - encoding image [id: 1]
munmap_chunk(): invalid pointer
Aborted
examples/llava/clip.h
CLIP_API void clip_image_u8_batch_free (struct clip_image_u8 * data);
CLIP_API void clip_image_f32_batch_free(struct clip_image_f32 * data);
Wouldn't it be better to change these to:
CLIP_API void clip_image_u8_batch_free(struct clip_image_u8_batch * batch) {
    if (batch->size > 0) {
        delete[] batch->data;
    }
    batch->size = 0;
}
Agreed, I changed it and retested.
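A minimal, self-contained sketch of the size-guarded free wrapper discussed above (the struct definitions here are hypothetical stand-ins; the real types live in examples/llava/clip.h). The point is that delete[] runs only when the batch actually owns entries, and the fields are reset so a second call is a harmless no-op instead of a double free:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the clip.h types, for illustration only.
struct clip_image_u8 { /* pixel data omitted */ };

struct clip_image_u8_batch {
    clip_image_u8 * data;
    size_t          size;
};

// Size-guarded free in the spirit of the suggestion above: delete[]
// only when the batch owns entries, then reset the fields so calling
// the wrapper twice cannot double free.
void clip_image_u8_batch_free(struct clip_image_u8_batch * batch) {
    if (batch->size > 0) {
        delete[] batch->data;
        batch->data = nullptr;
    }
    batch->size = 0;
}
```

Resetting data to nullptr in addition to size is an extra belt-and-braces step beyond the original suggestion; with size reset to 0 the guard alone already prevents a second delete[].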
    pad_to_square = false;
}
// free the previous res_imgs if any set
if (res_imgs.size > 0 && res_imgs.size < 100) {
oh, I removed the upper bound because there didn't seem to be any justification for it, but if there is then let me know @cmp-nct and I'll restore it
The reason for the upper bound was a safety check in case the passed structure points to uninitialized memory; in that case the size would almost certainly fall outside that range.
So it's only relevant if someone uses the API wrong; I'm fine either way.
Glad you spotted the double free, it's another remnant of the vector->pointer refactor.
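The trade-off discussed above can be sketched as follows (the struct and the free_previous helper are hypothetical names for illustration; the real logic is inline in clip.cpp). A caller that zero-initializes the batch makes the "free the previous res_imgs" guard safe on the first call, while the upper bound is the safety net against an uninitialized batch handing a garbage pointer to delete[]:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the clip.h f32 batch, for illustration only.
struct clip_image_f32 { /* pixel data omitted */ };

struct clip_image_f32_batch {
    clip_image_f32 * data;
    size_t           size;
};

// Illustrative version of the "free the previous res_imgs" guard.
// The upper bound is the safety check described above: a batch read
// from uninitialized memory would almost certainly report a size
// outside [1, 100), so its garbage pointer is skipped, not delete[]'d.
static void free_previous(clip_image_f32_batch & res_imgs) {
    if (res_imgs.size > 0 && res_imgs.size < 100) {
        delete[] res_imgs.data;
    }
    res_imgs.data = nullptr;
    res_imgs.size = 0;
}
```

Callers that value-initialize the batch (clip_image_f32_batch imgs = {};) never trip the guard on the first call, which is why dropping the upper bound is safe for correct usage.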
* Fix memory management in llava and server code
Fixes the error shown above.
* Make it cleaner by checking size in batch free wrapper
Fixes the same munmap_chunk(): invalid pointer abort as above
when running the server binary like this:
./bin/server -m ../models/mistral-7b-q_5_k.gguf --mmproj ../models/mmproj-mistral7b-f16-q6_k.gguf -ngl 50 -c 6000 --host 0.0.0.0 --port 8007 --no-mmap
Tested on:
Linux, WSL (Debian)
GPU: 4090