Recurrent mmproj img ctx cache, even cache in disk/the disk load not working now #1585
Closed
FNsi wants to merge 4 commits into
Conversation
Cache up to 10 image contexts in RAM and keep the rest on disk.

----

Description of changes:

1. Two-tier cache system implemented:
   - RAM cache: keeps at most 10 entries for fast access
   - Disk cache: saves older entries to /tmp/llama_mmproj_cache/ when the RAM cache is full
2. Key files modified:
   - mtmd.h - added KV cache API functions
   - mtmd.cpp - implemented cache storage and disk I/O
   - mtmd-helper.cpp - added saving the cache after decoding
   - server-context.cpp - added restoring the cache before processing
   - common.h - added configuration parameters
Author
I wanted to set this to the grey one, but I can't seem to find the option...
Changes: store up to 10 image context caches in RAM; store the rest on disk and load them when needed.
Two-tier cache system implemented:
RAM cache: keeps at most 10 entries for fast access
Disk cache: now saves older entries to /tmp/llama_mmproj_cache/ when RAM is full or there are more than 10 entries
Key files modified:
mtmd.h - added KV cache API functions
mtmd.cpp - implemented cache storage and disk I/O
mtmd-helper.cpp - added saving the cache after decoding
server-context.cpp - added restoring the cache before processing
common.h - added configuration parameters
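
For readers who want to see the two-tier idea in isolation, here is a minimal C++17 sketch. It is not the code from this PR: the class `TwoTierCache`, its methods, the `.bin` file layout, and the use of `std::vector<float>` as the cached payload are all assumptions made for illustration. It keeps a bounded number of entries in RAM, spills the least recently used one to a directory when the limit is exceeded, and reloads from disk on a RAM miss.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <list>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical two-tier cache; names and file layout are illustrative only.
// Keys are assumed to be filename-safe (for example, a hex hash of the image).
class TwoTierCache {
public:
    explicit TwoTierCache(fs::path dir, std::size_t max_ram = 10)
        : dir_(std::move(dir)), max_ram_(max_ram) {
        assert(max_ram_ > 0);
        fs::create_directories(dir_);
    }

    // Insert or refresh an entry; spill least recently used entries to disk
    // while the RAM tier holds more than max_ram_ entries.
    void put(const std::string & key, std::vector<float> data) {
        auto it = index_.find(key);
        if (it != index_.end()) {
            lru_.erase(it->second);
            index_.erase(it);
        }
        lru_.emplace_front(key, std::move(data));
        index_[key] = lru_.begin();
        while (lru_.size() > max_ram_) {
            const auto & victim = lru_.back();
            write_file(victim.first, victim.second);
            index_.erase(victim.first);
            lru_.pop_back();
        }
    }

    // Look in RAM first, then on disk; a disk hit is promoted back into RAM.
    std::optional<std::vector<float>> get(const std::string & key) {
        auto it = index_.find(key);
        if (it != index_.end()) {
            lru_.splice(lru_.begin(), lru_, it->second);  // mark as most recently used
            return it->second->second;
        }
        auto loaded = read_file(key);
        if (loaded) {
            put(key, *loaded);
        }
        return loaded;
    }

    std::size_t ram_size() const { return lru_.size(); }
    bool on_disk(const std::string & key) const { return fs::exists(path_for(key)); }

private:
    using Entry = std::pair<std::string, std::vector<float>>;

    fs::path path_for(const std::string & key) const { return dir_ / (key + ".bin"); }

    // File layout: a 64-bit element count followed by the raw floats.
    void write_file(const std::string & key, const std::vector<float> & data) const {
        std::ofstream out(path_for(key), std::ios::binary | std::ios::trunc);
        const std::uint64_t n = data.size();
        out.write(reinterpret_cast<const char *>(&n), sizeof(n));
        out.write(reinterpret_cast<const char *>(data.data()),
                  static_cast<std::streamsize>(n * sizeof(float)));
    }

    // Returns nullopt if the file is missing or shorter than its header claims.
    std::optional<std::vector<float>> read_file(const std::string & key) const {
        std::ifstream in(path_for(key), std::ios::binary);
        std::uint64_t n = 0;
        if (!in || !in.read(reinterpret_cast<char *>(&n), sizeof(n))) {
            return std::nullopt;
        }
        std::vector<float> data(n);
        if (!in.read(reinterpret_cast<char *>(data.data()),
                     static_cast<std::streamsize>(n * sizeof(float)))) {
            return std::nullopt;
        }
        return data;
    }

    fs::path dir_;
    std::size_t max_ram_;
    std::list<Entry> lru_;  // front = most recently used
    std::unordered_map<std::string, std::list<Entry>::iterator> index_;
};
```

In this sketch a file left on disk is not deleted when its entry is promoted back into RAM, and the reader trusts the element count in the header. A real implementation would probably want a size sanity check and some form of versioning or checksum, since a truncated or stale file is one plausible way for a disk load to fail.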