Disable DiskCache in hf_xet, continue to use it in git_xet#535
Merged
- random eviction
- implements ChunkCache trait
- hf-xet default chunk_cache is 0 bytes
- MemoryCache default size is 20% of system RAM, but configurable
- git_xet uses DiskCache with default 10GB disk cache
- hf_xet uses MemoryCache with default size being 20% of system RAM
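To make the random-eviction idea above concrete, here is a minimal sketch of a byte-bounded in-memory cache that evicts randomly chosen entries until a new value fits. It is not the PR's `MemoryCache` and does not implement the real `ChunkCache` trait: the type `RandomEvictCache`, its methods, and the built-in xorshift generator (used only to avoid an external crate) are all assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical byte-bounded cache with random eviction.
/// Illustrative only; not the MemoryCache type added in this PR.
pub struct RandomEvictCache {
    capacity_bytes: usize,
    used_bytes: usize,
    entries: HashMap<String, Vec<u8>>,
    /// Dense list of keys so a random victim can be picked by index.
    keys: Vec<String>,
    rng_state: u64,
}

impl RandomEvictCache {
    pub fn new(capacity_bytes: usize, seed: u64) -> Self {
        Self {
            capacity_bytes,
            used_bytes: 0,
            entries: HashMap::new(),
            keys: Vec::new(),
            // xorshift must not start from zero.
            rng_state: seed.max(1),
        }
    }

    /// xorshift64: a tiny PRNG that keeps the sketch dependency-free.
    fn next_random(&mut self) -> u64 {
        let mut x = self.rng_state;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.rng_state = x;
        x
    }

    pub fn get(&self, key: &str) -> Option<&[u8]> {
        self.entries.get(key).map(|v| v.as_slice())
    }

    /// Inserts a value, evicting random entries until it fits.
    /// Returns false if the value is larger than the whole cache.
    pub fn put(&mut self, key: &str, value: Vec<u8>) -> bool {
        if value.len() > self.capacity_bytes {
            return false;
        }
        // Drop any previous value stored under the same key.
        if let Some(old) = self.entries.remove(key) {
            self.used_bytes -= old.len();
            if let Some(pos) = self.keys.iter().position(|k| k == key) {
                self.keys.swap_remove(pos);
            }
        }
        // Evict randomly chosen victims until the new value fits.
        while self.used_bytes + value.len() > self.capacity_bytes {
            let idx = (self.next_random() % self.keys.len() as u64) as usize;
            let victim = self.keys.swap_remove(idx);
            if let Some(old) = self.entries.remove(&victim) {
                self.used_bytes -= old.len();
            }
        }
        self.used_bytes += value.len();
        self.keys.push(key.to_string());
        self.entries.insert(key.to_string(), value);
        true
    }

    pub fn used_bytes(&self) -> usize {
        self.used_bytes
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```

Random eviction needs no per-access bookkeeping (no recency list to update on every `get`), which is one plausible reason to prefer it over LRU for a chunk cache; the trade-off is that a hot entry can be evicted by chance.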
Pull Request Overview
This PR introduces a MemoryCache implementation for the ChunkCache trait and configures hf_xet to use RAM-based caching (20% of system RAM) instead of disk-based caching, while git_xet continues using DiskCache (10GB).
Key changes:
- Implements `MemoryCache` with random eviction and configurable capacity based on system RAM percentage
- Routes cache strategy selection through the `cache_size` configuration parameter (0 = MemoryCache, >0 = DiskCache)
- Refactors disk cache constants to distinguish between overall capacity and per-file limits
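The `cache_size` rule in the key changes (0 selects MemoryCache, any positive value selects DiskCache) can be sketched as a small pure function. This is an illustration under stated assumptions, not the code in `data/src/remote_client_interface.rs`: the enum `CacheStrategy`, the function `select_cache_strategy`, and the constant `SKETCH_MEMORY_CACHE_PERCENTAGE` are hypothetical names, and total RAM is passed in as a parameter rather than queried through `sysinfo`.

```rust
/// Hypothetical description of which cache backend to build.
#[derive(Debug, PartialEq)]
pub enum CacheStrategy {
    /// RAM-backed cache sized as a percentage of system memory.
    Memory { capacity_bytes: u64 },
    /// Disk-backed cache with an explicit byte budget.
    Disk { capacity_bytes: u64 },
}

/// Assumed to mirror the 20% default mentioned in the PR description.
pub const SKETCH_MEMORY_CACHE_PERCENTAGE: u64 = 20;

/// Chooses a strategy from the configured cache size:
/// 0 means "use the memory cache", anything else is a disk budget in bytes.
pub fn select_cache_strategy(cache_size: u64, total_ram_bytes: u64) -> CacheStrategy {
    if cache_size == 0 {
        // Divide first so very large RAM sizes cannot overflow u64.
        let capacity_bytes = total_ram_bytes / 100 * SKETCH_MEMORY_CACHE_PERCENTAGE;
        CacheStrategy::Memory { capacity_bytes }
    } else {
        CacheStrategy::Disk { capacity_bytes: cache_size }
    }
}
```

Overloading one numeric setting as a mode switch keeps the configuration surface small, but it also means "0" can no longer express "no cache at all", which is worth keeping in mind when reading the follow-up commit that moves this choice to cargo features.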
Reviewed Changes
Copilot reviewed 9 out of 11 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| chunk_cache/src/memory.rs | New MemoryCache implementation with random eviction policy |
| chunk_cache/src/lib.rs | Exports MemoryCache and adds MEMORY_CACHE_PERCENTAGE configuration |
| chunk_cache/src/disk.rs | Refactors cache size constants and updates import statements |
| chunk_cache/Cargo.toml | Adds sysinfo dependency for system memory detection |
| cas_client/src/remote_client.rs | Adds with_cache() constructor for custom cache injection |
| cas_client/src/lib.rs | Re-exports MemoryCache types and constants |
| data/src/remote_client_interface.rs | Implements cache strategy selection logic based on cache_size |
| data/Cargo.toml | Adds chunk_cache dependency |
| hf_xet/src/lib.rs | Sets HF_XET_CHUNK_CACHE_SIZE_BYTES=0 to enable MemoryCache |
seanses reviewed Oct 23, 2025

hoytak added a commit that referenced this pull request Nov 10, 2025
This PR disables the disk cache by default in hf_xet using cargo features instead of in-code logic. Reverts #535
cc: @assafvayner - would welcome your feedback but I know you are out this week.
In this PR:
2. hf_xet: disables DiskCache by default.
3. git_xet: continues to use DiskCache by default, set to 10GB as before.
I am testing this manually while the review is up, so it might need more commits to get it fully working. Right now all the unit tests pass, but I haven't yet verified the functionality with manual testing.