Git LFS stores all objects in a per-repository directory (.git/lfs/objects/).
The lfs.storage setting allows multiple repositories to share a single storage
directory, but this creates a safety problem: git lfs prune cannot determine
which repositories depend on which objects, so pruning risks data loss from
repositories that have not yet pushed their objects to a remote.
For users with many clones of the same or related repositories (e.g., CI workers, developers working on forks), significant disk space is wasted storing duplicate copies of the same LFS objects.
This is a long-standing request in the git-lfs project. Related issues include:
- #4530: Sharing lfs.storage locations between distinct repositories. Proposes safeguards or per-repo subdirectories under a shared root, but does not separate downloaded objects from locally-created ones.
- #1875: Common object store. Requests shared local object storage across repos.
- #2147: Disable local LFS cache. Requests avoiding duplication of large objects.
The existing lfs.storage option conflates the cache with the repo's storage,
with no concept of a safe-to-discard, read-through cache layer. The
git lfs prune command exists to evict objects that have been pushed to the remote
server, but it is not run automatically, and it is unsafe if multiple
repositories share the same storage. This proposal addresses that gap.
The solution separates LFS objects into two tiers:
- Per-repository storage (.git/lfs/objects/): Objects created locally through git add. These may not yet be on any remote server and are not safe to remove until pushed.
- Shared cache (lfs.cachedir): Objects downloaded from remote servers. Every object in this directory came from a remote and can be re-downloaded, making it always safe to evict.
Locally created objects (git add / clean filter):
working tree -> clean filter -> per-repo .git/lfs/objects/
Downloaded objects (checkout / fetch / pull):
remote server -> shared cache lfs.cachedir/objects/
After push (when lfs.cachetransferonpush is enabled):
per-repo .git/lfs/objects/ -> push to remote -> move to shared cache
When Git LFS needs to read an object (smudge, upload, etc.), it checks:
- Per-repository storage first
- Shared cache second
- Remote server (download) if not found locally
This priority ensures that locally-created objects always take precedence.
The shared cache is always safe to prune or evict entirely because:
- Every object in the cache was either downloaded from a remote server or moved there after a successful push.
- Git LFS already handles missing objects gracefully:
  - git status and git diff do not read LFS object files (they work with pointer blobs in the git object database).
  - git add reads from the working tree, not LFS storage.
  - The smudge filter writes the pointer back when an object is missing and reports "content not local, use fetch to download."
- Files already checked out in the working tree are unaffected by cache eviction.
The only consequence of cache eviction is that checkout or restore operations may need to re-download objects from the remote, which is the same behavior as when a checkout is created from an old reference that contains LFS objects that have not yet been downloaded.
git lfs prune operates only on per-repository storage and does not touch the
shared cache. This is correct because:
- The per-repo prune logic (retain unpushed, recent refs, worktrees, stash) only applies to objects that might not be on a remote yet.
- Cache objects are, by definition, already on a remote.
lfs.storage: The shared cache is independent of lfs.storage. When both
are set, lfs.storage controls where locally-created objects go, and
lfs.cachedir controls where downloads go.
Reference directories (GIT_ALTERNATE_OBJECT_DIRECTORIES,
.git/objects/info/alternates): The existing LinkOrCopyFromReference
mechanism copies objects from alternates into per-repo storage. This is correct
for git clone --reference where the referenced repo may contain unpushed
objects. The shared cache is a separate, complementary mechanism that reads
directly from the cache without copying.
Note: this section describes a possible eviction implementation. The proposal can first be discussed on its merits and on the features it provides to users, even if the actual internal implementation may follow a different path later.
Credits: this high-level eviction strategy takes inspiration from ccache's cache management approach, but the proposed implementation is original: it relies on a single memory-mapped file with atomic counters, instead of multiple files with locking operations.
By default, the shared cache grows without limit. When lfs.cachemaxsize or
lfs.cachemaxfiles is set, the cache uses approximate LRU (Least Recently Used)
eviction via the cacheevict module:
- A persistent .cache-sizes file in the cache objects directory tracks total size, file count, and per-partition counters using mmap'd atomic operations.
- After each download completes, eviction is triggered if limits are exceeded. The tracking hooks are centralized in the common transfer adapter base layer, so all download methods (basic HTTP, SSH, custom transfer agents) are covered automatically. Objects moved to the cache by other paths (post-push via lfs.cachetransferonpush, or git lfs prune --move-to-cache) are also tracked.
- The partition with the most files is selected and its oldest files (by mtime) are deleted until the partition is within budget.
- Only one partition is processed per automatic eviction pass. By default 256 partitions are used, so only 1/256 of the objects in the cache are listed and processed.
- Hit/miss statistics are tracked: cache hits (objects read from cache) and misses (objects downloaded) are visible via git lfs env.
- Reading a cached object updates its mtime, keeping frequently-used objects from being evicted.
The eviction handler is cross-process safe: multiple git-lfs processes sharing the same cache coordinate via atomic operations on the mmap'd file, with a non-blocking eviction lock that detects dead processes.
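The non-blocking lock can be illustrated with a compare-and-swap on a single word holding the owner's PID. This is only a sketch under stated assumptions: in the design described here the word would live in the mmap'd file, whereas this example uses an in-process atomic.Int64, and the liveness check is a caller-supplied function. The names evictLock, tryLock, and unlock are hypothetical.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// evictLock stores the PID of the current owner; 0 means unlocked.
type evictLock struct {
	owner atomic.Int64
}

// tryLock never blocks. It takes the lock when it is free, or when the
// recorded owner is reported dead by the alive callback.
func (l *evictLock) tryLock(pid int64, alive func(pid int64) bool) bool {
	if l.owner.CompareAndSwap(0, pid) {
		return true
	}
	cur := l.owner.Load()
	if cur != 0 && !alive(cur) {
		// Take over a stale lock. The CAS fails harmlessly if another
		// process already took it over.
		return l.owner.CompareAndSwap(cur, pid)
	}
	return false
}

// unlock releases the lock only if pid is the current owner.
func (l *evictLock) unlock(pid int64) {
	l.owner.CompareAndSwap(pid, 0)
}

func main() {
	var l evictLock
	everyoneAlive := func(int64) bool { return true }
	fmt.Println(l.tryLock(100, everyoneAlive)) // lock was free
	fmt.Println(l.tryLock(200, everyoneAlive)) // held by a live owner
	l.unlock(100)
	fmt.Println(l.tryLock(200, everyoneAlive)) // free again
}
```

A process that fails to get the lock simply skips eviction for that pass, which is why no waiting or retry loop is needed.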
The cacheevict module is only activated when at least one limit is configured.
When neither lfs.cachemaxsize nor lfs.cachemaxfiles is set (the default),
no mmap file is created and the shared cache operates without any eviction
or statistics tracking.
Note: the mmap'd coordination file is designed for single-machine use. A cache
directory on a network filesystem (NFS, CIFS) works fine when accessed from a
single machine, but the eviction limits must not be set if the same cache
directory is shared between multiple machines, as cross-machine mmap coherency
is not guaranteed. In that case, cache eviction should be handled externally, or
by running git lfs cache trim periodically.
Multiple processes may read from and write to the shared cache simultaneously. Safety is ensured by:
- Downloads use a temporary file in the cache's tmp/ directory, then atomically rename it to the final location. If two processes download the same object, the second rename either overwrites the file with identical content or is treated as a success because the file already exists.
- The cache temp directory is on the same filesystem as the cache objects directory, ensuring os.Rename is atomic.
- The eviction handler uses lock-free atomic counters for size and file tracking, and a non-blocking CAS lock for eviction coordination.
- Reads are naturally safe since LFS objects are immutable.
lfs.cachedir: Absolute path to a shared cache directory for downloaded LFS objects. Multiple repositories may share the same cache directory. When not set, all objects are stored in per-repository storage as before.
The cache uses the same directory layout as per-repo storage:
lfs.cachedir/
objects/
ab/
cd/
abcd1234... (full OID)
tmp/ (download temp files)
lfs.cachetransferonpush: Boolean, default false. When true and lfs.cachedir is set, objects that
are successfully pushed to a remote are moved from per-repo storage into the
shared cache.
This is useful for CI workers that continuously push data and would otherwise accumulate unbounded per-repo storage.
The default value is set to false to maintain git's inherent distributed
backup property: even if the remote git server is lost, all LFS files would
have been uploaded by at least one developer of the team, and that developer's
copy is only removed from per-repo storage when they explicitly opt in (via
git lfs prune, or by setting this value to true to automate the move on
every push).
lfs.cachemaxsize, lfs.cachemaxfiles: See "Automatic eviction" section.
To move existing LFS objects from per-repo storage into the shared cache,
use git lfs prune --move-to-cache. This reuses the standard prune retain
logic: objects that are referenced by current/recent/unpushed refs are kept
in per-repo storage, while everything else is moved to the cache.
To move all pushed objects (leaving only unpushed/stashed/index objects in per-repo storage):
git lfs prune --move-to-cache --force
Use --dry-run to preview what would be moved without making changes.
The git lfs cache command provides direct cache management:
- git lfs cache stats: show cache size, file count, and hit/miss statistics
- git lfs cache clear: remove all cached files
- git lfs cache trim: remove oldest files by size, count, or age limits
These commands work by scanning the directory tree and do not require the mmap-based eviction handler. This makes them suitable for caches on network filesystems or in multi-machine environments where automatic eviction cannot be enabled.
When no flags are passed, trim falls back to the configured
lfs.cachemaxsize/lfs.cachemaxfiles limits, making it suitable as a
periodic cron job for full-rescan trimming.
When lfs.cachedir is configured per-repo (not globally), a local clone
(git clone --local, --shared, or --reference) will not inherit the
setting. Downloaded objects that only exist in the source repo's cache will
not be found by the new clone, and checkout will skip those files (writing
pointers instead, as with any missing LFS object).
Setting lfs.cachedir in the global git config (~/.gitconfig) avoids this
entirely. For per-repo configurations, two optional mechanisms are proposed
(implemented in separate commits so that either can be included or reverted
independently):
Git creates an alternates file pointing to the source repo's objects
directory. Git LFS can read the source repo's lfs.cachedir config and
add its objects directory to the reference dirs, allowing
LinkOrCopyFromReference to find and copy cached objects into the new
clone's per-repo storage.
This is non-invasive (no config changes to the new clone) but only works while the alternates file exists, and copies objects into per-repo storage rather than reading directly from the cache.
On the first git-lfs command in the new clone, if lfs.cachedir is not
already configured, git-lfs reads the source repo's setting from the
alternates and sets it in the new clone's local git config. This gives the
new clone full shared cache support permanently.
This modifies the new clone's .git/config automatically, which may be
unexpected in some workflows. It is idempotent: it only sets the config if
not already configured at any level.
Support for the shared cache feature as described in this proposal is implemented in an experimental git-lfs version for testing and is pending submission upstream after this proposal is approved.
It is available at:
It includes more changes than necessary for this proposal. The quality of the implementation is not adequate for submission as-is. The code changes, tests and documentation were created with the assistance of AI agents. The design, implementation, and tests were directed and reviewed by human developers, but only partially for now. It should be considered experimental only. It may be modified or rewritten based on received feedback on the proposal.