
Support Git LFS with opt-in #16143

Merged
konstin merged 11 commits into astral-sh:main from
samypr100:conditional_lfs_support
Dec 2, 2025

@samypr100
Collaborator

@samypr100 samypr100 commented Oct 6, 2025

Summary

Follow up to #15563
Closes #13485

This is a first pass at adding conditional support for Git LFS across git sources; initial feedback is welcome.

e.g.

[tool.uv.sources]
test-lfs-repo = { git = "https://github.com/zanieb/test-lfs-repo.git", lfs = true }

For context, a user previously had to set UV_GIT_LFS to have uv fetch LFS objects for git sources. This env var was all or nothing: you had to always have it set to get consistent behavior, and it applied to all git sources. If you fetched LFS objects at a revision and then turned off LFS (or vice versa), the git db and the corresponding checkout's LFS artifacts would not be updated properly. Similarly, when git source distributions were built, there was no distinction between sources with LFS and without LFS. Hence, it could corrupt the git, sdist, and archive caches.

To support some sources being LFS-enabled and others not, this PR adds a stateful layer, roughly similar to how subdirectory works but for LFS, since the git database, the checkouts, and the corresponding caching layers needed to be LFS-aware (requested vs. installed). The caches also had to be isolated and treated as entirely separate when handling LFS sources.
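
To make the isolation idea concrete, here is a minimal Python sketch of folding an LFS flag into a cache key, so that the same repository and revision map to different cache entries depending on whether LFS was requested. This is not uv's actual (Rust) implementation: the function name `checkout_cache_key`, the hashing scheme, and the placeholder revision are all assumptions made for illustration.

```python
import hashlib


def checkout_cache_key(repo_url: str, rev: str, lfs: bool) -> str:
    """Return a hypothetical cache key for a git checkout.

    Illustrative only: uv's real key derivation is different.
    """
    hasher = hashlib.sha256()
    # Terminate each field with a NUL byte so that different
    # (url, rev) pairs cannot concatenate to the same byte string.
    for part in (repo_url, rev, "lfs" if lfs else "no-lfs"):
        hasher.update(part.encode("utf-8"))
        hasher.update(b"\0")
    return hasher.hexdigest()[:16]


repo = "https://github.com/zanieb/test-lfs-repo.git"
rev = "0123456789abcdef"  # placeholder revision, not a real commit

plain_key = checkout_cache_key(repo, rev, lfs=False)
lfs_key = checkout_cache_key(repo, rev, lfs=True)
```

Because the flag is part of the hashed input, `plain_key` and `lfs_key` differ, so toggling LFS for the same revision can never reuse a checkout that was materialized under the other setting.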

Summary

  • Adds lfs = true or lfs = false to git sources in pyproject.toml
  • Added lfs=true query param / fragments to most relevant url structs (not parsed as user input)
    • In the case of uv add / uv tool, --lfs is supported instead
    • UV_GIT_LFS environment variable support is still functional for non-project entrypoints (e.g. uv pip)
  • direct-url.json now has a custom git_lfs entry under VcsInfo (note: this is not in the spec currently; see caveats).
  • The git database and checkouts have a different cache key, as the sources should be treated as effectively different for the same rev.
  • The sdist cache also differs in the cache key of a built distribution if it was built using an LFS-enabled revision, to distinguish it from a non-LFS build of the same revision. This ensures the strong assumption for archive-v0 that an unpacked revision "doesn't change sources" stays valid.
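
The `lfs=true` fragment mentioned in the list can be illustrated with a small Python sketch that adds, removes, and detects such a fragment on a git URL. The helper names `with_lfs_fragment` and `has_lfs_fragment` are hypothetical, and the handling of other fragment entries (such as `subdirectory=`) is an assumption; uv's own URL types are written in Rust and may behave differently.

```python
from urllib.parse import parse_qs


def with_lfs_fragment(url: str, lfs: bool) -> str:
    """Return `url` with an `lfs=true` fragment entry added or removed."""
    base, _, fragment = url.partition("#")
    # Keep any other fragment entries (e.g. subdirectory=...) untouched.
    pairs = [p for p in fragment.split("&") if p and not p.startswith("lfs=")]
    if lfs:
        pairs.append("lfs=true")
    return base + ("#" + "&".join(pairs) if pairs else "")


def has_lfs_fragment(url: str) -> bool:
    """Report whether the URL fragment carries `lfs=true`."""
    _, _, fragment = url.partition("#")
    return parse_qs(fragment).get("lfs") == ["true"]


url = "git+https://github.com/astral-sh/lfs-cowsay@44b8e65c8ecd376e4482a68b29312227879a226d"
tagged = with_lfs_fragment(url, lfs=True)
```

Here `tagged` ends in `#lfs=true`, the same shape as the `Built` line in the review transcript further down the conversation.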

Caveats

  • pylock.toml import support via git_lfs=true has not been added; going through the spec, it wasn't clear to me that it's something we'd support outside of the env var (for now).
  • The direct-url struct was modified by adding a non-standard git_lfs field under VcsInfo, which may be undesirable, although PEP 610 does say "Additional fields that would be necessary to support such VCS SHOULD be prefixed with the VCS command name", which could be interpreted as making this change OK.
  • There will be slight lockfile and cache churn for users of UV_GIT_LFS, as all git lockfile entries will get an lfs=true fragment. The cache version does not need an update, but LFS sources will get their own namespace under git-v0 and sdist-v9/git, so a cache miss will occur once. This may be sufficient to label the change as breaking for workflows that always set UV_GIT_LFS.
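
For readers unfamiliar with PEP 610, the sketch below builds a direct_url.json payload with the extra entry under `vcs_info`. The `url`, `vcs`, `commit_id`, and `requested_revision` keys come from PEP 610; the `git_lfs` key name comes from this PR's description, but representing it as a JSON boolean, and the helper name `direct_url_record`, are assumptions for illustration.

```python
import json


def direct_url_record(url, commit_id, requested_revision=None, git_lfs=None):
    """Build a PEP 610-style record for a git source (illustrative helper)."""
    vcs_info = {"vcs": "git", "commit_id": commit_id}
    if requested_revision is not None:
        vcs_info["requested_revision"] = requested_revision
    if git_lfs is not None:
        # Non-standard field, prefixed with the VCS command name ("git").
        # Assumed here to be a boolean.
        vcs_info["git_lfs"] = git_lfs
    return {"url": url, "vcs_info": vcs_info}


record = direct_url_record(
    "https://github.com/astral-sh/lfs-cowsay",
    "44b8e65c8ecd376e4482a68b29312227879a226d",
    git_lfs=True,
)
payload = json.dumps(record, indent=2)
```

Tools that only read the standard PEP 610 keys can ignore the extra entry, which is why a prefixed additional field is arguably within the spirit of the spec.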

Test Plan

Some initial tests were added. More tests likely to follow as we reach consensus on a final approach.

For integration tests, we may want to move to a repo under the astral namespace in order to test LFS functionality.

Manual testing was done for common pathological cases like killing LFS fetch mid-way, uninstalling LFS after installing an sdist with it and reinstalling, fetching LFS artifacts in different commits, etc.

PSA: Please ignore the docker build failures, as they're related to depot OIDC issues.

@samypr100 samypr100 force-pushed the conditional_lfs_support branch 8 times, most recently from b34deec to 52cbd5c Compare October 13, 2025 19:37
@samypr100 samypr100 force-pushed the conditional_lfs_support branch 2 times, most recently from 6b19653 to 17d5d16 Compare October 17, 2025 21:51
@codspeed-hq

codspeed-hq bot commented Oct 17, 2025

CodSpeed Performance Report

Merging #16143 will not alter performance

Comparing samypr100:conditional_lfs_support (dc1b6ba) with main (5947fb0)

Summary

✅ 6 untouched

@samypr100 samypr100 force-pushed the conditional_lfs_support branch 6 times, most recently from ce1cfb1 to f222e06 Compare October 21, 2025 15:55
@samypr100 samypr100 marked this pull request as ready for review October 21, 2025 15:59
@samypr100 samypr100 force-pushed the conditional_lfs_support branch 11 times, most recently from 8d3c6f0 to 50cf0e5 Compare October 29, 2025 16:20
@samypr100 samypr100 force-pushed the conditional_lfs_support branch from 50cf0e5 to 81d1024 Compare October 29, 2025 22:00
@samypr100 samypr100 force-pushed the conditional_lfs_support branch from 3c0b7fb to 63883ef Compare November 14, 2025 05:14
@samypr100 samypr100 added the test:macos Enable macOS tests for a pull request label Nov 14, 2025
@samypr100 samypr100 force-pushed the conditional_lfs_support branch 3 times, most recently from b0ab2da to 38b3990 Compare November 15, 2025 00:43
Contributor

@geofft geofft left a comment

Thanks, and sorry for the delay in the re-review. I'm not totally following some of the code changes here but the behavior does seem to match what we want, i.e., sharing the db/* directory and only having separate checkouts/*/* directories. I might take a look at that later and send you something for review but that doesn't block merging this.

@geofft
Contributor

geofft commented Nov 18, 2025

There is one theoretical backwards-incompatibility here: with current uv, you only have to specify UV_GIT_LFS=1 once, and the checkout and built wheel get cached with the LFS artifacts included. This PR (correctly) identifies that as a risk to cache consistency, but I think in practice it may work fine for users with git-lfs installed and configured properly, because the .ok file won't get created if the LFS fetch fails or otherwise leaves you in an inconsistent state. I am worried that people might be relying on that in practice.

With current uv:

$ UV_CACHE_DIR=/tmp/cache/q2-old UV_GIT_LFS=1 uvx git+https://github.com/astral-sh/lfs-cowsay -v
    Updated https://github.com/astral-sh/lfs-cowsay (44b8e65c8ecd376e4482a68b29312227879a226d)
      Built lfs-cowsay @ git+https://github.com/astral-sh/lfs-cowsay@44b8e65c8ecd376e4482a68b29312227879a226d
Installed 1 package in 2ms
6.1
$ UV_CACHE_DIR=/tmp/cache/q2-old uvx git+https://github.com/astral-sh/lfs-cowsay -v
6.1

With the code from this PR:

$ UV_CACHE_DIR=/tmp/cache/q2-new UV_GIT_LFS=1 target/debug/uvx git+https://github.com/astral-sh/lfs-cowsay -v
    Updated https://github.com/astral-sh/lfs-cowsay (44b8e65c8ecd376e4482a68b29312227879a226d)
      Built lfs-cowsay @ git+https://github.com/astral-sh/lfs-cowsay@44b8e65c8ecd376e4482a68b29312227879a226d#lfs=true
Installed 1 package in 19ms
6.1
$ UV_CACHE_DIR=/tmp/cache/q2-new target/debug/uvx git+https://github.com/astral-sh/lfs-cowsay -v
      Built lfs-cowsay @ git+https://github.com/astral-sh/lfs-cowsay@44b8e65c8ecd376e4482a68b29312227879a226d
Installed 1 package in 18ms
Traceback (most recent call last):
  File "/tmp/cache/q2-new/archive-v0/UHbaC9QcSG9WzgquNtN5V/bin/lfs-cowsay", line 6, in <module>
    from lfs_cowsay.__main__ import cli
  File "/tmp/cache/q2-new/archive-v0/UHbaC9QcSG9WzgquNtN5V/lib/python3.14/site-packages/lfs_cowsay/__init__.py", line 4, in <module>
    from .main import (
    ...<13 lines>...
    )
  File "/tmp/cache/q2-new/archive-v0/UHbaC9QcSG9WzgquNtN5V/lib/python3.14/site-packages/lfs_cowsay/main.py", line 4, in <module>
    from .characters import CHARS
  File "/tmp/cache/q2-new/archive-v0/UHbaC9QcSG9WzgquNtN5V/lib/python3.14/site-packages/lfs_cowsay/characters.py", line 2
    oid sha256:4a003717d1e1997c56ef4128f54eb92c72831e44cda4e3d49e93a32ae0cc9222
               ^
SyntaxError: invalid decimal literal

(This does not affect UV_GIT_LFS=1 uv tool install blah + uv tool upgrade, though - we correctly persist the ?lfs=true in the URL. It only affects two independent parses of the same git repo/commit that do not share persistent state but share the cache.)

Again, I don't think this is a problem in practice / don't think this should block merging.
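
The SyntaxError in the second run above is what an unmaterialized Git LFS pointer looks like when Python tries to import it: the `oid sha256:` line in the traceback is the second line of a pointer file. As a minimal sketch, a check like the following could flag such files before they are executed. The helper name `is_lfs_pointer` and the `size` value in the sample are assumptions for illustration; the `version`/`oid`/`size` layout follows the Git LFS pointer file format.

```python
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1\n"


def is_lfs_pointer(data: bytes) -> bool:
    """Heuristically detect a Git LFS pointer file from its contents."""
    if not data.startswith(LFS_POINTER_PREFIX):
        return False
    # Every remaining line is "key value"; a pointer must carry oid and size.
    keys = {line.split(b" ", 1)[0] for line in data.splitlines()[1:] if line}
    return {b"oid", b"size"} <= keys


sample_pointer = (
    b"version https://git-lfs.github.com/spec/v1\n"
    b"oid sha256:4a003717d1e1997c56ef4128f54eb92c72831e44cda4e3d49e93a32ae0cc9222\n"
    b"size 1024\n"  # placeholder size, not taken from the real repository
)
```

Calling `is_lfs_pointer` on the bytes of a checked-out `.py` file would distinguish real source from a pointer that was never smudged, which is the state the non-LFS cache entry ends up in here.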

Member

@konstin konstin left a comment

Thank you!

@Red-Eyed

Red-Eyed commented Dec 3, 2025

Thank you!
Having worked with LFS, I wonder how to deal with different LFS endpoints and credentials.
For example, at work we have GitHub Enterprise and an external Git LFS server (not GitHub).

@samypr100
Collaborator Author

> Thank you! Having worked with LFS, I wonder how to deal with different LFS endpoints and credentials. For example, at work we have GitHub Enterprise and an external Git LFS server (not GitHub).

@Red-Eyed That would still be configured as part of your git configuration. Since LFS objects have the originating URL on them, as long as git itself can authenticate, fetch, and materialize the objects, you shouldn't have issues with uv.

@gsemet

gsemet commented Dec 6, 2025

Great! Is there a way to use it when doing "uv tool install git+https://…"?

@samypr100
Collaborator Author

> Great! Is there a way to use it when doing "uv tool install git+https://…"?

uv tool install --lfs

Labels

enhancement: New feature or improvement to existing functionality
test:macos: Enable macOS tests for a pull request

Development

Successfully merging this pull request may close these issues.

Make UV_GIT_LFS active by default on some url

7 participants