Git-Xet: LFS custom transfer agent with Xet protocol#425
Conversation
jgodlew
left a comment
There was a problem hiding this comment.
Overall, looks good, I mostly have nits.
git_xet/src/auth.rs
Outdated
| // user for a password to fetch public repository data. | ||
| // 2. The Git Remote URL, which should be something like "https://git.com/repo.git" This URL is used for the Git | ||
| // Credential Helper. This way existing https Git remote credentials can be re-used for LFS. | ||
| pub fn get_creds(repo: &GitRepo, remote_url: &GitUrl, operation: Operation) -> Result<Arc<dyn CredentialHelper>> { |
There was a problem hiding this comment.
should we have debug/info logs for the selected credential helper?
| } | ||
|
|
||
| // 6. check Git credential helper | ||
| Ok(GitCredentialHelper::new(repo.git_path()?, &derived_host_url)?) |
There was a problem hiding this comment.
in case this fails as in the git credential returns nothing, maybe log that we failed to use the git credential command.
| struct CliOverrides { | ||
| /// Increase verbosity of output (-v, -vv, etc.) | ||
| #[clap(long, short = 'v', action = ArgAction::Count)] | ||
| pub verbose: u8, |
There was a problem hiding this comment.
we need to set up logging somewhere, unless it's set up already? we should also make distinction for if we respect RUST_LOG over the -v flag
also set a max/min maybe? I'm guessing it's:
0 - error
1 - warn
2 - info
3 - debug
4 - trace
so min 0, max 4
| } | ||
|
|
||
| async fn terminate(&mut self) -> Result<()> { | ||
| Ok(()) |
There was a problem hiding this comment.
Should we log something here? Is terminate() expected to be called by someone? I am wondering if a debug/trace log message makes sense here.
rajatarya
left a comment
There was a problem hiding this comment.
This is overall looking good. Most of my comments are to add documentation/comments and maybe restructure code for easier readability.
The child process part needs some additional testing.
Great work @seanses! This is a big feature and it is great to see its implementation really close to completed!
hoytak
left a comment
There was a problem hiding this comment.
Some minor things, but overall looks good to me. I'm assuming it works overall. I suspect there will be a lot to tweak when it gets used more widely, but it looks solid.
rajatarya
left a comment
There was a problem hiding this comment.
Looking good! I think this is ready to merge now, and any suggestions could come in subsequent PRs.
Love the comments added and the additional tests.
This PR builds a Git integration called
git-xetthat enables users to upload files using the Xet protocol as part of a standard git push.This integration builds on the Git LFS custom transfer adapter protocol, the same mechanism we now use to handle Git LFS uploads for files larger than 5 GB through multipart PUT.
To enable uploads to Xet, users run
git-xet install, which writes the following configuration to the Git config file at a selected scope [--system,--global(default), or--local]:This setup registers a new transfer adapter named xet, allowing Git to delegate LFS file transfers to the git-xet binary when applicable.
On the server side, support is rolled out in two stages:
Stage 1 (Upload): The Git LFS batch API for the "upload" operation is updated.
Stage 2 (Download): The same logic is applied to the Git LFS batch API for the "download" operation.
Server side changes are at https://github.com/huggingface-internal/moon-landing/pull/14572