Conversation
technoweenie
left a comment
There was a problem hiding this comment.
Looking great. I pointed out a few things below:
commands/command_filter_process.go
Outdated
| status = delayedStausFromErr(err) | ||
| } else { | ||
| status = statusFromErr(err) | ||
| } |
There was a problem hiding this comment.
IMO, this nested if is complicated. Would something like this work? It looks like calling Flush() on nil is a no-op any way.
if delayed {
s.WriteStatus(delayedStausFromErr(err))
} else if ferr := w.Flush(); ferr != nil {
s.WriteStatus(statusFromErr(ferr))
} else {
s.WriteStatus(statusFromErr(err))
}Continuing to declare status outside of that scope is fine too, but it didn't look like it was needed anywhere else.
BTW delayedStausFromErr has a typo :)
There was a problem hiding this comment.
Way better -- thanks for this suggestion. Changed things up in 1001e1c (and I got that typo!)
commands/command_smudge.go
Outdated
| // | ||
| // delayedSmudge returns the number of bytes written, whether the checkout was | ||
| // delayed, the *lfs.Pointer that was smudged, and an error, if one occurred. | ||
| func delayedSmudge(s *git.FilterProcessScanner, to io.Writer, from io.Reader, tq *tq.TransferQueue, filename string, skip bool, filter *filepathfilter.Filter) (int64, bool, *lfs.Pointer, error) { |
There was a problem hiding this comment.
The tq parameter clobbers the tq package name. Not critical of course, but may be worth changing.
There was a problem hiding this comment.
Good call -- I think we should be fine for now, since we don't use any other exported members from the tq package, but this will be a good way to prevent an ambiguous diff later on if that changes. Covered this one in: a9942cd.
commands/command_filter_process.go
Outdated
| } | ||
| } | ||
| case "list_available_blobs": | ||
| closeOnce.Do(func() { go q.Wait() }) |
There was a problem hiding this comment.
I could use some clarification here. It's my understanding that a transfer queue can accept new transfers until Wait() is called. If I'm interpreting the protocol docs correctly, then the process should look like this:
- Git sends all pointers through
smudgewithcan-delay=1 - Git sends
list_available_blobs. The transfer queue is closed. - Git sends more
smudgecommands for each available blob, never callinglist_available_blobsagain. - Profit!!
There was a problem hiding this comment.
🤔 not quite -- I think the confusion has to do with the readAvailable() func. readAvailable does as many reads as possible before <-tq.Watch() becomes blocking, responding to a list_available_blobs command with any blobs that it currently has ready, or blocking until at least one blob is ready.
The process looks like this:
- Git sends all pointers through
command=smudgewithcan-delay=1. - Git issues a
list_available_blobscommand, indicating that no more new blobs will be sent as above. The transfer queue is accordingly closed. - We read as many currently available blobs as we can before blocking, or block until a) at least one blob becomes available, or b) the transfer queue has no more blobs to send, in which case we send an empty list, and the process is complete.
- Git send a
command=smudgewithcan-delay=0for each blob in the list above. - Git issues another
list_available_blobscommand. - See step 3.
I did my best to clarify this in 8816463.
| if !ok { | ||
| return ts | ||
| } | ||
| return append(ts, t) |
There was a problem hiding this comment.
Does this part need to exist? If ts is empty, the for should start another iteration that'll attempt to read from the channel. Very possible there's some detail about select that I'm forgetting...
There was a problem hiding this comment.
I think so, though the duplication is unfortunate. Specifically, this return append, could be turned into an append only, but the blocking <-ch read causes the enclosing for loop to not spin.
There was a problem hiding this comment.
Jeez, the return makes all the difference in the world. Thanks.
test/test-track.sh
Outdated
|
|
||
| if [ ! -d .git/info ]; then | ||
| mkdir .git/info | ||
| fi |
There was a problem hiding this comment.
FWIW I think you could replace this with mkdir -p .git/info.
There was a problem hiding this comment.
Good call -- I got this one in: bc376b8.
technoweenie
left a comment
There was a problem hiding this comment.
Really excited to see this land!
|
As suggested by @ttaylorr, I added a progress indicator for filtering content in Git and posted the patch to the Git mailing list. If you clone a repo it would look like this: If you checkout a commit that references many LFS files that you don't have locally on your machine yet, then you would see something like this: A nice side effect is that users can easily see the bandwidth performance of their LFS servers 🙂 |
|
There might be a bug in the current (7115b57) delay implementation. Try these steps: (1) Checkout a branch of a repo with files in LFS The checkout operation should be blazingly fast as all objects should be available locally already. However, it looks like as if Git LFS downloads all objects, again. |
@larsxschneider great catch, this was definitely a bug. With c04745c: EDIT: the above commit had a bug where it would not smudge pointers for which the local object was cached. I fixed that in: 50ecd67. |
commands/command_smudge.go
Outdated
| } | ||
|
|
||
| // Write 'statusFromErr(nil)', even though 'perr != nil', since | ||
| // we are about to write non-delayed smudged contents to "to". |
There was a problem hiding this comment.
The fix seems to work! Thank you 👍! However, I still try to decipher the comment 😄. perr should be nil here, no? TBH I don't really understand the same comment a few lines above either. Can you give me a hint?
There was a problem hiding this comment.
Plus, could it be that this was the reason for the degraded performance in the lfs.concurrenttransfers=1 case?
There was a problem hiding this comment.
Ah, good question -- I think the comment is out of date but the code is correct here: this case occurs when we have the contents locally and can smudge without delay. It is certainly possible that this could affect cases where lfs.concurrenttransfers=1, but I think this would be negligible. I fixed this up in 4cefa24.
|
Cloning repositories via SSH with a large number of LFS files seems to consistently result in a "batch response: Bad credentials" error: Plus, in case of an error Git LFS prints |
Hmm, I wouldn't think this PR would affect that. It's probably an unrelated issue. Can you try with an HTTPS remote (just to confirm whether this is a general transfer queue issue, an ssh auth issue, or a delay filter issue)? As for the SSH thing, my hunch is that the expiration timeout is not being respected. Is this happening at around the 10 minute mark? Two potential spots that I can think of:
I think we still need some output for legacy smudges. The filter should disable them (or even implement its own progress output). |
larsxschneider
left a comment
There was a problem hiding this comment.
I added a few nits that you can address if you want (most relevant from my point of view is the delayedStatusFromErr suggestion).
I am happy to see that merged 👍
|
🙇 thanks for the review, everybody! |
|
What git version do I need for this? Yet it doesn't seem to work, no concurrent transfers when I do For cloning the same repository: git lfs clone: git clone: |
|
@jjgod You need the upcoming Git version 2.15 😄 |
Clone speed is fine but no progress output and looks like hang: Speed: git lfs clone |
|
@larsxschneider Strange, seems like git/git@52f1d62e is merged to git 2.15 already. macOS output is without progress meter too. |
|
Yeah, that is indeed strange. However, I only noticed it on Windows: https://public-inbox.org/git/7A5EBFA0-3DC4-4AA1-91DB-450DC89BA8E7@gmail.com/
I am AFK for a two weeks but I'll investigate when I am back.
… —
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.
|
|
@larsxschneider, missed that thread. On your test repo Git and Git LFS are installed from Homebrew. |
|
@larsxschneider just updated to git 2.15.1 and still no progress showing on clone. Any updates? |
|
Thanks @DJm00n ! I can reproduce the issue with your repo! Let's see if I can find the problem... |
|
@DJm00n I think I found and fixed the issue. Let's see what the Git experts think: https://public-inbox.org/git/20171204203647.30546-1-lars.schneider@autodesk.com/ |
@larsxschneider Excellent find, and thanks for keeping us up-to-date. |
|
Sorry for the late question, but does this mean that by upgrading lfs & git that the speed will increase for |
That's correct! |
The "git lfs clone" command has been deprecated since Git LFS v2.3.0 and Git v2.15.0 were released, so we update our manual page for the command to reflect this fact and clarify when and why the command should still be used, if at all. As discussed in git-lfs#2466, Git's support for long-running filter processes (introduced in Git v2.11.0 via the filter.<driver>.process setting) was enhanced in v2.15.0 with the "delay" capability whereby the filter can signal that it will operate asynchronously on multiple blobs, thus allowing it to process multiple blobs in parallel. Git LFS was updated to make use of the "delay" capability in a series of PRs culminating in PR git-lfs#2511, all of which were included in the Git LFS v2.3.0 release. We also make some minor edits to the manual page to improve the formatting of literal strings.
The "git lfs clone" command has been deprecated since Git LFS v2.3.0 and Git v2.15.0 were released, so we update our manual page for the command to reflect this fact and clarify when and why the command should still be used, if at all. As discussed in git-lfs#2466, Git's support for long-running filter processes (introduced in Git v2.11.0 via the "filter.<driver>.process" setting) was enhanced in v2.15.0 with the "delay" capability whereby the filter can signal that it will operate asynchronously on multiple blobs, thus allowing it to process multiple blobs in parallel. Git LFS was updated to make use of the "delay" capability in a series of PRs culminating in PR git-lfs#2511, all of which were included in the Git LFS v2.3.0 release.
This pull requests teaches the
git-lfs-filter-process(1)command the 'delay' capability that was proposed by @larsxschneider in #1632 (comment) and added to Git in git/git@51b8aec.This implementation allows the process filter to respond with a
status=delayedwhen asked to smudge the contents of a file that it does not already have in the local.git/lfs/objectscache.If any items were delayed, Git will later ask for blobs that were available with the
list_available_blobscommand, which Git LFS will either respond with the immediately available blobs, wait until at least one becomes available, or respond with an empty set indicating that the transfer is complete.To implement this, I added a new
delayedSmudgehelper function, which behaves the same way assmudge(), but instead adds the pointer to*tq.TransferQueuegiven as an argument.Here are some benchmarks when running
git cloneon GitHub.com and a local copy of GitHub:lfs.concurrenttransfers=1...=3...=8...=16...=32v2.13.2v2.14.1.nextFor a delta of:
lfs.concurrenttransfers=1...=3...=8...=16...=32Excluding outliers:
v2.14.1.nextThere were three big performance changes that impact this improvement:
tq.(*TransferQueue).Add()function, which allows theTransferQueueto accept new items sent by Git, instead of waiting until the first batch of items were transferred.list_available_blobsuntiltq.Wait()returns. Instead, read fromtq.Watch()until a read becomes blocking (viaselect { case <-tq.Wait(): ... default: }. This allows Git to perform the checkout while thetq.(*TransferQueue)is still transferring items.Following this, I'd like to deprecate the
git lfs clonecommand, sincegit cloneis now as fast or faster thangit lfs cloneas of the latest tip ofgit.git.EDIT: the each number is the average of 5 runs of a given command, with the outliers removed. The
benchmarkcommand is written as: benchmark.sh. Theaverage-timingscommand is written as: average-timings.sh.Closes: #2466.
/cc @git-lfs/core
/cc @larsxschneider