Skip to content

Optimize pushes for multiple refs#3978

Merged
bk2204 merged 1 commit intogit-lfs:masterfrom
bk2204:multi-left-optimization
Jan 16, 2020
Merged

Optimize pushes for multiple refs#3978
bk2204 merged 1 commit intogit-lfs:masterfrom
bk2204:multi-left-optimization

Conversation

@bk2204
Copy link
Member

@bk2204 bk2204 commented Jan 10, 2020

This PR optimizes pushes for multiple refs by using all the remote sides as bases for crawling git rev-list.

Fixes #3976.

@bk2204 bk2204 force-pushed the multi-left-optimization branch from 0bd8e6e to 012baa7 Compare January 14, 2020 19:51
@bk2204 bk2204 requested a review from a team January 14, 2020 19:52
@bk2204 bk2204 marked this pull request as ready for review January 14, 2020 19:52
@bk2204 bk2204 changed the title WIP: Optimize pushes for multiple refs Optimize pushes for multiple refs Jan 14, 2020
@bk2204 bk2204 force-pushed the multi-left-optimization branch 2 times, most recently from e86c59a to 8e32713 Compare January 15, 2020 16:03
Copy link
Member

@chrisd8088 chrisd8088 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome!

}

// ScanMultiRangeToRemote scans through all commits starting at the left ref but
// not including the right ref (if given)that the given remote does not have.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: might add a space after (if given), but I notice the preceding comment is the same, so maybe not?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I copied and pasted, so will fix. Thanks for noticing.

When pushing multiple refs, we know any Git objects on the remote side
can be excluded from the objects that refer to LFS objects we need to
push, since if the remote side already has the Git objects, it should
have the corresponding LFS objects as well.

However, when traversing Git objects, we traverse them on a per-ref
basis, which is required since any LFS objects which spawn a batch
request will need the ref to be placed in the batch request as part of
the protocol.

Let's find a list of all the remote sides that exist before traversing
any Git objects, and exclude traversing any of those objects in any
traversal.  As a result, we can traverse far, far fewer objects,
especially when pushing new refs in a large repository.

Note that we exclude the case when the left and right sides are the same
because our code sets them to the same thing in some cases even though
Git does not, so we cannot reason about the values in that case.
@bk2204 bk2204 force-pushed the multi-left-optimization branch from 8e32713 to d0e950d Compare January 16, 2020 20:20
@bk2204 bk2204 merged commit 0ea89d0 into git-lfs:master Jan 16, 2020
@bk2204 bk2204 deleted the multi-left-optimization branch January 16, 2020 21:26
@hamj953
Copy link

hamj953 commented Feb 5, 2020

bk2204:multi-left-optimization

chrisd8088 added a commit to chrisd8088/git-lfs that referenced this pull request Feb 6, 2022
We can remove some unused code which determined the "left"
and "right" refs to pass to ScanMultiRangeToRemote() when
uploading, as it was obviated by the change in commit
d0e950d in PR git-lfs#3978 to the
use of a set of refs ("bases") from a single "right" ref.
chrisd8088 added a commit to chrisd8088/git-lfs that referenced this pull request Feb 7, 2022
We can remove some unused code which determined the "left"
and "right" refs to pass to ScanMultiRangeToRemote() when
uploading, as it was obviated by the change in commit
d0e950d in PR git-lfs#3978 to the
use of a set of refs ("bases") from a single "right" ref.
chrisd8088 added a commit to chrisd8088/git-lfs that referenced this pull request Feb 7, 2022
We can remove some unused code which determined the "left"
and "right" refs to pass to ScanMultiRangeToRemote() when
uploading, as it was obviated by the change in commit
d0e950d in PR git-lfs#3978 to the
use of a set of refs ("bases") from a single "right" ref.
chrisd8088 added a commit to chrisd8088/git-lfs that referenced this pull request Jun 6, 2023
In commit d0e950d of PR git-lfs#3978 the
ScanMultiRangeToRemote() method of the GitScanner structure was added,
and the sole caller of the existing ScanRangeToRemote() method was
revised to call the new scan method instead.

As no additional callers of the existing method have been introduced
since then, we can simplify our code by removing the unused method now.
chrisd8088 added a commit to chrisd8088/git-lfs that referenced this pull request Jun 7, 2023
In commit d0e950d of PR git-lfs#3978 the
ScanMultiRangeToRemote() method of the GitScanner structure was added,
and the sole caller of the existing ScanRangeToRemote() method was
revised to call the new scan method instead.

As no additional callers of the existing method have been introduced
since then, we can simplify our code by removing the unused method now.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

git push <tag> performance slowdown due to scanning the entire commit history for LFS objects

3 participants