Fix migrate import speed regression #4813
chrisd8088 left a comment
This looks to be an incredibly tidy and elegant solution to a long-standing problem, along with reversing the regression. Wow, thank you! 🙇
I believe this may also resolve #4167, from my own testing. I don't know if it's worth adding another test case, though; I think the one you've added here might be sufficient, although it deals with copied files, not moved ones.
One note: I notice that the commit description and this PR's description mention PR #4176, but I think the PR in question is actually #4671. It might be good to update at least the commit description, for future clarity. (Note that #4671 is different again from #4167, which I mentioned above and which is a still-open bug report.)
Again, thank you for this. ❤️
When we cache files, do so on the full path instead of just the directory entry. This means that when we have an identical file with the same name in two different directories, we distinguish between the two paths and ensure both are added to .gitattributes.

This is an alternate solution to git-lfs#4671 which should perform better. For comparison, running the following command on a clone of Git's main repository:

  git lfs migrate import --everything --include="*.h"

gives:

* v3.0.1 (broken): 608s user, 53s system, 5:34 total
* v3.0.2 (fixed): 13435s user, 1255s system, 1:43:17 total
* this commit (fixed): 716s user, 67s system, 6:59 total

This is a much better performance characteristic for equivalent results.

Preserve the integration test from the earlier attempt at fixing this, and add an additional one. Avoid using assert_pointer in the new test because that helper doesn't always work correctly when there are two files with the same file name.
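To put those timings on a common scale, the following Python sketch converts the wall-clock totals quoted in the commit message to seconds and compares each run against v3.0.1. The helper name `to_seconds` and the dictionary layout are illustrative only; the figures are the ones reported above, not new measurements.

```python
def to_seconds(clock: str) -> int:
    """Convert an "H:MM:SS" or "M:SS" string to a number of seconds."""
    total = 0
    for part in clock.split(":"):
        total = total * 60 + int(part)
    return total


# Wall-clock totals as quoted in the commit message.
runs = {
    "v3.0.1 (broken)": "5:34",
    "v3.0.2 (fixed)": "1:43:17",
    "this commit (fixed)": "6:59",
}

# Express every run as a multiple of the v3.0.1 wall-clock time.
baseline = to_seconds(runs["v3.0.1 (broken)"])
slowdown = {name: to_seconds(clock) / baseline for name, clock in runs.items()}

for name, factor in slowdown.items():
    print(f"{name}: {to_seconds(runs[name])}s wall clock, {factor:.2f}x v3.0.1")
```

By this measure the v3.0.2 fix takes roughly 18.5 times the wall-clock time of v3.0.1, while this commit stays at about 1.25 times.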
When we cache files, do so on the full path instead of just the directory entry. This means that when we have an identical file with the same name in two different directories, we distinguish between the two paths and ensure both are added to .gitattributes.

This is an alternate solution to #4671 which should perform better. For comparison, running `git lfs migrate import --everything --include="*.h"` on a clone of Git's main repository gives:

* v3.0.1 (broken): 608s user, 53s system, 5:34 total
* v3.0.2 (fixed): 13435s user, 1255s system, 1:43:17 total
* this commit (fixed): 716s user, 67s system, 6:59 total

This is a much better performance characteristic for equivalent results.

Preserve the integration test from the earlier attempt at fixing this, and add an additional one. Avoid using `assert_pointer` in the new test because that helper doesn't always work correctly when there are two files with the same file name.

Fixes #4750
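The effect of the cache key can be illustrated with a toy model. The sketch below is not git-lfs's actual Go code: the function `collect_tracked_paths`, the `(name, oid)` and `(path, oid)` key shapes, and the sample file names and object ID are all assumptions made for illustration. It only shows why keying a cache on the directory entry alone can hide a second path that holds identical content under the same file name.

```python
def collect_tracked_paths(files, key_on_full_path):
    """Return the paths a toy rewriter would record for .gitattributes.

    `files` is a list of (full_path, blob_oid) pairs. A path is recorded
    only the first time its cache key is seen.
    """
    cache = set()
    tracked = []
    for path, oid in files:
        name = path.rsplit("/", 1)[-1]
        # Hypothetical key shapes: entry-level key vs. full-path key.
        key = (path, oid) if key_on_full_path else (name, oid)
        if key in cache:
            continue  # cache hit: the path is never recorded
        cache.add(key)
        tracked.append(path)
    return tracked


files = [
    ("a/config.h", "1234abcd"),
    ("b/config.h", "1234abcd"),  # same name and content, different directory
]

by_entry = collect_tracked_paths(files, key_on_full_path=False)
by_path = collect_tracked_paths(files, key_on_full_path=True)

print(by_entry)  # the second path is skipped as a cache hit
print(by_path)   # both paths are recorded
```

In this model the full-path key still lets repeated visits to the same path hit the cache, which is consistent with the timings above staying close to v3.0.1.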