-
Notifications
You must be signed in to change notification settings - Fork 2.2k
Description
Describe the bug
If a unique Git LFS object is added in a merge commit, certain Git LFS comments such as git lfs prune appear to treat it differently than other objects, such as by culling it during a prune operation even if it has not been pushed to any remote.
To Reproduce
$ git init mergetest
$ cd mergetest
$ git lfs track '*.bin'
$ git add .gitattributes
$ git commit -m init
$ git checkout -b branch
$ echo branch >foo.bin
$ git add foo.bin
$ git commit -m branch
$ git checkout main
$ echo main >foo.bin
$ git add foo.bin
$ git commit -m main
# will stop with a merge conflict
$ git merge branch
CONFLICT (add/add): Merge conflict in foo.bin
Auto-merging foo.bin
Automatic merge failed; fix conflicts and then commit the result.
Encountered 1 file(s) that should have been pointers, but weren't:
foo.bin
# create a unique object and add it in the merge commit
$ echo merge >foo.bin
$ git add foo.bin
$ git commit -m merge
# add an extra object so HEAD doesn't point to the merge commit, because
# "git lfs prune" always retains the current ref's objects
$ echo latest >foo.bin
$ git add foo.bin
$ git commit -m latest
$ git lfs prune
prune: 4 local object(s), 3 retained, done.
prune: Deleting objects: 100% (1/1), done.
$ git checkout HEAD^
Downloading foo.bin (6 B)
Error downloading object: foo.bin (ddc40d6): Smudge error: Error downloading foo.bin (ddc40d62ea7ad9b37db477569e3617bb57e2f639111752fa70a7fa41f91170a5): batch request: missing protocol: ""
Expected behavior
By contrast, if we follow the same steps above but resolve the merge conflict by reusing one of the existing Git LFS objects, nothing is pruned, which is what we would expect:
# after merge commit, re-create existing object and use it in the merge commit
$ echo main >foo.bin
$ git add foo.bin
$ git commit -m merge
# add an extra object so HEAD doesn't point to the merge commit, because
# "git lfs prune" always retains the current ref's objects
$ echo latest >foo.bin
$ git add foo.bin
$ git commit -m latest
$ git lfs prune
prune: 3 local object(s), 3 retained, done.
$ git checkout HEAD^
...
HEAD is now at f4496b4 merge
System environment
This was demonstrated on macOS, but any environment should be similar.
Output of git lfs env
git-lfs/2.13.3 (GitHub; darwin amd64; go 1.16.6; git a5e65851)
git version 2.33.0
LocalWorkingDir=.../mergetest
LocalGitDir=.../mergetest/.git
LocalGitStorageDir=.../mergetest/.git
LocalMediaDir=.../mergetest/.git/lfs/objects
LocalReferenceDirs=
TempDir=.../mergetest/.git/lfs/tmp
ConcurrentTransfers=8
TusTransfers=false
BasicTransfersOnly=false
SkipDownloadErrors=false
FetchRecentAlways=false
FetchRecentRefsDays=7
FetchRecentCommitsDays=0
FetchRecentRefsIncludeRemotes=true
PruneOffsetDays=3
PruneVerifyRemoteAlways=false
PruneRemoteName=origin
LfsStorageDir=.../mergetest/.git/lfs
AccessDownload=none
AccessUpload=none
DownloadTransfers=basic,lfs-standalone-file
UploadTransfers=basic,lfs-standalone-file
git config filter.lfs.process = "git-lfs filter-process"
git config filter.lfs.smudge = "git-lfs smudge -- %f"
git config filter.lfs.clean = "git-lfs clean -- %f"
Additional context
Using GIT_TRACE=1 we can see a bit more of what's going on in the unique object merge commit case:
$ GIT_TRACE=1 git lfs prune
...
15:25:56.717947 trace git-lfs: RETAIN: 715760eedeabb0ca7b5758d4536e78c4c06cad699caa912bf1ef0f483b103efc unpushed
15:25:56.718214 trace git-lfs: RETAIN: 6403203dd5a0867eb14d104ee8a73730bd72dd9ad92e78d996a6dba0a5dcfc01 unpushed
15:25:56.718802 trace git-lfs: RETAIN: 2bf74fd761df1710dae82a5459f89af15375f36e700c31156c991e22b9fe7edb unpushed
...
To scan for unpushed objects, the scanUnpushed() method uses a set of git log options which includes -p but not --first-parent or any --diff-merges variant. The docs for git log say:
Note that unless one of
--diff-mergesvariants (including short-m,-c, and--ccoptions) is explicitly given, merge commits will not show a diff, even if a diff format like--patchis selected, nor will they match search options like-S. The exception is when--first-parentis in use, in which casefirst-parentis the default format.
So although there is a regular expresssion which is used to scan the git log output for any diff --cc lines, as far as I can see, this has never actually been functional. (The expression and scanner were added in commit 90ce638 in PR #559.) Moreover, if we did supply the --cc/--diff-merges=dense-combined option to git log, the output would then include combined diffs, which the other regular expression that picks out Git LFS pointer data would not identify because it requires a single leading +, -, or space character, corresponding to the normal diff format.
We specifically used the -m/--diff-merges=on option when scanning for stashed refs in order to avoid the need to parse the combined diff format, as per commit 34b323d in PR #4209. Possibly something similar can be done in the more generic git log parsing case as well. If so, then we ought to also add tests which exercise this behaviour in multiple instances, presumably by making use of the apparent ability of the lfstest-testutils program to support the creation of text fixtures with merge commits with multiple parents. As far as I can tell, none of the test suite scripts utilize this functionality, though; they all just use the common idiom of "ParentBranches":["main"] with a single parent specified.
Note that because this log scanning code is used in several ways, such as in the logPreviousSHAs() method called by git lfs fetch, it may be possible to construct similar demonstration cases for that command which unexpectedly exclude or ignore Git LFS objects added in merge commits.