Skip to content

prune: fix deleting objects referred to by stashes#4209

Merged
chrisd8088 merged 16 commits intogit-lfs:masterfrom
sinbad:prune-fix-stash-delete
Sep 4, 2020
Merged

prune: fix deleting objects referred to by stashes#4209
chrisd8088 merged 16 commits intogit-lfs:masterfrom
sinbad:prune-fix-stash-delete

Conversation

@sinbad
Copy link
Contributor

@sinbad sinbad commented Aug 7, 2020

It was impossible to pop a stash with LFS data successfully
after running any lfs prune command before this fix, because
prune would consider stashed data unreferenced.

This fixes #4206

It was impossible to pop a stash with LFS data successfully
after running any lfs prune command before this fix, because
prune would consider stashed data unreferenced.

This fixes git-lfs#4206
@sinbad
Copy link
Contributor Author

sinbad commented Aug 7, 2020

FYI: I wrote the git lfs prune command originally so this is my bad, hence why I'm fixing it 😃

@chrisd8088
Copy link
Member

Hey, thanks for both the bug report and the proposed patch! I was intending to look at the bug report today, and am delighted to have a potential solution in hand before I even do so! 😄

On a first glance this seems reasonable, but to get the CI jobs to pass I think we'll need to fix the trailing whitespace issue in t-prune.sh (I believe there's a missing final newline?)

Thanks again for the patch; that's always appreciated!

@sinbad
Copy link
Contributor Author

sinbad commented Aug 7, 2020

Ah yes, sorry I ran the tests individually locally rather than the full CI build, I forgot to check the trailing new line. Due to time zone I won’t be able to fix this until tomorrow but feel free to patch it up yourself if you want / have time. :)

@sinbad
Copy link
Contributor Author

sinbad commented Aug 8, 2020

Might need some help with this, I've tried to please the cibuild god but it keeps complaining about whitespace; I've added a couple of newlines but no luck. Its been a few years since I've had to bash my head against it so maybe there's a trick I've forgotten :)

Copy link
Member

@chrisd8088 chrisd8088 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much for putting this together; I would have had to start from zero and this is so much better than that!

I noted a few things I discovered while poking around; see my inline comments for those ideas and concerns.

In addition, one other change we should include is an update to the man page where it lists the places it checks for "active" refs which should not be pruned.

Oh, and thank you also very much for including a test!! ❤️

Comment on lines +68 to +89
// First get the SHAs of all stashes
// git reflog show --format="%H" stash
reflogArgs := []string{"show", "--format=%H", "stash"}

cmd, err := git.RefLog(reflogArgs...)
if err != nil {
return err
}

cmd.Start()
defer cmd.Wait()

scanner := bufio.NewScanner(cmd.Stdout)

for scanner.Scan() {
err = s.ScanRef(strings.TrimSpace(scanner.Text()), cb)
if err != nil {
return err
}
}

return nil
Copy link
Member

@chrisd8088 chrisd8088 Aug 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After a bit of a deeper look, I had a few notes on this scanStashed() function.

The first thing which puzzled me for a while was that it seemed to succeed in a freshly initialized, empty repo with no stashes, where .git/refs/stash doesn't exist yet. I couldn't get git reflog show stash to work on the command line, but it seemed to be OK when run this way.

But the reason why is that the possible error exit code returned by cmd.Wait() is not being checked. If you add to the end of this function something like:

        stderr, _ := ioutil.ReadAll(cmd.Stderr)
        err = cmd.Wait()
        if err != nil {
                return fmt.Errorf("error in git reflog: %v %v", err, string(stderr))
        }

then it does fail in a repo where there's no refs/stash ref, or if you change "stash" to "foobar" or something. So that's the first issue, that the Git command's exit code is not checked.

Looking at how the git stash list command works, I believe it starts by checking for an extant refs/stash ref, and exiting early if it doesn't exist.

So we probably need something similar here once we're checking the Git command's exit code; otherwise we'll bail with Prune error: error in git reflog: exit status 128 fatal: ambiguous argument 'stash': unknown revision or path not in the working tree. (That's what I get once I added the check above on cmd.Wait()'s return value, plus failures in the t/t-prune.sh and other tests.)

Beyond that, I looking again at how git stash list uses the git log -g refs/stash -- command, I wondered if we could avoid the need for a new git.RefLog() function in git/git.go entirely and just use git.Log()?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I worked up a tentative first half of an alternate version of this function, which checks for refs/stash first, and then uses git.Log(). The hard part, which I skipped for now, is reading the output; I thought that could borrow from the design of parseScannerLogOutput() but without the need for a direction parameter.

func scanStashed(cb GitScannerFoundPointer, s *GitScanner) error {
        if _, err := git.ResolveRef("refs/stash"); err != nil {
                return nil
        }

        logArgs := []string{"-g"}

        // Add standard search args to find lfs references
        logArgs = append(logArgs, logLfsSearchArgs...)
        // Search reflog (with -g) for refs/stash
        logArgs = append(logArgs, "refs/stash", "--")

        cmd, err := git.Log(logArgs...)
        if err != nil {
                return err
        }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, personally, I think the original version is more elegant - I know it's not checking the result code, but this is a pattern that already exists in the codebase (see RemoteList, RemoteURLs, RecentBranches). The only reason for this command to fail (and others to not) is that there are no stashes to list.

I didn't use git log because there's no need to "walk" the history - all stashes are singular commits. reflog is more efficient to get the data needed and I thought the RefLog command might be useful later on. But I don't mind if you'd prefer to write it another way.

Copy link
Member

@chrisd8088 chrisd8088 Aug 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see the similarity to git.RemoteList() and friends now; I had not thought to look at those, only at the other functions nearby. Well, given that, maybe there's a tidy way to combine the two approaches and get a minimal set of changes?

Here's what seems to be working for me tonight, replacing just the git reflog show with git log -g, i.e., something like:

	// git log -g --format="%H" refs/stash --
	logArgs := []string{"-g", "--format=%H", "refs/stash", "--"}

	cmd, err := git.Log(logArgs...)

I can't claim to be an expert on the exact functional differences between those two commands (but I'll ask an expert :-) ... however, if they're effectively the same for this purpose, that would allow for reuse of the git.Log() function and avoid the need to add git.RefLog(). Although I confess this is a very small thing, really; I could go either way (particularly if it turns out there's a specific advantage to the git reflog command)!

Thanks again so much for working on this; it's very much appreciated.

Copy link
Contributor Author

@sinbad sinbad Aug 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK you're right I think -g is basically an alias for reflog and makes log behave very differently. Unfortunately for the same reason -g makes some of the other options in logLfsSearchArgs to git log stop working, such as the -G "oid sha256:" search, so we can't combine everything into one git log and send that to parseScannerLogOutput. So you're still down to executing 2 commands, albeit that both of them are now git log so the RefLog command isn't needed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I tossed out my attempt to use logLfsSearchArgs; that was a non-starter. All I was thinking of was changing the first couple of lines of this function to use git.Log() and the git log -g ... command format. (And according to the folks I consulted, git log -g is equivalent to git reflog show.

That said, based on the feedback I got, I realized there may be a different, more "interesting" issue yet to solve here, which is what happens when someone uses git stash pop --index and had LFS file changes in both their index and their working tree. I'll write that up in the main comments, though.

git/git.go Outdated
Comment on lines +272 to +276
func RefLog(args ...string) (*subprocess.BufferedCmd, error) {
reflogArgs := append([]string{"reflog"}, args...)
return gitNoLFSBuffered(reflogArgs...)
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my other inline comment on scanStashed() for some musings on whether we could dodge the need for this new function by using git.Log() to read the stashed refs instead.

@chrisd8088
Copy link
Member

chrisd8088 commented Aug 12, 2020

What was pointed out to me today is that the stash uses multiple commits to store the state of the working tree and the index separately (thanks, @bk2204!), and our various experiments with git reflog show and git log -p both only give the SHAs of the working tree stash commits. That got me thinking that if someone used git stash pop --index and then tried to restore the working tree to the index, it might still cause a problem, even with this PR's changes. And indeed that seems to be the case.

I wrote up a test which demonstrates the outstanding issue here, I believe; at least for me, using this PR as it stands, it failed on the second assert_local_object "$oid_indexstashed" "${#content_indexstashed}" (and in a simpler manual test, without the assert_local_object test helpers, I get a failure on git checkout . when the smudge filter can't download the missing LFS data):

begin_test "prune keep stashed changes in index"
(
  set -e

  reponame="prune_keep_stashed_index"
  setup_remote_repo "remote_$reponame"

  clone_repo "remote_$reponame" "clone_$reponame"

  git lfs track "*.dat" 2>&1 | tee track.log
  grep "Tracking \"\*.dat\"" track.log

  # generate content we'll use
  content_inrepo="This is the original committed data"
  oid_inrepo=$(calc_oid "$content_inrepo")
  content_indexstashed="This data will be stashed from the index and should not be deleted"
  oid_indexstashed=$(calc_oid "$content_indexstashed")
  content_stashed="This data will be stashed and should not be deleted"
  oid_stashed=$(calc_oid "$content_stashed")

  # We just need one commit of base data, makes it easier to test stash
  echo "[
  {
    \"CommitDate\":\"$(get_date -1d)\",
    \"Files\":[
      {\"Filename\":\"stashedfile.dat\",\"Size\":${#content_inrepo}, \"Data\":\"$content_inrepo\"}]
  }
  ]" | lfstest-testutils addcommits

  # now modify the file, and add it to the index
  echo -n "$content_indexstashed" > stashedfile.dat
  git add stashedfile.dat

  # now modify the file again, and stash it
  echo -n "$content_stashed" > stashedfile.dat

  git stash

  # Prove that the stashed data was stored in LFS (should call clean filter)
  assert_local_object "$oid_stashed" "${#content_stashed}"
  assert_local_object "$oid_indexstashed" "${#content_indexstashed}"

  # Prune data, should NOT delete stashed file or stashed changes to index
  git lfs prune

  assert_local_object "$oid_stashed" "${#content_stashed}"
  assert_local_object "$oid_indexstashed" "${#content_indexstashed}"

  # Restore working tree from stash
  git stash pop --index

  # Reset working tree to index from stash
  git checkout .
)
end_test

So I think we should add a test like this, first of all, and also consider how best to walk back through all the SHAs associated with the stash.

@sinbad
Copy link
Contributor Author

sinbad commented Aug 12, 2020

Nice catch, I never realised stash could create >1 commit but thinking about it I guess it makes sense. I guess it can only ever create 2. I'll have another crack at it.

@sinbad
Copy link
Contributor Author

sinbad commented Aug 12, 2020

Here's one failed attempt that I thought would work:

func scanStashed(cb GitScannerFoundPointer, s *GitScanner) error {
	// First get the SHAs of all stashes
	logArgs := []string{"-g", "--format=%h", "refs/stash", "--"}

	cmd, err := git.Log(logArgs...)
	if err != nil {
		return err
	}

	cmd.Start()
	defer cmd.Wait()

	scanner := bufio.NewScanner(cmd.Stdout)

	var stashShas []string
	for scanner.Scan() {
		stashShas = append(stashShas, strings.TrimSpace(scanner.Text()))
	}

	// Stashes are 2 commits, one for index and one for working copy
	// Preserve changes in both separately, in case same file modified twice
	logArgs = stashShas // include all stashes
	logArgs = append(logArgs, "--not", "--branches", "--tags") // exclude those commits reachable locally otherwise
	// Add standard search args to find lfs references
	logArgs = append(logArgs, logLfsSearchArgs...)

	cmd, err = git.Log(logArgs...)
	if err != nil {
		return err
	}

	parseScannerLogOutput(cb, LogDiffAdditions, cmd)
	return nil
}

While this does list the correct commits, the diff generated by the -p option only includes the OID for the combined change for the entire stash, i.e. the working copy version and not the one that was in the index. I can't seem to get the diff to output both and I'm not sure why, it's not how it behaves for other commits.

Looking at the result of git show -p <stash_sha> it seems to be that it knows the changes are different to a normal log entry:

- oid sha256:67ae8ef97d615140fcb42b3fa7e300a5ce5960cd8ff9807eef3dcb85aec3cecd
- size 25
 -oid sha256:32609ffbe33e263e807f7e3b8ee6ae7d990da65de5c15a51f1b62782a8cc7276
 -size 43
++oid sha256:c9d22606d40fec4e99cc7d09fbef109f5301a30e35c177e0d14bafb36894cdc5
++size 64

This looks more like a git status --porcelain style output where positioning of the -/+ tells you whether it's the index or working copy version. In the example above OID 67ae8ef9 (- in col 0) was the previous commit, OID 32609ffbe (- in col 1) was the stashed index, and OID c9d22606d (++) was the working copy version. git log only shows the first and last OIDs even if you're definitely getting the commit inbetween in the output, the diff for it just isn't there.

FYI for stash commits that only had the working copy, git show -p <stash_sha> displays:

--oid sha256:67ae8ef97d615140fcb42b3fa7e300a5ce5960cd8ff9807eef3dcb85aec3cecd
--size 25
++oid sha256:c7fe2c1adc4c9fbe8df7d1bd56b85d98e52288ffc6547fcb7a15fab39a4942ed
++size 15

So I think the best approach is going to be to use git show -p <stash_sha>... and parse out any OIDs which have either "+" or "-" in column 1 of the diff output. That should pick up the index and working copy state of stashes. Thoughts? I'd love to figure out a way to get git log to output that full detail like git show but so far I've had no luck.

@sinbad
Copy link
Contributor Author

sinbad commented Aug 12, 2020

Sorry I pasted the wrong "failed attempt" function in the previous comment at first, fixed now.

@chrisd8088
Copy link
Member

chrisd8088 commented Aug 12, 2020

One other thing we should consider (again, based on the tips I got about how git stash push operates) is that git stash push -u makes yet a third commit, to capture any untracked files.

Doing something like:

echo 123 > foo.dat; git add foo.dat
git commit -m initial
echo 456 > foo.dat
git add foo.dat
echo 789 > foo.dat
echo abc > bar.dat
git stash push -u
git log -g refs/stash

shows something like the following, where the three SHAs in the Merge: line represent the tip of HEAD at the time of the stash (the first parent), the stashed state of the index, and the stashed untracked files from the working tree:

Reflog message: WIP on master: 2678f27 initial
Merge: 2678f27 2762e90 a71ca8c

and the merge commit itself has the stashed changes to the tracked files from the working tree.

@sinbad
Copy link
Contributor Author

sinbad commented Aug 12, 2020

OK that suggests that we need both commands. To summarise:

  1. git log -g --format=%h refs/stash -- to get the 'leaf' stash SHAs
  2. git log -p <stashSHA>... --not --branches --tags (+logSearchArgs) to pick up changes from untracked files and the final diff (but it will miss the diff between working copy and index)
  3. git show -p <stashSHA>... to pick up the diff between working copy & index on same file, if it exists

The set of OIDs from all of those should be the complete set.

@sinbad
Copy link
Contributor Author

sinbad commented Aug 12, 2020

Alternatively if we used the "Merge" line:

Reflog message: WIP on master: a70e14d First
Merge: a70e14d 4cf30f4 1bad043

In my case a70e14d is the parent commit of the stash (which should be ignored), 4cf30f4 is the index contents and 1bad043 is the untracked files. The leaf SHA is then the working copy. If that's a consistent we could ignore the first SHA on the merge line and use the rest plus the leaf as arguments to git show. It's a little more parsing though (becomes 2 stages); in the previous suggestion you only need the leaf SHAs and everything else is discoverable via OID lines (albeit with a slightly different regex to handle index/working copy).

@chrisd8088
Copy link
Member

Yes, exactly -- I realized I needed to edit my description of which commits were which in #4209 (comment) and then noticed you'd already explained the relevant commits more correctly. :-)

@sinbad
Copy link
Contributor Author

sinbad commented Aug 13, 2020

Bad news, I'm not sure exactly what version of git started reporting merge parents for stashes, but it was somewhere later than git 2.7 because during my testing on an older linux setup with older git, I couldn't get git log -g refs/stash to report the merge line, or the parent data when using --format=%p. It's just blank. So I don't think we want to rely on that behaviour.

@sinbad
Copy link
Contributor Author

sinbad commented Aug 13, 2020

For reference in this git 2.7 the parentage of the commits are still the same, it's just that git log -g --format=%p doesn't report them (and neither does git reflog). So at some point the reflog must have been enhanced to support this.

git show --format=%p does show the merge parents of a stash leaf commit, however. So that's probably the way to go.

@sinbad
Copy link
Contributor Author

sinbad commented Aug 13, 2020

OK this took more a lot more work than I expected, but it's finally working. I had to add a "Show" command and I had to also change the OID pointer diff parsing because it couldn't handle merge commits. I added a test which includes all 3 kinds of stash change at once because there was a bug that only happened when you had all 3 together.

Hoping this is all OK now!

@sinbad sinbad requested a review from chrisd8088 August 13, 2020 13:36
@chrisd8088
Copy link
Member

OK, wow, that's a lot of work indeed, thank you! I plan to take a look tomorrow but from the tests it looks like this covers a lot of territory and all the use cases. ❤️

@chrisd8088
Copy link
Member

Thanks so much for sticking with this! I just wanted to say I'm still mid-review and have not forgotten this, and will get you more feedback as soon as I can. Thanks so much again!

Copy link
Member

@chrisd8088 chrisd8088 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is definitely heading in the right direction; thank you very much for the full test suite (yay!) and the manual page change, as well as all the implementation work!

I wondered about two things in particular, one of which I've tried in 7559392 and which seemed to work (avoiding the need to parse combined diffs), and the other of which I only speculated on, which is maybe using the same git log -m trick to just get all the necessary data from the SHAs we get from the initial reflog query (the git log -g one).

I will try to get to some experimentation with that latter idea, but may not achieve it for a day or more. My apologies for the slow responses here, and thank you so much again for all the time and effort getting this into shape (not to mention reporting it in the first place!)

any which are not referenced by at least ONE of the following:

* the current checkout
* all existing stashes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this!

When looking for recent objects to retain during a "git lfs prune",
we can parse the output from "git show" on the SHAs from merge
commits generated by "git stash" without needing to parse the
combined diff format if we pass the -m flag to "git show".  This
forces Git to generate diffs of the individual parents of the
merge; see https://git-scm.com/docs/git-show#_combined_diff_format
for details.

This allows us to simplify some of the changes from commit
4049928 and keep the scan() and
newLogScanner() functions unchanged from before, when they only
accepted traditional unified diffs, not combined diffs.
When looking for recent objects to retain during a "git lfs prune",
we use "git log -m" to collect diffs of all stash commits, including
all the parents of the WIP merge commits from the reflog, where
index changes and untracked files are recorded for each stash.

This allows us to further simplify some of the changes from commit
4049928 and avoid the need to
introduce a git.Show() function to use "git show"; it also lets
us iterate just twice, once over the reflog for refs/stash to get
the SHAs of the WIP merge commits, and then over all the individual
parent diffs output by "git log -m" for those merges.
Copy link
Member

@chrisd8088 chrisd8088 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to approve this as it now stands, but also likely ask @bk2204 to take a second look, as I've added some of my own work here.

@chrisd8088 chrisd8088 requested a review from bk2204 August 27, 2020 05:53
@sinbad
Copy link
Contributor Author

sinbad commented Aug 27, 2020

Looks good to me! 👍

Copy link
Member

@bk2204 bk2204 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, looks great. I found two minor things that we might want to fix, but I'm pretty happy with the way this turned out.

Per PR advice from bk2204, we switch to using printf instead of
echo in the t-prune.go tests of stashed changes.

We also fixup one code comment to match the count of processes
in commands/command_prune.go.
@chrisd8088 chrisd8088 merged commit 3fd8660 into git-lfs:master Sep 4, 2020
@chrisd8088 chrisd8088 mentioned this pull request Aug 13, 2021
chrisd8088 added a commit to chrisd8088/git-lfs that referenced this pull request Apr 25, 2022
The "prune keep stashed changes", "prune keep stashed changes
in index", and "prune keep stashed untracked files" tests ensure
that Git LFS objects referenced by stashes are always retained
when the "git lfs prune" command is run.

However, they do not at present check that objects belonging to
files which match a "lfs.fetchexclude" pattern, which would
otherwise be pruned regardless of their status, will be retained
because they have been stashed.  Logic in the "git lfs prune"
command to retain objects from stashed changes was added in
PR git-lfs#4209, and it should demonstrate this behaviour, but no checks
actually confirm it.

We therefore expand these three tests so that their simulated
commit histories include objects whose file path matches a path we
then exclude using the "lfs.fetchexclude" filter.  We then confirm
that all these objects are retained by "git lfs prune" when they are
stashed, included when they are stashed from the index and when they
are untracked files stashed with "git stash -u".
chrisd8088 added a commit to chrisd8088/git-lfs that referenced this pull request Apr 25, 2022
The "prune keep stashed changes", "prune keep stashed changes
in index", and "prune keep stashed untracked files" tests ensure
that Git LFS objects referenced by stashes are always retained
when the "git lfs prune" command is run.

However, they do not at present check that objects belonging to
files which match a "lfs.fetchexclude" pattern, which would
otherwise be pruned regardless of their status, will be retained
because they have been stashed.  Logic in the "git lfs prune"
command to retain objects from stashed changes was added in
PR git-lfs#4209, and it should demonstrate this behaviour, but no checks
actually confirm it.

We therefore expand these three tests so that their simulated
commit histories include objects whose file path matches a path we
then exclude using the "lfs.fetchexclude" filter.  We then confirm
that all these objects are retained by "git lfs prune" when they are
stashed, included when they are stashed from the index and when they
are untracked files stashed with "git stash -u".
chrisd8088 added a commit to chrisd8088/git-lfs that referenced this pull request Apr 25, 2022
The "prune keep stashed changes", "prune keep stashed changes
in index", and "prune keep stashed untracked files" tests ensure
that Git LFS objects referenced by stashes are always retained
when the "git lfs prune" command is run.

However, they do not at present check that objects belonging to
files which match a "lfs.fetchexclude" pattern, which would
otherwise be pruned regardless of their status, will be retained
because they have been stashed.  Logic in the "git lfs prune"
command to retain objects from stashed changes was added in
PR git-lfs#4209, and it should demonstrate this behaviour, but no checks
actually confirm it.

We therefore expand these three tests so that their simulated
commit histories include objects whose file path matches a path we
then exclude using the "lfs.fetchexclude" filter.  We then confirm
that all these objects are retained by "git lfs prune" when they are
stashed, included when they are stashed from the index and when they
are untracked files stashed with "git stash -u".
chrisd8088 added a commit to chrisd8088/git-lfs that referenced this pull request Apr 25, 2022
The "prune keep stashed changes", "prune keep stashed changes
in index", and "prune keep stashed untracked files" tests ensure
that Git LFS objects referenced by stashes are always retained
when the "git lfs prune" command is run.

However, they do not at present check that objects belonging to
files which match a "lfs.fetchexclude" pattern, which would
otherwise be pruned regardless of their status, will be retained
because they have been stashed.  Logic in the "git lfs prune"
command to retain objects from stashed changes was added in
PR git-lfs#4209, and it should demonstrate this behaviour, but no checks
actually confirm it.

We therefore expand these three tests so that their simulated
commit histories include objects whose file path matches a path we
then exclude using the "lfs.fetchexclude" filter.  We then confirm
that all these objects are retained by "git lfs prune" when they are
stashed, included when they are stashed from the index and when they
are untracked files stashed with "git stash -u".
chrisd8088 added a commit to chrisd8088/git-lfs that referenced this pull request Apr 25, 2022
The "prune keep stashed changes", "prune keep stashed changes
in index", and "prune keep stashed untracked files" tests ensure
that Git LFS objects referenced by stashes are always retained
when the "git lfs prune" command is run.

However, they do not at present check that objects belonging to
files which match a "lfs.fetchexclude" pattern, which would
otherwise be pruned regardless of their status, will be retained
because they have been stashed.  Logic in the "git lfs prune"
command to retain objects from stashed changes was added in
PR git-lfs#4209, and it should demonstrate this behaviour, but no checks
actually confirm it.

We therefore expand these three tests so that their simulated
commit histories include objects whose file path matches a path we
then exclude using the "lfs.fetchexclude" filter.  We then confirm
that all these objects are retained by "git lfs prune" when they are
stashed, included when they are stashed from the index and when they
are untracked files stashed with "git stash -u".
chrisd8088 added a commit to chrisd8088/git-lfs that referenced this pull request Jun 6, 2023
In commit 4049928 of PR git-lfs#4209 the
scanStashed() function of the "lfs" package was revised to no longer
make use of the *GitScanner argument that was part of its original
implementation from commit 2dc718b
in the same PR.

However, the argument was never dropped from the function's signature,
so we can remove it now.
chrisd8088 added a commit to chrisd8088/git-lfs that referenced this pull request Jun 7, 2023
In commit 4049928 of PR git-lfs#4209 the
scanStashed() function of the "lfs" package was revised to no longer
make use of the *GitScanner argument that was part of its original
implementation from commit 2dc718b
in the same PR.

However, the argument was never dropped from the function's signature,
so we can remove it now.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

git lfs prune should not delete objects referenced by stashes

3 participants