Skip to content

Allow reporting of LFS pointers separately in migrate info command#4436

Merged
chrisd8088 merged 6 commits intogit-lfs:mainfrom
chrisd8088:migrate-info-skips
May 3, 2021
Merged

Allow reporting of LFS pointers separately in migrate info command#4436
chrisd8088 merged 6 commits intogit-lfs:mainfrom
chrisd8088:migrate-info-skips

Conversation

@chrisd8088
Copy link
Member

@chrisd8088 chrisd8088 commented Mar 12, 2021

The output of the git lfs migrate info command, when the repository history contains existing Git LFS pointers, treats those pointers as regular files and counts only the size of the pointer, not the referenced object. This can result in misleading output for repositories where there is a potential use case for the git lfs migrate import --fixup command and option, among others.

This PR adds a --pointers option to the git lfs migrate info command, which accepts one of three values, ignore, follow, or no-follow (the default). The latter is the default only because it ensures we maintain backwards compatibility with the output of the command in previous Git LFS versions.

The --pointers=ignore option simply skips any existing Git LFS pointers when calculating the totals for the summary, while the follow alternative reports (on a separate line) the total sizes and numbers of Git LFS objects in the traversed Git history. For example, on a small test repo:

$ git lfs migrate info --pointers=follow --everything
...                              
*.gitattributes	69 B	2/2 files(s)	100%
*.txt          	1 B 	1/1 files(s)	100%

LFS Objects    	1 B 	1/1 files(s)	100%

We also adjust the sorting logic for the summary output to make it fully deterministic; previously, if two entries for different file extensions had the same totals, they might appear in any order, but now we sort them using Go's default string comparison ordering.

In the t/t-migrate-info.sh test suite, we remove a spurious early exit 0 which has been in place for some time, meaning the last test was always skipped. We then add additional tests of the new --pointers option and its various settings, with and without Git LFS objects in the test repositories.

We also add, for good measure, a pair of tests which check the behaviour of the git lfs migrate info command when there are files in the history which should be Git LFS objects according to the corresponding .gitattributes files but are not. This is another type of situation where git lfs migrate import --fixup might be used, so we want to ensure the info subcommand produces correct output for it as well.

@chrisd8088 chrisd8088 changed the title [WIP] Ignore LFS pointers and .gitattributes files in migrate info command [WIP] Report LFS pointers separately in migrate info command Mar 30, 2021
@chrisd8088 chrisd8088 force-pushed the migrate-info-skips branch 3 times, most recently from 399afc1 to c09081c Compare May 2, 2021 03:00
chrisd8088 added 5 commits May 1, 2021 21:31
The last test in this file has been accidentally skipped since
it was added in commit c8c29f5
in PR git-lfs#2558, so we just ensure it runs by removing the early
exit call.

We also change one use of "wc -l" to "wc -c" to ensure we are
indeed checking that the command outputs nothing; otherwise,
if it happened to output one line, the "echo -n" would remove
the trailing newline and "wc -l" would report zero.
At present the summary of file types and sizes output by the
"migrate info" command is sorted only by the byte count totals.
When two entries have the same counts, though, the order in
which they are output is not defined, and may vary.  This will
pose a problem for some tests we expect to add in subsequent
commits.

Therefore we handle the case where two entries have identical
byte counts by further comparing the file types and returning
whichever sorts first lexigraphically, according to Go's
internal string sort order.
At present, the "git lfs migrate info" command reports
the total sizes, numbers, and file extensions for blobs
which have already been converted to LFS pointers, and
it uses the size of the pointer blobs themselves instead
of the size of object data referenced by the pointers.

The result is a confusing output which does not distinguish
these types, has misleading size totals, and generates
output listings even for repositories fully migrated to LFS.

We therefore add a new --pointers option to the "migrate info"
command so that a summary of all valid LFS pointers found in
the repository traversal may be either skipped entirely or
reported separately.

To support this option we add an lfs.DecodePointerFromBlob()
function which parallels the existing lfs.DecodePointerFromFile()
function.

We also refactor some of the entry lookup logic in the
BlobFn callback function used in the migrateInfoCommand()
function into a new findEntryByExtension() utility function.
We add a number of tests of the functionality of the new
--pointers option to the "migrate info" command in order
to confirm it works as expected in all modes, and both with
and without Git LFS pointers in the test repositories.

We also add tests of repositories where some file extensions
are tracked in .gitattributes files and the corresponding blobs
should be Git LFS pointers, but are not yet.  In these cases,
we do want to report these file extensions in summary data of
the output listing.

In order to ensure our tests succeed on Windows, we alter
several fixture setup functions to create the .gitattributes
files using "echo" instead of "git lfs track".  This keeps
the line endings consistent across platforms, and therefore
the .gitattributes file sizes in the "migrate info" summaries.
We update the manual page for the "git lfs migrate" command
to describe the use and purpose of the new --pointers option
of the "info" command mode.
@chrisd8088 chrisd8088 force-pushed the migrate-info-skips branch from 5b0876d to 47e10cd Compare May 2, 2021 04:31
@chrisd8088 chrisd8088 changed the title [WIP] Report LFS pointers separately in migrate info command Allow reporting of LFS pointers separately in migrate info command May 2, 2021
@chrisd8088 chrisd8088 requested a review from a team May 2, 2021 04:54
@chrisd8088 chrisd8088 marked this pull request as ready for review May 2, 2021 05:02
Copy link
Member

@bk2204 bk2204 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good.

One note: since we're doing a 3.0 release, you can break backwards compatibility if you want and think that would be more intuitive.

Use the new "follow" mode by default when running the
"git lfs migrate info" command, since we can break with
backwards compatibility for a Git LFS 3.0 release.
@chrisd8088
Copy link
Member Author

Alright, thanks! I've flipped the sense of the default around, then. (Hopefully this work will get me closer to the --fixup option proposed in #4419.)

@chrisd8088 chrisd8088 merged commit 68159d7 into git-lfs:main May 3, 2021
chrisd8088 added a commit to chrisd8088/git-lfs that referenced this pull request May 17, 2021
We split out the existing tests of various values of the
"git lfs migrate info" --pointers option, added in PR git-lfs#4436,
in order to treat each value in a separate test.  This has
the small advantage that if support for a particular option
value should be broken unexpectedly in the future, it
should result in a specific test failure instead of a more
generic one.
@chrisd8088 chrisd8088 deleted the migrate-info-skips branch May 17, 2021 05:52
pcal43 pushed a commit to pcal43/git-lfs-hack that referenced this pull request Jul 22, 2021
We split out the existing tests of various values of the
"git lfs migrate info" --pointers option, added in PR git-lfs#4436,
in order to treat each value in a separate test.  This has
the small advantage that if support for a particular option
value should be broken unexpectedly in the future, it
should result in a specific test failure instead of a more
generic one.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants