commands: add git-lfs-migrate(1) 'info' subcommand#2313
Conversation
technoweenie
left a comment
There was a problem hiding this comment.
The code looks good, but I have some minor review comments.
One thing: I don't think showing cumulative file size for a file type is that helpful. How can the info command help you decide what to migrate?
- Allow user to pass in some kind of arbitrary size threshold.
git lfs migrate info --size=5MB - Show some kind of histogram of file sizes. Median, 75/95/99 percentile?
commands/command_migrate.go
Outdated
| // `git-lfs-migrate(1)` command as a subcommand (child). | ||
|
|
||
| for _, subcommand := range []*cobra.Command{ | ||
| MigrateInfoCommand, |
There was a problem hiding this comment.
I don't think it makes sense to declare this MigrateInfoCommand var in commands/command_migrate_info.go if some of the flags are added here. They should all be set up in this init() func. What do you think?
There was a problem hiding this comment.
I think that's a great idea; it should make readers more easily aware of the various migrate subcommands. I made this change via: 1f443d0.
docs/man/git-lfs-migrate.1.ronn
Outdated
| ## MODES | ||
|
|
||
| * `info` | ||
| Perform no migration, instead show information about repository size. |
There was a problem hiding this comment.
I think you trim the first part of this sentence. Talk about what it does.
There was a problem hiding this comment.
Thanks for the suggestion, I much prefer your wording: d453d9d.
docs/man/git-lfs-migrate.1.ronn
Outdated
|
|
||
| * [branch ...]: | ||
| Migrate only the set of branches listed. If not given, `git-lfs-migrate(1)` | ||
| will the currently checked out branch. |
There was a problem hiding this comment.
Looks like you're missing a word here :)
|
@technoweenie good call re: surfacing more relevant information. Here's an updated format I implemented in f3e0af6. What do you think? |
technoweenie
left a comment
There was a problem hiding this comment.
This is looking really great. A few comments:
| } | ||
|
|
||
| // TODO(@ttaylorr): needed? | ||
| exts[ext] = entry |
There was a problem hiding this comment.
I think it's either needed here, or in the entry == nil block.
There was a problem hiding this comment.
I think either works, but I dig the explicitness of exts[ext] = entry. Removed the TODO comment in 6b94d8b.
commands/command_migrate_info.go
Outdated
| } | ||
|
|
||
| // MigrateInfoEntry represents a tuple of filetype to total size taken by that | ||
| // file type. |
There was a problem hiding this comment.
I think this description is a bit out of date, since the file type to entry mapping is the exts var.
There was a problem hiding this comment.
Nice catch! Updated those docs and added some godoc to the fields on this type in 7c77064.
| percentage := fmt.Sprintf("%.0f%%", percentAbove) | ||
|
|
||
| percentages = append(percentages, percentage) | ||
| } |
There was a problem hiding this comment.
Two minor things about this:
First, it's given an io.Writer, but the output is buffered. It's not exactly a lot of data, but I'm just used to writing it out with fmt.Fprintf if the output doesn't need post-processing. However, having to deal with just 1 err from fmt.Fprintln is kinda nice.
Second: You could reduce how often e is looped by building extensions, files, and percentages at the same time. At a high level, the structure of this func would be something like:
- Init
extensions,files,percentages - Loop through
e, appending to those slices RjustorLjustthem- Write header
- Write each line with the justified
extensions,files,percentages
I think structurally it makes more sense, but 36179de saves a whopping 3 LOC (7 if I didn't add comments).
There was a problem hiding this comment.
However, having to deal with just 1
errfromfmt.Fprintlnis kinda nice.
Totally agree.
Second: You could reduce how often e is looped by building
extensions,files, andpercentagesat the same time.
Thank you! This is such an awesome suggestion. I changed this in 0dbc358 and it made the code much cleaner.
|
I think this output looks great, but the file sizes should be aligned too: Aligned, it'd look something like: In chat, you questioned out loud about how to parse this output with scripts:
|
| migrateInfoAboveFmt string | ||
| // migrateInfoAbove is the number of bytes parsed from the above | ||
| // migrateInfoAboveFmt flag. | ||
| migrateInfoAbove uint64 |
There was a problem hiding this comment.
I think migrateInfoAbove should have a sensible default, like 1MB.
There was a problem hiding this comment.
I think this output looks great, but the file sizes should be aligned too:
Good call: I built up a size array as well, and aligned it in 7d2dbd6. I think it looks much better this way!
I agree, and I really dig this reasoning. I think it a) makes the output easier to look at, b) doesn't repeat already known information, and c) makes the output easier to parse with Unix-y tools. Removed the header in ae76e72.
Totally possible, and I think this would make sense for a lot of command-line tools that want to parse the output. |
Good call, I made the default 1MB in 97b0f36. This can be easily changed in the future if 1MB is too high/low.
Started looking into this, but I think this may be best done in a separate PR after this one, since it will involve changes in both the humanize package and the |
|
@technoweenie so long as I follow up on the |
|
Thanks for the review! |
The ClampInt() function does not always return a value between its "min" and "max" arguments because it reverses their meaning, and the relevant tests in the corresponding tools/math_test.go file do not report the problem because do not run when the "go test ./tools" command is run. The latter bug is a result of the test functions lacking the necessary Test* prefix to be recognized by the Go "test" command. After we fix the ClampInt() function, we rename all the test functions in tools/math_test.go to follow the Test*() naming scheme, which ensures they now run properly. Lastly, we change the one caller of the ClampInt() function, the migrateInfoCommand() function, which previously worked correctly because it also reversed the order of the arguments it passed to the function. That reversal was introduced in commit bd2e1b2 in PR git-lfs#2313, as part of the development of the "migrate info" command, presumably because the author discovered the call was not working as expected.
The ClampInt() function does not always return a value between its "min" and "max" arguments because it reverses their meaning, and the relevant tests in the corresponding tools/math_test.go file do not report the problem because they do not execute when the "go test ./tools" command is run. The latter bug is a result of the test functions lacking the necessary Test* prefix to be recognized by the Go "test" command. After we fix the ClampInt() function, we rename all the test functions in tools/math_test.go to follow the Test*() naming scheme, which ensures they now run properly. Lastly, we change the one caller of the ClampInt() function, the migrateInfoCommand() function, which previously worked correctly because it also reversed the order of the arguments it passed to the function. That reversal was introduced in commit bd2e1b2 in PR git-lfs#2313, as part of the development of the "migrate info" command, presumably because the author discovered the call was not working as expected.
The ClampInt() function does not always return a value between its "min" and "max" arguments because it reverses their meaning, and the relevant tests in the corresponding tools/math_test.go file do not report the problem because they do not execute when the "go test ./tools" command is run. The latter bug is a result of the test functions lacking the necessary Test* prefix to be recognized by the Go "test" command. After we fix the ClampInt() function, we rename all the test functions in tools/math_test.go to follow the Test*() naming scheme, which ensures they now run properly. Lastly, we change the one caller of the ClampInt() function, the migrateInfoCommand() function, which previously worked correctly because it also reversed the order of the arguments it passed to the function. That reversal was introduced in commit bd2e1b2 in PR git-lfs#2313, as part of the development of the "migrate info" command, presumably because the author discovered the call was not working as expected.
This pull request adds documentation and an implementation for the 'info' subcommand of the
git-lfs-migrate(1)suite.The 'info' subcommand is designed to gather statistics regarding the size of data stored in Git per filepath extension. The output of the command looks as follows:
The following arguments are permitted:
--include-ref: a set of references of which commits reachable by those refs are included in the traversal.--exclude-ref: a set of references of which commits reachable by those refs are excluded in the traversal.--include: a filepath filter include set--exclude: a filepath filter exclude set<branch ...>: an optional set of branches to include, defaulting to the currently checked out branch, or those references given in--{include,exclude}-ref.Here's a summary of the changes:
git-lfs-migrate(1).tools.ClampInt.infosubcommandinfosubcommandThis is the first command to be implemented as described in the
git-lfs-migrate(1)proposal issue: (see discussion: #2146)./cc @git-lfs/core