Define lockable files, make read-only in working copy when not locked#1822
Define lockable files, make read-only in working copy when not locked#1822sinbad merged 32 commits intolocking-master-2from
Conversation
Also add functions to fix writeable flag based on these patterns
Also requires dropping recursive option which is fine since always recursive
|
This is great! The fact that it's opt-in and per file/file type and can be globally disabled is excellent, it means that if a certain tool/workflow just breaks because of the readonly flags, it can be turned off while still being useful for tools that can take advantage of it. Can we have some sort of "reset" command for git-lfs that restores readonly flags to all files that should have them (given env variables and lockable flag attributes) but for some reason don't? This would be useful for the scenario where a reset happens and a tool that is lfs-locking-aware detects it and wants to set the correct readonly states back. A tool would need to verify the environment, lockable configuration and locked files list to figure out which files need the flag reset. Would a |
|
@shana glad you like it! I've based this on my own experience in game projects, hoping it pans out for others. A |
technoweenie
left a comment
There was a problem hiding this comment.
This looks pretty cool. I think the track command seriously needs to be rewritten, but not in this PR :)
I also second @shana's request for a command to check/reset the readonly settings. It may make sense to add this to git lfs checkout, though that could overload its purpose (which right now, is to replace any LFS pointer files with the content from .git/lfs/objects).
commands/command_post_checkout.go
Outdated
|
|
||
| // Skip this hook if no lockable patterns have been configured | ||
| if len(lockClient.GetLockablePatterns()) == 0 || | ||
| !cfg.Os.Bool("GIT_LFS_SET_LOCKABLE_READONLY", true) { |
There was a problem hiding this comment.
I think it makes more sense if this check is moved higher, so it can skip initializing the lock client if readonly files are disabled.
docs/man/git-lfs-config.5.ronn
Outdated
| This environment variable controls whether files marked as 'lockable' in | ||
| `git lfs track` are made read-only in the working copy when not locked by | ||
| the current user. The default is `true`, you can disable this behaviour and | ||
| have all files writeable by setting this variable to 0, 'no' or 'false'. |
There was a problem hiding this comment.
Any thoughts on adding a git config value? That would let users configure this per repo, if they wanted.
There was a problem hiding this comment.
Oh yes indeed, good point, having granularity at the repo level on whether this is enabled or not would be very welcome.
There was a problem hiding this comment.
Good shout, adding that now and a wrapper Configuration method to combine the two.
| return paths | ||
| } | ||
|
|
||
| func findAttributeFiles(workingDir, gitDir string) []string { |
There was a problem hiding this comment.
Would returning an unbound channel here let this start parsing the .gitattributes contents while tools.FastWalkGitRepo is doing its thing?
There was a problem hiding this comment.
Unfortunately this is not possible; see the end of the function where the order of the files is reversed to ensure the correct precedence of attributes is maintained.
locking/lockable.go
Outdated
| // Always make non-nil even if empty | ||
| c.lockablePatterns = make([]string, 0, 10) | ||
|
|
||
| paths := git.GetAttributePaths(config.LocalWorkingDir, config.LocalGitDir) |
There was a problem hiding this comment.
This brings back the config import to the locking package. I'd prefer locking.Client get LocalWorkingDir and LocalGitDir properties, set in locking.NewClient(*config.Configuration).
The API refactor will move that to this commands helper.
There was a problem hiding this comment.
OK; even after doing this it will still require the config package, until the per-package config comes in, but at least it puts it in one place ready for that I guess.
There was a problem hiding this comment.
I'm with @technoweenie, let's keep this import light.
locking/locks.go
Outdated
|
|
||
| // Make non-writeable if required | ||
| if c.IsFileLockable(path) && | ||
| config.Config.Os.Bool("GIT_LFS_SET_LOCKABLE_READONLY", true) { |
There was a problem hiding this comment.
I'd like this to be a property on locking.Client too, set with the default value in locking.NewClient().
|
One question: is it possible to add a lockable, but not LFS file? Should it be? I suppose the entry would just look something like: |
So this would be really interesting to have, for this: There are files that are technically hard to merge by git or by humans (due to unpredictable serialization behaviours requiring smart merge tools that can be brittle), but are not necessarily binary files (or they are binary files but very small ones). Putting these in LFS for the purposes of having a locking feature for them adds a lot of overhead that isn't really needed, and in the case of text files that are just hard to merge, is actual a detriment to the process (turning a text file into a binary blob unnecessarily). |
It's not possible at the moment. I think you could probably find some cases where this would be useful but they'd be a minority; most of the time it's binary files that you'd want in LFS anyway I think. Adding this as an orthogonal concept would make things more complex (and may even suggest making lockable a separate tool to |
I mentioned you can trigger this via |
…laces `locking` still depends on `config` but will be refactored out later
ttaylorr
left a comment
There was a problem hiding this comment.
Looking good. I left a few comments that I'd love your thoughts on, and then I'll give this another review.
git/attribs.go
Outdated
| if reldir := filepath.Dir(relfile); len(reldir) > 0 { | ||
| pattern = filepath.Join(reldir, pattern) | ||
| } | ||
| lockable := strings.Contains(line, LockableAttrib) |
There was a problem hiding this comment.
This function looks good, but I'm thinking that this line would lock a file named "lockable" (or one that contains the substring "lockable") as lockable on disk, even if the user didn't want it to be. This is sort of edgecase-y, but I think it's still behavior worth checking. Maybe a more proper .giattributes parser would be worth investigating?
There was a problem hiding this comment.
That's a good point, if the file entry in .gitattributes was specific enough that could be a problem - it's not likely but could be a nasty edge case in that 0.0001% and experience says that's worth addressing.
There was a problem hiding this comment.
Turns out there's a cheap way to do it, we're already splitting on whitespace so just need to check fields[1:] instead. .gitattributes always splits on plain whitespace (you have to use [[:space:]] in file patterns) so this works.
errors/errors.go
Outdated
| } | ||
|
|
||
| func Combine(errs []error) error { | ||
| if len(errs) > 0 { |
There was a problem hiding this comment.
Nit: I'd prefer a smaller if statement here:
if len(errs) == 0 {
return nil
}
// ...It'd be nice if we didn't need the if statement at all, but I think this is at least a small step forward 👍
There was a problem hiding this comment.
👍 I submit PRs to projects with opposing code styles for this case and sometimes I forget 😄
| // marked as lockable | ||
| func (c *Client) GetLockablePatterns() []string { | ||
| c.ensureLockablesLoaded() | ||
| return c.lockablePatterns |
There was a problem hiding this comment.
This read isn't guarded by a mutex-lock. I think using a sync.RWMutex would be appropriate here.
There was a problem hiding this comment.
A read-only mutex here wouldn't do anything useful unless the caller also locked it.
ensureLockablesLoaded ensures that c.lockablePatterns is only updated by one thread, and refreshLockablePatterns deliberately replaces the entire map (rather than clearing and re-populating it) so that previous callers to GetLockablePatterns can continue to use the old map. Therefore a read-only mutex is not required provided the return from GetLockablePatterns is not modified. Unfortunately Go doesn't have a const marker for returns, I thought about copying the map each time but ultimately thought it wasn't worth it.
locking/lockable.go
Outdated
| // Always make non-nil even if empty | ||
| c.lockablePatterns = make([]string, 0, 10) | ||
|
|
||
| paths := git.GetAttributePaths(config.LocalWorkingDir, config.LocalGitDir) |
There was a problem hiding this comment.
I'm with @technoweenie, let's keep this import light.
locking/lockable.go
Outdated
| // You must have locked the c.lockableMutex in the caller | ||
| func (c *Client) refreshLockablePatterns() { | ||
| // Always make non-nil even if empty | ||
| c.lockablePatterns = make([]string, 0, 10) |
There was a problem hiding this comment.
This can be initialized after calling git.GetAttributePaths to avoid resizing:
paths := git.GetAttributePaths(...)
c.lockablePatterns = make([]string, 0, len(paths))
for _, p := range paths {
if p.Lockable {
c.lockablePatterns = append(c.lockablePatterns, p.Path)
}
}
// ...
locking/locks.go
Outdated
|
|
||
| return lock, nil | ||
| // Ensure writeable on return | ||
| return lock, tools.SetFileWriteFlag(path, true) |
There was a problem hiding this comment.
I dislike the pattern of returning a single error via a method call in the return signature, because I think this hides too much. I prefer:
if err := tools.SetFileWriteFlag(path, true); err != nil {
return Lock{}, err
}
return lock, nilThere was a problem hiding this comment.
I'm sure I've had the opposite feedback before 😆 Fair enough.
lfs/hook.go
Outdated
| func NewStandardHook(theType string, upgradeables []string) *Hook { | ||
| return &Hook{ | ||
| Type: theType, | ||
| Contents: strings.Replace(hookBaseContent, "__CMD__", theType, -1), |
There was a problem hiding this comment.
I'd prefer to use a text/template instead of string replacement, since the later is a common Go convention and is well understood by looking at the hookBaseContent variable alone.
There was a problem hiding this comment.
That would be a lot of cruft for what will boil down to a global find/replace even with a template. If you like I can make the string look like a template and still use strings.Replace but I see no reason to clutter this with all the init/struct param/execute patterns of templates.
| // | ||
| // This hook checks that files which are lockable and not locked are made read-only, | ||
| // optimising that as best it can based on the available information. | ||
| func postCheckoutCommand(cmd *cobra.Command, args []string) { |
There was a problem hiding this comment.
This command needs a manpage entry.
| LoggedError(err, "Warning: post-checkout rev diff %v:%v failed: %v\nFalling back on full scan.", pre, post, err) | ||
| postCheckoutFileChange(client) | ||
| } | ||
| err = client.FixLockableFileWriteFlags(files) |
| // | ||
| // This hook checks that files which are lockable and not locked are made read-only, | ||
| // optimising that as best it can based on the available information. | ||
| func postMergeCommand(cmd *cobra.Command, args []string) { |
There was a problem hiding this comment.
This command needs a manpage entry.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
|
That would be a lot of cruft for what will boil down to a global find/replace even with a template. If you like I can make the string look like a template and still use `strings.Replace` but I see no reason to clutter this with all the init/struct param/execute patterns of templates.
I think that would be a great compromise 👍 .
|
|
Problem with that is that `grep -c` returns a failure code when there are no results (which we validly test for in many cases), which means you lose the gained clarity trying to ignore that. I think `wc -l` is clearer while still dealing with `0` result cases.
I've had success with stringify-ing it, but either works for me :-)
```sh
[ "0" -eq "$(grep -c foo bar)" ]
```
|
|
Yes, because I'm overwriting the same file. I can't progressively read and
overwrite it at the same time 😄
My mistake, I totally missed this on my first read-through of the code.
I could write a new attributes file then delete & rename, but it's more
complex and I prefer simpler. I recognise that this way is heavier on memory
usage, but `.gitattributes` files are never going to be massive (even a 1MB
attributes file would be unthinkable) so it's a not really an issue.
I think this is a great analysis, and OK to leave as-is.
|
|
Fantastic!
|
|
A read-only mutex here wouldn't do anything useful unless the caller also
locked it.
`ensureLockablesLoaded` ensures that `c.lockablePatterns` is only updated by
one thread, and `refreshLockablePatterns` deliberately replaces the entire map
(rather than clearing and re-populating it) so that previous callers to
`GetLockablePatterns` can continue to use the old map. Therefore a read-only
mutex is not required provided the return from `GetLockablePatterns` is not
modified. Unfortunately Go doesn't have a `const` marker for returns, I
thought about copying the map each time but ultimately thought it wasn't worth
it.
Sounds good, and thanks for the detailed explanation :-).
|
Still using strings.Replace, this doesn’t justify using text/template
| setting, control whether files marked as 'lockable' in `git lfs track` are | ||
| made read-only in the working copy when not locked by the current user. | ||
| The default is `true`; you can disable this behaviour and have all files | ||
| writeable by setting either variable to 0, 'no' or 'false'. |
There was a problem hiding this comment.
@ttaylorr If the lockable feature is on, then Git LFS cannot skip the post-checkout command. Consequently, Git LFS needs to search for all .gitattributes files to check them for lockable patterns.
My team is working with a large repo (>20k directories, >300k files) and just searching for the .gitattributes file in all directories takes ~6 sec on macOS with a fast SSD. Consider these timings:
$ export GIT_LFS_SET_LOCKABLE_READONLY=1; time git-lfs post-checkout a0c5174a19f2dece44c09bc76f91feb69799e66f a0c5174a19f2dece44c09bc76f91feb69799e66f 0
git-lfs post-checkout a0c5174a19f2dece44c09bc76f91feb69799e66f 0
35.09s user 6.41s system 730% cpu 5.679 total
$ export GIT_LFS_SET_LOCKABLE_READONLY=0; time git-lfs post-checkout a0c5174a19f2dece44c09bc76f91feb69799e66f a0c5174a19f2dece44c09bc76f91feb69799e66f 0
git-lfs post-checkout a0c5174a19f2dece44c09bc76f91feb69799e66f 0
0.01s user 0.02s system 89% cpu 0.030 total
I plan to distribute the setting lfs.setlockablereadonly=false in my company gitconfig but it took me a while to figure out the culprit of the slowness. I don't know how widely the lockable feature is used, but I wonder if lfs.setlockablereadonly=falseis a more sensible default? Alternatively we could find ways to speed up the .gitattributes discovery.
There was a problem hiding this comment.
I don't know how widely the lockable feature is used, but I wonder if
lfs.setlockablereadonly=falseis a more sensible default?
I think =true is an important default value to keep. Here are a few thoughts on how we can get the best of both worlds:
-
Speed up the time it takes to search for
.gitattributes. This has been optimized by @sinbad in Fast directory walk #1616. I think that this work could be augmented by firststat()-ing directories to search, and then seeing if the stat is out-of-date with the last time that the directory was traversed, effectively memoizing the search. I presume that this would improve the speed a great deal. -
Warn when the search is taking too long. We can set a timer (similar to what Microsoft does with GVFS when the
--ahead-behindcheck takes too long) and warn if Git LFS is unable to determine the set of lockable files quickly.
What do you think? Perhaps we should have this discussion in a new issue, as well.
In commit d91980f of PR git-lfs#1822 the GetFilesChanged() function was updated to use "git diff-tree" rather than "git diff". However, the relevant error messages were not updated at the same time, so we do that now.
In PR git-lfs#1870, which was based on the work in PR git-lfs#1822, we introduced support for "lockable" files whose permissions would be managed by several new Git LFS commands such as "git lfs post-checkout" and "git lfs post-merge", both of which we register as Git hooks. In commit 585befa of PRs git-lfs#1822 and git-lfs#1870 we added manual pages for these two commands. (Although these PRs also introduced the "git lfs post-commit" command, they did not include a manual page for it, so we added one belatedly in commit ae5d5b0 of PR git-lfs#4052.) From the time it was written, our git-lfs-post-checkout(1) manual page has indicated that the command accepts three arguments. This accurately reflects the fact that Git will invoke our hook command with parameters specifying the SHAs from before and after the checkout operation, plus a status flag indicating whether a branch checkout was performed or not. However, our manual page does not explain what values these arguments are expected to contain, other than by dint of the names given to the arguments in the page's synopsis section. We therefore expand our description of the "git lfs post-checkout" command to detail the values Git will provide in each of the three arguments. We also take the opportunity to adjust the name of the second argument to clarify that we expect this parameter to contain a Git revision SHA and not the name of a Git reference such as a branch or tag. As well, we replace the underscores in the argument names with hyphens to match the style we use in all our other manual pages for the names of command parameters.
In PR git-lfs#1870, which was based on the work in PR git-lfs#1822, we introduced support for "lockable" files whose permissions would be managed by several new Git LFS commands such as "git lfs post-checkout" and "git lfs post-merge", both of which we register as Git hooks. In commit 585befa of PRs git-lfs#1822 and git-lfs#1870 we added manual pages for these two commands. (Although these PRs also introduced the "git lfs post-commit" command, they did not include a manual page for it, so we added one belatedly in commit ae5d5b0 of PR git-lfs#4052.) From the time it was written, our git-lfs-post-merge(1) manual page has indicated that the command accepts a single "<is_squash>" argument. This accurately reflects the fact that Git will invoke our hook command with a single flag parameter that indicates whether or not a squash merge was performed. However, our "git lfs post-merge" command disregards this parameter entirely, so we adjust our manual page now to clarify that this command requires no arguments and ignores the one provided by Git.
This PR addresses the following locking user story points:
To this end, this PR adds the following features:
--lockableoption togit lfs track. This enables an additional attribute flaglockablewhich means files matching this pattern will be read-only when not locked.post-checkouthook to make files read-only when checked out if neededpost-commithook which makes added files read-only after they're committed. New files won't have been checked out but the locking rules should apply to them as soon as they enter git;post-commitis the event that needs to trigger that transitionpost-mergehook which checks & fixes the read-only flag, which can be reset during merge (post-checkoutunfortunately does not get called)On this basis we can be reasonably confident that files marked as
lockablewill be kept read-only in the working copy unless the user locks them, or overrides this flag themselves.There are a couple of caveats:
git reset <file>can override the read-only flag and there's no hook to catch it. However, the only reason to use that is to undo edits you've already made, so it's reasonable to assume you'd only have changes to undo if you'd already locked the file. Otherresetroutes involve a ref checkout so are caught by thepost-checkouthook.The behaviour can be disabled by setting env var
GIT_LFS_SET_LOCKABLE_READONLYto false, in case there are environments where you don't want this (e.g. automated scratch copies)Hope this all makes sense!
NB: This PR targets a new integration branch,
locking-master-2since I can't fast-forwardlocking-mastertomasterbecause of the review checks on it. I'd be fine with targetinglocking-masteras well if that could be brought forward to latestmaster, I just can't do that as I have no admin access.