Skip to content

Define lockable files, make read-only in working copy when not locked#1822

Merged
sinbad merged 32 commits intolocking-master-2from
locking-wip/lockable-read-only
Jan 16, 2017
Merged

Define lockable files, make read-only in working copy when not locked#1822
sinbad merged 32 commits intolocking-master-2from
locking-wip/lockable-read-only

Conversation

@sinbad
Copy link
Contributor

@sinbad sinbad commented Jan 2, 2017

This PR addresses the following locking user story points:

  1. We want to avoid accidentally editing unmergeable files which have not been locked first
  2. The best way to do that before any changes have been made is to make files read-only in the working copy, suggesting to the user that they shouldn't just edit them
  3. Not all LFS files are unmergeable. Therefore we want to identify which file patterns this rule should apply to
  4. Locking / unlocking a file should flip the read-only flag locally
  5. We need to maintain that read-only flag across other git operations

To this end, this PR adds the following features:

  1. Adds a --lockable option to git lfs track. This enables an additional attribute flag lockable which means files matching this pattern will be read-only when not locked.
  2. Adds a post-checkout hook to make files read-only when checked out if needed
  3. Adds a post-commit hook which makes added files read-only after they're committed. New files won't have been checked out but the locking rules should apply to them as soon as they enter git; post-commit is the event that needs to trigger that transition
  4. Adds a post-merge hook which checks & fixes the read-only flag, which can be reset during merge (post-checkout unfortunately does not get called)

On this basis we can be reasonably confident that files marked as lockable will be kept read-only in the working copy unless the user locks them, or overrides this flag themselves.

There are a couple of caveats:

  1. Some apps blatantly override read-only flags without asking. Some Apple tools do this. There's not a great deal we can do about this; this feature is mainly an advisory / reminder. Luckily it's not many tools.
  2. git reset <file> can override the read-only flag and there's no hook to catch it. However, the only reason to use that is to undo edits you've already made, so it's reasonable to assume you'd only have changes to undo if you'd already locked the file. Other reset routes involve a ref checkout so are caught by the post-checkout hook.

The behaviour can be disabled by setting env var GIT_LFS_SET_LOCKABLE_READONLY to false, in case there are environments where you don't want this (e.g. automated scratch copies)

Hope this all makes sense!

NB: This PR targets a new integration branch, locking-master-2 since I can't fast-forward locking-master to master because of the review checks on it. I'd be fine with targeting locking-master as well if that could be brought forward to latest master, I just can't do that as I have no admin access.

Also add functions to fix writeable flag based on these patterns
Also requires dropping recursive option which is fine since always recursive
@shana
Copy link

shana commented Jan 5, 2017

This is great! The fact that it's opt-in and per file/file type and can be globally disabled is excellent, it means that if a certain tool/workflow just breaks because of the readonly flags, it can be turned off while still being useful for tools that can take advantage of it.

Can we have some sort of "reset" command for git-lfs that restores readonly flags to all files that should have them (given env variables and lockable flag attributes) but for some reason don't? This would be useful for the scenario where a reset happens and a tool that is lfs-locking-aware detects it and wants to set the correct readonly states back. A tool would need to verify the environment, lockable configuration and locked files list to figure out which files need the flag reset. Would a git checkout [lfs file] restore it?

@sinbad
Copy link
Contributor Author

sinbad commented Jan 5, 2017

@shana glad you like it! I've based this on my own experience in game projects, hoping it pans out for others.

A git checkout will reset the readonly flags so I didn't think there was a need for a separate command (it's done by the post-checkout hook). If you checkout individual files then actually the whole repo has to be scanned, since git doesn't tell hooks what files were checked out (only when you checkout a ref can we optimise it to just files that changed between the refs, since we get that info). However the new repo file scanning tools are extremely fast so this shouldn't be an issue.

Copy link
Contributor

@technoweenie technoweenie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks pretty cool. I think the track command seriously needs to be rewritten, but not in this PR :)

I also second @shana's request for a command to check/reset the readonly settings. It may make sense to add this to git lfs checkout, though that could overload its purpose (which right now, is to replace any LFS pointer files with the content from .git/lfs/objects).


// Skip this hook if no lockable patterns have been configured
if len(lockClient.GetLockablePatterns()) == 0 ||
!cfg.Os.Bool("GIT_LFS_SET_LOCKABLE_READONLY", true) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes more sense if this check is moved higher, so it can skip initializing the lock client if readonly files are disabled.

This environment variable controls whether files marked as 'lockable' in
`git lfs track` are made read-only in the working copy when not locked by
the current user. The default is `true`, you can disable this behaviour and
have all files writeable by setting this variable to 0, 'no' or 'false'.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any thoughts on adding a git config value? That would let users configure this per repo, if they wanted.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes indeed, good point, having granularity at the repo level on whether this is enabled or not would be very welcome.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good shout, adding that now and a wrapper Configuration method to combine the two.

return paths
}

func findAttributeFiles(workingDir, gitDir string) []string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would returning an unbound channel here let this start parsing the .gitattributes contents while tools.FastWalkGitRepo is doing its thing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately this is not possible; see the end of the function where the order of the files is reversed to ensure the correct precedence of attributes is maintained.

// Always make non-nil even if empty
c.lockablePatterns = make([]string, 0, 10)

paths := git.GetAttributePaths(config.LocalWorkingDir, config.LocalGitDir)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This brings back the config import to the locking package. I'd prefer locking.Client get LocalWorkingDir and LocalGitDir properties, set in locking.NewClient(*config.Configuration).

The API refactor will move that to this commands helper.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK; even after doing this it will still require the config package, until the per-package config comes in, but at least it puts it in one place ready for that I guess.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm with @technoweenie, let's keep this import light.

locking/locks.go Outdated

// Make non-writeable if required
if c.IsFileLockable(path) &&
config.Config.Os.Bool("GIT_LFS_SET_LOCKABLE_READONLY", true) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like this to be a property on locking.Client too, set with the default value in locking.NewClient().

@technoweenie
Copy link
Contributor

One question: is it possible to add a lockable, but not LFS file? Should it be? I suppose the entry would just look something like: *.yml -text lockable

@shana
Copy link

shana commented Jan 5, 2017

One question: is it possible to add a lockable, but not LFS file? Should it be? I suppose the entry would just look something like: *.yml -text lockable

So this would be really interesting to have, for this: There are files that are technically hard to merge by git or by humans (due to unpredictable serialization behaviours requiring smart merge tools that can be brittle), but are not necessarily binary files (or they are binary files but very small ones). Putting these in LFS for the purposes of having a locking feature for them adds a lot of overhead that isn't really needed, and in the case of text files that are just hard to merge, is actual a detriment to the process (turning a text file into a binary blob unnecessarily).

@sinbad
Copy link
Contributor Author

sinbad commented Jan 6, 2017

One question: is it possible to add a lockable, but not LFS file? Should it be?

It's not possible at the moment. I think you could probably find some cases where this would be useful but they'd be a minority; most of the time it's binary files that you'd want in LFS anyway I think. Adding this as an orthogonal concept would make things more complex (and may even suggest making lockable a separate tool to git lfs) and I don't think that's justified.

@sinbad
Copy link
Contributor Author

sinbad commented Jan 6, 2017

I also second @shana's request for a command to check/reset the readonly settings. It may make sense to add this to git lfs checkout, though that could overload its purpose (which right now, is to replace any LFS pointer files with the content from .git/lfs/objects).

I mentioned you can trigger this via git checkout already but my comment may have overlapped with yours. Do you still think it needs a separate command?

Copy link
Contributor

@ttaylorr ttaylorr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good. I left a few comments that I'd love your thoughts on, and then I'll give this another review.

git/attribs.go Outdated
if reldir := filepath.Dir(relfile); len(reldir) > 0 {
pattern = filepath.Join(reldir, pattern)
}
lockable := strings.Contains(line, LockableAttrib)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function looks good, but I'm thinking that this line would lock a file named "lockable" (or one that contains the substring "lockable") as lockable on disk, even if the user didn't want it to be. This is sort of edgecase-y, but I think it's still behavior worth checking. Maybe a more proper .giattributes parser would be worth investigating?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point, if the file entry in .gitattributes was specific enough that could be a problem - it's not likely but could be a nasty edge case in that 0.0001% and experience says that's worth addressing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out there's a cheap way to do it, we're already splitting on whitespace so just need to check fields[1:] instead. .gitattributes always splits on plain whitespace (you have to use [[:space:]] in file patterns) so this works.

errors/errors.go Outdated
}

func Combine(errs []error) error {
if len(errs) > 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I'd prefer a smaller if statement here:

if len(errs) == 0 {
      return nil
}

// ...

It'd be nice if we didn't need the if statement at all, but I think this is at least a small step forward 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I submit PRs to projects with opposing code styles for this case and sometimes I forget 😄

// marked as lockable
func (c *Client) GetLockablePatterns() []string {
c.ensureLockablesLoaded()
return c.lockablePatterns
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This read isn't guarded by a mutex-lock. I think using a sync.RWMutex would be appropriate here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A read-only mutex here wouldn't do anything useful unless the caller also locked it.

ensureLockablesLoaded ensures that c.lockablePatterns is only updated by one thread, and refreshLockablePatterns deliberately replaces the entire map (rather than clearing and re-populating it) so that previous callers to GetLockablePatterns can continue to use the old map. Therefore a read-only mutex is not required provided the return from GetLockablePatterns is not modified. Unfortunately Go doesn't have a const marker for returns, I thought about copying the map each time but ultimately thought it wasn't worth it.

// Always make non-nil even if empty
c.lockablePatterns = make([]string, 0, 10)

paths := git.GetAttributePaths(config.LocalWorkingDir, config.LocalGitDir)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm with @technoweenie, let's keep this import light.

// You must have locked the c.lockableMutex in the caller
func (c *Client) refreshLockablePatterns() {
// Always make non-nil even if empty
c.lockablePatterns = make([]string, 0, 10)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be initialized after calling git.GetAttributePaths to avoid resizing:

paths := git.GetAttributePaths(...)
c.lockablePatterns = make([]string, 0, len(paths))

for _, p := range paths {
        if p.Lockable {
                c.lockablePatterns = append(c.lockablePatterns, p.Path)
        }
}
// ...

locking/locks.go Outdated

return lock, nil
// Ensure writeable on return
return lock, tools.SetFileWriteFlag(path, true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dislike the pattern of returning a single error via a method call in the return signature, because I think this hides too much. I prefer:

if err := tools.SetFileWriteFlag(path, true); err != nil {
        return Lock{}, err
}

return lock, nil

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sure I've had the opposite feedback before 😆 Fair enough.

lfs/hook.go Outdated
func NewStandardHook(theType string, upgradeables []string) *Hook {
return &Hook{
Type: theType,
Contents: strings.Replace(hookBaseContent, "__CMD__", theType, -1),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to use a text/template instead of string replacement, since the later is a common Go convention and is well understood by looking at the hookBaseContent variable alone.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be a lot of cruft for what will boil down to a global find/replace even with a template. If you like I can make the string look like a template and still use strings.Replace but I see no reason to clutter this with all the init/struct param/execute patterns of templates.

//
// This hook checks that files which are lockable and not locked are made read-only,
// optimising that as best it can based on the available information.
func postCheckoutCommand(cmd *cobra.Command, args []string) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This command needs a manpage entry.

LoggedError(err, "Warning: post-checkout rev diff %v:%v failed: %v\nFalling back on full scan.", pre, post, err)
postCheckoutFileChange(client)
}
err = client.FixLockableFileWriteFlags(files)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice 👍

//
// This hook checks that files which are lockable and not locked are made read-only,
// optimising that as best it can based on the available information.
func postMergeCommand(cmd *cobra.Command, args []string) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This command needs a manpage entry.

This comment was marked as spam.

@ttaylorr
Copy link
Contributor

ttaylorr commented Jan 10, 2017 via email

@ttaylorr
Copy link
Contributor

ttaylorr commented Jan 10, 2017 via email

@ttaylorr
Copy link
Contributor

ttaylorr commented Jan 10, 2017 via email

@ttaylorr
Copy link
Contributor

ttaylorr commented Jan 10, 2017 via email

@ttaylorr
Copy link
Contributor

ttaylorr commented Jan 10, 2017 via email

Still using strings.Replace, this doesn’t justify using text/template
@sinbad sinbad merged commit ab839a4 into locking-master-2 Jan 16, 2017
@sinbad sinbad deleted the locking-wip/lockable-read-only branch January 16, 2017 11:33
setting, control whether files marked as 'lockable' in `git lfs track` are
made read-only in the working copy when not locked by the current user.
The default is `true`; you can disable this behaviour and have all files
writeable by setting either variable to 0, 'no' or 'false'.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ttaylorr If the lockable feature is on, then Git LFS cannot skip the post-checkout command. Consequently, Git LFS needs to search for all .gitattributes files to check them for lockable patterns.

My team is working with a large repo (>20k directories, >300k files) and just searching for the .gitattributes file in all directories takes ~6 sec on macOS with a fast SSD. Consider these timings:

$ export GIT_LFS_SET_LOCKABLE_READONLY=1; time git-lfs post-checkout a0c5174a19f2dece44c09bc76f91feb69799e66f a0c5174a19f2dece44c09bc76f91feb69799e66f 0
git-lfs post-checkout a0c5174a19f2dece44c09bc76f91feb69799e66f  0  

35.09s user 6.41s system 730% cpu 5.679 total


$ export GIT_LFS_SET_LOCKABLE_READONLY=0; time git-lfs post-checkout a0c5174a19f2dece44c09bc76f91feb69799e66f a0c5174a19f2dece44c09bc76f91feb69799e66f 0
git-lfs post-checkout a0c5174a19f2dece44c09bc76f91feb69799e66f  0  

0.01s user 0.02s system 89% cpu 0.030 total

I plan to distribute the setting lfs.setlockablereadonly=false in my company gitconfig but it took me a while to figure out the culprit of the slowness. I don't know how widely the lockable feature is used, but I wonder if lfs.setlockablereadonly=falseis a more sensible default? Alternatively we could find ways to speed up the .gitattributes discovery.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how widely the lockable feature is used, but I wonder if lfs.setlockablereadonly=false is a more sensible default?

I think =true is an important default value to keep. Here are a few thoughts on how we can get the best of both worlds:

  1. Speed up the time it takes to search for .gitattributes. This has been optimized by @sinbad in Fast directory walk #1616. I think that this work could be augmented by first stat()-ing directories to search, and then seeing if the stat is out-of-date with the last time that the directory was traversed, effectively memoizing the search. I presume that this would improve the speed a great deal.

  2. Warn when the search is taking too long. We can set a timer (similar to what Microsoft does with GVFS when the --ahead-behind check takes too long) and warn if Git LFS is unable to determine the set of lockable files quickly.

What do you think? Perhaps we should have this discussion in a new issue, as well.

chrisd8088 added a commit to chrisd8088/git-lfs that referenced this pull request Apr 14, 2022
In commit d91980f of
PR git-lfs#1822 the GetFilesChanged() function was updated to use
"git diff-tree" rather than "git diff".  However, the
relevant error messages were not updated at the same time,
so we do that now.
chrisd8088 added a commit to chrisd8088/git-lfs that referenced this pull request May 15, 2025
In PR git-lfs#1870, which was based on the work in PR git-lfs#1822, we introduced
support for "lockable" files whose permissions would be managed by
several new Git LFS commands such as "git lfs post-checkout" and
"git lfs post-merge", both of which we register as Git hooks.

In commit 585befa of PRs git-lfs#1822 and git-lfs#1870
we added manual pages for these two commands.  (Although these PRs
also introduced the "git lfs post-commit" command, they did not
include a manual page for it, so we added one belatedly in commit
ae5d5b0 of PR git-lfs#4052.)

From the time it was written, our git-lfs-post-checkout(1) manual page has
indicated that the command accepts three arguments.  This accurately
reflects the fact that Git will invoke our hook command with parameters
specifying the SHAs from before and after the checkout operation, plus
a status flag indicating whether a branch checkout was performed or not.

However, our manual page does not explain what values these arguments
are expected to contain, other than by dint of the names given to the
arguments in the page's synopsis section.  We therefore expand our
description of the "git lfs post-checkout" command to detail the values
Git will provide in each of the three arguments.

We also take the opportunity to adjust the name of the second argument
to clarify that we expect this parameter to contain a Git revision SHA
and not the name of a Git reference such as a branch or tag.  As well,
we replace the underscores in the argument names with hyphens to match
the style we use in all our other manual pages for the names of command
parameters.
chrisd8088 added a commit to chrisd8088/git-lfs that referenced this pull request May 15, 2025
In PR git-lfs#1870, which was based on the work in PR git-lfs#1822, we introduced
support for "lockable" files whose permissions would be managed by
several new Git LFS commands such as "git lfs post-checkout" and
"git lfs post-merge", both of which we register as Git hooks.

In commit 585befa of PRs git-lfs#1822 and git-lfs#1870
we added manual pages for these two commands.  (Although these PRs
also introduced the "git lfs post-commit" command, they did not
include a manual page for it, so we added one belatedly in commit
ae5d5b0 of PR git-lfs#4052.)

From the time it was written, our git-lfs-post-merge(1) manual page has
indicated that the command accepts a single "<is_squash>" argument.
This accurately reflects the fact that Git will invoke our hook command
with a single flag parameter that indicates whether or not a squash merge
was performed.

However, our "git lfs post-merge" command disregards this parameter
entirely, so we adjust our manual page now to clarify that this command
requires no arguments and ignores the one provided by Git.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants