locks: manage write permissions of ignored files #3190

Merged: ttaylorr merged 3 commits into master from ttaylorr/lock-gitignored-files on Aug 23, 2018

@ttaylorr

This pull request teaches git lfs lock to continue to manage lockable files after they have been ignored by a repository's .gitignore.

The rationale for this change arises from #3183, and is explained further in a727fea (locking: remove write permission for ignored files, 2018-08-20).

$ printf "a.txt" > a.txt
$ git lfs track --lockable "*.txt"
Tracking "*.txt"
$ git add .gitattributes a.txt
$ git commit -m ".gitattributes: track 'a.txt' as lockable"
[master (root-commit) e05e353] .gitattributes: track 'a.txt' as lockable
 2 files changed, 4 insertions(+)
 create mode 100644 .gitattributes
 create mode 100644 a.txt
$ rm -f a.txt && git checkout a.txt
$ ls -l a.txt
-r--r--r--  1 ttaylorr  staff  5 Aug 20 07:53 a.txt
$ echo "a.txt" > .gitignore
$ git add .gitignore
$ git commit -m ".gitignore: ignore 'a.txt'"
[master e658686] .gitignore: ignore 'a.txt'
 1 file changed, 1 insertion(+)
 create mode 100644 .gitignore
$ rm -f a.txt && git checkout a.txt
$ ls -l a.txt
-r--r--r--  1 ttaylorr  staff  5 Aug 20 07:53 a.txt

Closes: #3183.

/cc @git-lfs/core
/cc @zach-r-d #3183

When calling FastWalkGitRepo, the implementation constructs an
instance of the *fastWalker type, and then consumes items from the
<-chan until it receives a close().

This was left mostly unchanged since 3b9e629 (use a single channel,
2016-11-29), when we had only the single function `FastWalkGitRepo`.

In a subsequent commit, we will introduce a new function
`FastWalkGitRepoAll` which does _not_ exclude patterns found in a
repository's .gitignore(s).

In preparation for this function, which we assume to share the
implementation of `FastWalkGitRepo` in its entirety (modulo the
construction of a `*fastWalker`), let's extract this fastWalkCallback
function to avoid repeating ourselves.
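
As a rough illustration of the extraction described above (a sketch only, not the actual git-lfs code), the channel-draining loop can live in one function that several entry points share. The names `walkItem`, `walkCallback`, `consume`, `produce`, and `collect` are hypothetical and were chosen for this example.

```go
package main

import "fmt"

// walkItem is a hypothetical stand-in for whatever the walker sends on its
// channel; the real type in git-lfs may differ.
type walkItem struct {
	Path string
	Err  error
}

// walkCallback is a hypothetical callback signature.
type walkCallback func(path string, err error)

// consume drains ch until it is closed, invoking cb once per item.
// Extracting this loop lets several entry points share it.
func consume(ch <-chan walkItem, cb walkCallback) {
	for item := range ch {
		cb(item.Path, item.Err)
	}
}

// produce simulates a walker: it sends every path not rejected by skip,
// then closes the channel. A nil skip means "exclude nothing".
func produce(paths []string, skip func(string) bool) <-chan walkItem {
	ch := make(chan walkItem)
	go func() {
		defer close(ch)
		for _, p := range paths {
			if skip != nil && skip(p) {
				continue
			}
			ch <- walkItem{Path: p}
		}
	}()
	return ch
}

// collect wires a producer to the shared consumer and returns the paths
// that were visited without error.
func collect(paths []string, skip func(string) bool) []string {
	var got []string
	consume(produce(paths, skip), func(path string, err error) {
		if err == nil {
			got = append(got, path)
		}
	})
	return got
}

func main() {
	paths := []string{"a.txt", "ignored.log", "b.txt"}
	isLog := func(p string) bool { return p == "ignored.log" }

	fmt.Println(collect(paths, isLog)) // walk that excludes some entries
	fmt.Println(collect(paths, nil))   // walk that excludes nothing
}
```

The two calls in `main` differ only in how the producer is built, which is the property the commit message relies on.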

Certain callers would like a filepath.Walk-esque interface that offers
the same parallelized advantages as FastWalkGitRepo but, unlike
FastWalkGitRepo, does not skip `.gitignore`-ed files and directories.

FastWalkGitRepoAll is that interface. According to its godoc comment, it
behaves exactly as FastWalkGitRepo, but does not skip `.gitignore`-ed
files or directories.

The fastWalkWithExcludeFiles function (used in conjunction with c9cde11
(tools/filetools.go: extract fastWalker consumer func, 2018-08-20))
accepts an empty-string second argument to indicate that no .gitignore
files should be honored.

Similarly, let's test this by leveraging the same test used for
FastWalkGitRepo, to avoid having to extract a complicated function that
sets up the directory structure for both. We'll use a new function
(`uniq()`) to avoid some complications resulting from appending all
ignored files into the slice of all accepted files.
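
For illustration, an order-preserving de-duplication helper in the spirit of `uniq()` might look like the sketch below. This is an assumption about its general shape; the helper actually added to the test may be implemented differently.

```go
package main

import "fmt"

// uniq returns the elements of values with duplicates removed, keeping the
// first occurrence of each element in its original position.
// Hypothetical sketch; not copied from the git-lfs test suite.
func uniq(values []string) []string {
	seen := make(map[string]struct{}, len(values))
	out := make([]string, 0, len(values))
	for _, v := range values {
		if _, ok := seen[v]; ok {
			continue
		}
		seen[v] = struct{}{}
		out = append(out, v)
	}
	return out
}

func main() {
	// Appending ignored files to the accepted files can yield repeats.
	walked := []string{"a.txt", "b.txt", "ignored/c.txt", "b.txt", "ignored/c.txt"}
	fmt.Println(uniq(walked))
}
```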

When a file exists in a user's working copy that has been previously
marked as lockable and has since become ignored, the file's write
permission would no longer be managed by Git LFS.

Since 'git lfs lock' corresponds to pathname only (and thus is not
affected by Git-specific concepts such as .gitignore), let's continue to
manage those files' write permissions by using tools.FastWalkGitRepoAll
instead of tools.FastWalkGitRepo.
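
To make "managing a file's write permission" concrete, the sketch below clears or restores write bits with the standard library. It is a minimal example under stated assumptions: `permFor` and `setWritable` are hypothetical names, and the real locking code in git-lfs may choose different bits or handle platforms differently.

```go
package main

import (
	"fmt"
	"os"
)

// permFor returns mode with its write bits adjusted: the owner write bit is
// set when writable is true, and all write bits are cleared otherwise.
func permFor(mode os.FileMode, writable bool) os.FileMode {
	if writable {
		return mode | 0200
	}
	return mode &^ 0222
}

// setWritable applies permFor to the file at path.
func setWritable(path string, writable bool) error {
	info, err := os.Stat(path)
	if err != nil {
		return err
	}
	return os.Chmod(path, permFor(info.Mode().Perm(), writable))
}

func main() {
	f, err := os.CreateTemp("", "lockable-*.txt")
	if err != nil {
		fmt.Println("create:", err)
		return
	}
	f.Close()
	defer os.Remove(f.Name())

	// An unlocked lockable file is made read-only, whether or not it is
	// matched by a .gitignore pattern.
	if err := setWritable(f.Name(), false); err != nil {
		fmt.Println("chmod:", err)
		return
	}
	info, err := os.Stat(f.Name())
	if err != nil {
		fmt.Println("stat:", err)
		return
	}
	fmt.Println(info.Mode().Perm())
}
```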

@PastelMobileSuit PastelMobileSuit left a comment

:shipit:

@ttaylorr ttaylorr merged commit 87e2474 into master Aug 23, 2018
@ttaylorr ttaylorr deleted the ttaylorr/lock-gitignored-files branch August 23, 2018 21:18
chrisd8088 added a commit to chrisd8088/git-lfs that referenced this pull request Mar 16, 2026
In commit 13a8af6 of PR git-lfs#1616 we
added a FastWalkGitRepo() function to our "tools" package for the
purpose of improving the performance of the "git lfs track" command.
At the time, this command used the Walk() function of the "filepath"
package from the Go standard library to traverse the contents of the
current Git working tree and locate all ".gitattributes" files.

Then in commit f1fdc85 of the same
PR git-lfs#1616 we updated the "git lfs track" command's findAttributeFiles()
function to use our new FastWalkGitRepo() function instead of the
"filepath" package's Walk() function.  This change made searches for
".gitattributes" files in large repositories faster for several reasons.
Unlike the Walk() function from the "filepath" package, the functions
called by the FastWalkGitRepo() function to traverse a directory
hierarchy did not sort the entries in each directory, and also ignored
all ".git" directories and all entries which matched any pattern found
in a ".gitignore" file.

Later, in PRs git-lfs#1870 and git-lfs#2689, we expanded the number of callers of
the FastWalkGitRepo() function.  In particular, in PR git-lfs#2689 we began
to make use of the function during the final phase of all Git LFS
commands where we find and delete any stale temporary files stored in
our ".git/lfs/tmp" directory.  This PR introduced our "fs" package whose
cleanupTmp() method calls the FastWalkGitRepo() function, passing
the path to the ".git/lfs/tmp" directory and an anonymous callback function
which removes any temporary object data files that are more than an
hour old.

We then added a FastWalkGitRepoAll() function to our "tools" package in
PR git-lfs#3190, which operated in a similar fashion to the FastWalkGitRepo()
function but did not read ".gitignore" files and so also did not skip
directory entries matching any patterns found in those files.

Next, in PR git-lfs#3686, we updated the internal implementation of the
FastWalkGitRepo() and FastWalkGitRepoAll() functions to avoid entering
submodules when traversing a Git working tree.  To make this change,
we added a check to the Walk() method of the "fastWalker" structure
so it would return immediately when processing a directory if the
directory contained an entry named ".git", unless the directory was
the root of the working tree.  Note that the Walk() method already
ignored any directory entries with the name ".git", but this only
meant it would traverse through the contents of a submodule checkout
in a working tree while skipping the submodule's ".git" directory.
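
The submodule check described above can be sketched as follows. This is an illustrative approximation, not the code from PR git-lfs#3686: `shouldSkipDir` and `walkDir` are hypothetical names, and the real Walk() method is parallelized and reports directories as well as files.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// shouldSkipDir reports whether a directory should be skipped entirely
// because it contains an entry named ".git" and is not the walk's root.
func shouldSkipDir(isRoot bool, entryNames []string) bool {
	if isRoot {
		return false
	}
	for _, name := range entryNames {
		if name == ".git" {
			return true
		}
	}
	return false
}

// walkDir visits every regular entry below dir, skipping ".git" entries
// and any non-root directory that itself contains a ".git" entry.
func walkDir(dir string, isRoot bool, visit func(path string)) error {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return err
	}
	names := make([]string, len(entries))
	for i, e := range entries {
		names[i] = e.Name()
	}
	if shouldSkipDir(isRoot, names) {
		return nil // looks like a submodule checkout; do not descend
	}
	for _, e := range entries {
		if e.Name() == ".git" {
			continue
		}
		path := filepath.Join(dir, e.Name())
		if e.IsDir() {
			if err := walkDir(path, false, visit); err != nil {
				return err
			}
			continue
		}
		visit(path)
	}
	return nil
}

func main() {
	err := walkDir(".", true, func(path string) { fmt.Println(path) })
	if err != nil {
		fmt.Println("walk:", err)
	}
}
```

Skipping only entries named ".git" is not enough on its own, because the rest of a submodule's checkout would still be traversed; the directory-level check is what prevents that.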

Then in commit 83d7f76 of PR git-lfs#3823
we first implemented the NewLsFiles() function in our "git" package,
which executes a "git ls-files" command and returns the list of files it
outputs.  As well, we updated the findAttributeFiles() function of our
"git" package and the fixFileWriteFlags() function in our "locking"
package to both call the NewLsFiles() function instead of calling
either the FastWalkGitRepo() or FastWalkGitRepoAll() functions.

Since these were the only instances where we actually used the
FastWalkGitRepo() or FastWalkGitRepoAll() functions to traverse a Git
working tree, in the same commit of PR git-lfs#3823 we removed the
FastWalkGitRepo() function and renamed the FastWalkGitRepoAll() function
to FastWalkDir().  We also simplified some of the internal functions
called by the FastWalkDir() function because they no longer needed to
detect or parse ".gitignore" files, or skip directory entries matching
the patterns from those files.

Although the two remaining use cases for the FastWalkDir() function
also did not need the function to detect and skip submodules or
directories named ".git", the logic to do so was retained in the
internal functions of the "tools" package.

Specifically, the fastWalkWithExcludeFiles() function, which is called
by the FastWalkDir() function, establishes two file path filters with
the patterns ".git" and "**/.git", which are then passed down to the
Walk() method of the "fastWalker" structure.  That method checks whether
the current directory entry's name matches either of the filter patterns,
and if it does returns immediately without processing the entry any
further.  In addition, the Walk() method still also performs the check
added in PR git-lfs#3686 to try to avoid traversing into Git submodules.

As noted above, however, neither of the two remaining callers of the
FastWalkDir() function requires these checks, because they only need
the function to traverse directory hierarchies within the ".git/lfs"
directory.

The cleanupTmp() method of the Filesystem structure in our "fs" package
uses the FastWalkDir() function to find stale temporary files within the
".git/lfs/tmp" directory.  The EachObject() method of the same structure,
meanwhile, uses the FastWalkDir() function to invoke a callback function
for each object file under the ".git/lfs/objects" directory.

In both cases, the directory hierarchy traversed by the FastWalkDir()
function is entirely within the ".git/lfs" directory, so there is no
value in trying to exclude ".git" directories while performing the
traversal.

We therefore now simplify the FastWalkDir() function and the internal
functions it invokes by removing the unnecessary checks for submodules
and for directory entries named ".git".

First, we update the fastWalkWithExcludeFiles() function so that it does
not initialize any file path filters, and we rename the function to
fastWalkDir().

Next, we remove the Walk() method's "excludePaths" parameter, and alter
the method so that it no longer skips directory entries based on whether
their names match the patterns in that parameter's file path filters.

We also eliminate the check in the Walk() method for directories
containing ".git" directory entries, as this check's only purpose was
to skip Git submodules within a working tree.

As well, we revise the code comments relating to all of these functions
and methods to reflect their new names and their simplified behaviours.
To minimize the changes in this commit, however, we leave the names of
the functions' parameters and internal variables intact, even though
some of them still reflect the original design and the expectation that
the functions would be used with Git working trees.  In a subsequent
commit in this PR we will then rename these variables and parameters,
along with the "rootDir" field of the "fastWalker" structure, so that
they more accurately represent the functions' current purpose and
implementation.

Finally, note that the "cleans only temp files and directories older
than an hour" test in our "t/t-tempfile.sh" shell test script verifies
the behaviour of the cleanupTmp() function, which employs the FastWalkDir()
function, while the TestFastWalkBasic() test function in our Go test
suite directly exercises the fastWalkDir() internal function.  Both
tests thus provide some assurance that our changes in this commit have
not introduced any unexpected regressions.
chrisd8088 added a commit to chrisd8088/git-lfs that referenced this pull request Mar 17, 2026
In commit 13a8af6 of PR git-lfs#1616 we
added a FastWalkGitRepo() function to our "tools" package for the
purpose of improving the performance of the "git lfs track" command.
At the time, this command used the Walk() function of the "filepath"
package from the Go standard library to traverse the contents of the
current Git working tree and locate all ".gitattributes" files.

Then in commit f1fdc85 of the same
PR git-lfs#1616 we updated the "git lfs track" command's findAttributeFiles()
function to use our new FastWalkGitRepo() function instead of the
"filepath" package's Walk() function.  This change made searches for
".gitattributes" files in large repositories faster for several reasons.
Unlike the Walk() function from the "filepath" package, the functions
called by the FastWalkGitRepo() function to traverse a directory
hierarchy did not sort the entries in each directory, and also ignored
all ".git" directories and all entries which matched any pattern found
in a ".gitignore" file.

Later, in PRs git-lfs#1870 and git-lfs#2689, we expanded the number of callers of
the FastWalkGitRepo() function.  In particular, in PR git-lfs#2689 we began
to make use of the function in our "git lfs prune" command to locate
all the object files in our local storage directories, which by default
are located within the ".git/lfs/objects" directory.  As well, with
this PR we began to use the FastWalkGitRepo() function during the
final phase of all Git LFS commands where we find and delete any
stale temporary files from the directory where we store such files,
which by default is the ".git/lfs/tmp" directory.

PR git-lfs#2689 introduced our "fs" package, whose Filesystem structure's
EachObject() and cleanupTmp() methods both called the FastWalkGitRepo()
function, passing the paths to the local object storage directory
and the local temporary file storage directory, respectively.

We then added a FastWalkGitRepoAll() function to our "tools" package in
PR git-lfs#3190.  This function operated in a similar fashion to the
FastWalkGitRepo() function but did not read ".gitignore" files and so
also did not skip directory entries matching any patterns found in
those files.

Next, in PR git-lfs#3686, we updated the internal implementation of the
FastWalkGitRepo() and FastWalkGitRepoAll() functions to avoid entering
submodules when traversing a Git working tree.  To make this change,
we added a check to the "fastWalker" structure's Walk() method so it
would return immediately when processing a directory if the directory
contained an entry named ".git", unless the directory was the root of
the working tree.  Note that the Walk() method already ignored any
directory entries with the name ".git", but this only meant it would
traverse through the contents of a submodule checkout in a working
tree while skipping the submodule's ".git" directory.

Then in commit 83d7f76 of PR git-lfs#3823
we first implemented the NewLsFiles() function in our "git" package,
which executes a "git ls-files" command and returns the list of files
it outputs.  As well, we updated the findAttributeFiles() function of
our "git" package and the fixFileWriteFlags() function in our "locking"
package to both call the NewLsFiles() function instead of calling
either the FastWalkGitRepo() or FastWalkGitRepoAll() functions.

Since these were the only instances where we actually used the
FastWalkGitRepo() or FastWalkGitRepoAll() functions to traverse a Git
working tree, in the same commit of PR git-lfs#3823 we removed the
FastWalkGitRepo() function and renamed the FastWalkGitRepoAll() function
to FastWalkDir().  We also simplified some of the internal functions
called by the FastWalkDir() function because they no longer needed to
detect or parse ".gitignore" files, or skip directory entries matching
the patterns from those files.

Although the two remaining use cases for the FastWalkDir() function
also did not need the function to detect and skip submodules or
directories named ".git", the logic to do so was retained in the
internal functions of the "tools" package.

Specifically, the fastWalkWithExcludeFiles() function, which is called
by the FastWalkDir() function, establishes two file path filters with
the patterns ".git" and "**/.git", which are then passed down to the
Walk() method of the "fastWalker" structure.  That method checks whether
the current directory entry's name matches either of the filter patterns,
and if it does returns immediately without processing the entry any
further.  In addition, the Walk() method still also performs the check
added in PR git-lfs#3686 to try to avoid traversing into Git submodules.

As noted above, however, neither of the two remaining callers of the
FastWalkDir() function requires these checks, because they only need
the function to traverse directory hierarchies which the Git LFS client
has created, and which are specifically not Git working trees.

The cleanupTmp() method of the Filesystem structure in our "fs" package
uses the FastWalkDir() function to find stale files within the local
temporary file storage directory.  The EachObject() method of the same
structure, meanwhile, uses the FastWalkDir() function to invoke a
callback function for each object file in the local object storage
directory.

By default, these local storage directories are the ".git/lfs/tmp" and
".git/lfs/objects" directories.  If the "lfs.storage" configuration
option is set to a relative path, then these directories will be
located somewhere within the ".git" directory, while if the option is
set to an absolute path, our local storage directories will be located
under that arbitrary location.

In all these cases the Git LFS client creates and manages these local
storage directories, so we can expect them not to contain ".git"
directories or submodules.  This is true even when a user has
configured the "lfs.storage" option with an absolute path, since
the client is still responsible for creating and managing "tmp" and
"objects" directories within that arbitrary location.

We therefore now simplify the FastWalkDir() function and the internal
functions it invokes by removing the unnecessary checks for submodules
and for directory entries named ".git".

First, we update the fastWalkWithExcludeFiles() function so that it does
not initialize any file path filters, and we rename the function to
fastWalkDir().

Next, we remove the Walk() method's "excludePaths" parameter, and alter
the method so that it no longer skips directory entries based on whether
their names match the patterns in that parameter's file path filters.

We also eliminate the check in the Walk() method for directories
containing ".git" directory entries, as this check's only purpose was
to skip Git submodules within a working tree.

As well, we revise the code comments relating to all of these functions
and methods to reflect their new names and their simplified behaviours.
To minimize the changes in this commit, however, we leave the names of
the functions' parameters and internal variables intact, even though
some of them still reflect the original design and the expectation that
the functions would be used with Git working trees.  In a subsequent
commit in this PR we will then rename these variables and parameters,
along with the "rootDir" field of the "fastWalker" structure, so that
they more accurately represent the functions' current purpose and
implementation.

Finally, note that the "cleans only temp files and directories older
than an hour" test in our "t/t-tempfile.sh" shell test script verifies
the behaviour of the cleanupTmp() function, which employs the FastWalkDir()
function, while the TestFastWalkBasic() test function in our Go test
suite directly exercises the fastWalkDir() internal function.  Both
tests thus provide some assurance that our changes in this commit have
not introduced any unexpected regressions.