Merged
Conversation
ttaylorr
approved these changes
Sep 12, 2017
chrisd8088
added a commit
to chrisd8088/git-lfs
that referenced
this pull request
Mar 28, 2025
Our t/t-pre-push.sh and t/t-push-failures-local.sh test scripts include a number of tests which validate the behaviour of the "git lfs pre-push" and "git push" commands when Git LFS objects are missing or corrupt. While these tests are largely similar in their implementation, they vary in a number of formatting and implementation details unrelated to the specifics of the different conditions they simulate. These variations are artifacts of the evolution of our test suite over time; for example, the tests in the t/t-push-failures-local.sh script were refactored and collected from several earlier tests in commit 4d52e08 of PR git-lfs#3109, and the original versions of the tests in the t/t-pre-push.sh script were added incrementally in PRs git-lfs#447, git-lfs#2199, and git-lfs#2574. In a subsequent commit in this PR we expect to update the Git LFS client to remove some non-functional code which attempts to recreate missing Git LFS objects during push operations. In many cases this change will cause the client to report missing objects in an earlier stage of push operations than it does now. We also expect to reword the error message the client will output in such cases. Before we make these changes, we first revise the related tests in our test suite so they are as simple and similar as possible. This will ensure that when we update the Git LFS client we can clearly identify the changes that we need to make in our tests to accommodate the client's new behaviour. In the "pre-push with missing pointer not on server" test in the t/t-pre-push.sh script we remove the use of the "set +e" shell option, which is not necessary to avoid test failure. Although we expect the "git lfs pre-push" command to fail, its output is piped into the tee(1) command, and when the "set -e" option is in effect the shell will exit immediately if the last command in a pipeline returns a non-zero exit code, but not if other commands in the pipeline return such a code. We do, however, add a check that the "git lfs pre-push" command exits with a non-zero code by testing the appropriate PIPESTATUS array value, following the example of the other related tests. These other tests include the "pre-push with missing and present pointers (lfs.allowincompletepush true)" and "pre-push reject missing pointers (lfs.allowincompletepush default)" tests in the t/t-pre-push.sh script, plus the four tests in the t/t-push-failures-local.sh script. In all these tests we adjust the error messages generated in the case that a "git lfs pre-push" or "git push" command fails or succeeds unexpectedly. We rewrite these messages so they are consistent with each other and with those found in many of our other test scripts. Note that in the "push reject missing objects (lfs.allowincompletepush false)" test we also correct the message reported if the "git push" command were to succeed unexpectedly. At present, the message states that we expect the command to succeed, but we actually expect it to fail. Next, in the "push reject missing objects (lfs.allowincompletepush default)" and "push reject corrupt objects (lfs.allowincompletepush default)" tests, we update our checks of the "git push" command's exit code so that we confirm the command exits with a specific non-zero exit code of 1 rather than simply checking that its exit code is not zero. This change brings these checks into alignment with those made by the other related tests. Lastly, we remove and adjust some whitespace so that these tests are all formatted in a similar manner to each other.
chrisd8088
added a commit
to chrisd8088/git-lfs
that referenced
this pull request
Mar 28, 2025
Our t/t-pre-push.sh and t/t-push-failures-local.sh test scripts include a number of tests which validate the behaviour of the "git lfs pre-push" and "git push" commands when Git LFS objects are missing or corrupt. While these tests are largely similar in their implementation, they vary in a number of formatting and implementation details unrelated to the specifics of the different conditions they simulate. These variations are artifacts of the evolution of our test suite over time; for example, the tests in the t/t-push-failures-local.sh script were refactored and collected from several earlier tests in commit 4d52e08 of PR git-lfs#3109, and the original versions of the tests in the t/t-pre-push.sh script were added incrementally in PRs git-lfs#447, git-lfs#2199, and git-lfs#2574. In a subsequent commit in this PR we expect to update the Git LFS client to remove some non-functional code which attempts to recreate missing Git LFS objects during push operations. In many cases this change will cause the client to report missing objects in an earlier stage of push operations than it does now. We also expect to reword the error message the client will output in such cases. Before we make these changes, we first revise the related tests in our test suite so they are as simple and similar as possible. This will ensure that when we update the Git LFS client we can clearly identify the changes that we need to make in our tests to accommodate the client's new behaviour. In a pair of tests in t/t-pre-push.sh script, and in another pair of tests in the t/t-push-failures-local.sh script, the tests' initial setup code creates several Git LFS objects and then removes the object file in the .git/lfs/objects directory hierarchy for one of them. In each case, we can replace this code with a simple call to our delete_local_object() test helper function, which performs the equivalent action of removing an object's file from the internal Git LFS storage directories.
chrisd8088
added a commit
to chrisd8088/git-lfs
that referenced
this pull request
Mar 28, 2025
Our t/t-pre-push.sh test script includes a number of tests which validate the behaviour of the "git lfs pre-push" command when Git LFS objects are present, missing, or corrupt. While these tests are largely similar in their implementation, they vary in a number of formatting and implementation details unrelated to the specifics of the different conditions they simulate. These variations are artifacts of the evolution of our test suite over time; for example, the original versions of the tests in the t/t-pre-push.sh script were added incrementally in PRs git-lfs#447, git-lfs#2199, and git-lfs#2574. In a subsequent commit in this PR we expect to update the Git LFS client to remove some non-functional code which attempts to recreate missing Git LFS objects during push operations. In many cases this change will cause the client to report missing objects in an earlier stage of push operations than it does now. We also expect to reword the error message the client will output in such cases. Before we make these changes, we first revise the related tests in our test suite so they are as simple and similar as possible. This will ensure that when we update the Git LFS client we can clearly identify the changes that we need to make in our tests to accommodate the client's new behaviour. The "pre-push with existing file" and "pre-push with existing pointer" tests in the t/t-pre-push.sh script check similar conditions, with the only difference betwen that the latter test does not invoke the Git LFS client's "clean" filter and instead creates an object file in the .git/lfs/objects storage directories directly. Like the "pre-push with existing file" test, though, the "pre-push with existing pointer" test then pushes the new Git LFS object to the remote server and expect the object to be transferred successfully. However, unlike the "pre-push with existing file" test, the "pre-push with existing pointer" test does not confirm that the object exists in the remote server's storage,, so we add that confirmation step now using an assert_server_object() test helper function call identical to the one made at the end of the "pre-push with existing file" test.
chrisd8088
added a commit
to chrisd8088/git-lfs
that referenced
this pull request
Mar 28, 2025
Our t/t-pre-push.sh test script includes a number of tests which validate the behaviour of the "git lfs pre-push" command when Git LFS objects are present, missing, or corrupt. While these tests are largely similar in their implementation, they vary in a number of formatting and implementation details unrelated to the specifics of the different conditions they simulate. These variations are artifacts of the evolution of our test suite over time; for example, the original versions of the tests in the t/t-pre-push.sh script were added incrementally in PRs git-lfs#447, git-lfs#2199, and git-lfs#2574. In a subsequent commit in this PR we expect to update the Git LFS client to remove some non-functional code which attempts to recreate missing Git LFS objects during push operations. In many cases this change will cause the client to report missing objects in an earlier stage of push operations than it does now. We also expect to reword the error message the client will output in such cases. Before we make these changes, we first revise the related tests in our test suite so they are as simple and similar as possible. This will ensure that when we update the Git LFS client we can clearly identify the changes that we need to make in our tests to accommodate the client's new behaviour. Some of the tests in the t/t-pre-push.sh script which verify the behaviour of the "git lfs pre-push" command when Git LFS objects are present or missing use an older form of the common initial test repository setup steps, where the basename(1) command is called to generate a prefix from the script's name for each test repository's name on the remote server, but then a fixed string without that prefix is used when cloning the repository from the remote. This older approach to establishing a test repository is not uncommon in our test suite and is entirely functional. However, we expect to rename a number of these particular tests in the t/t-pre-push.sh script in a subsequent commit in this PR, and also revise the repository names used by the tests at the same time, to help make the tests as consistent with each other as possible. Therefore, before renaming these tests and their repositories, we adjust their initial setup steps to use follow the same pattern as, for example, the related tests in the same script and also those found in the t/t-push-failures-local.sh script.
chrisd8088
added a commit
to chrisd8088/git-lfs
that referenced
this pull request
Mar 28, 2025
Our t/t-pre-push.sh and t/t-push-failures-local.sh test scripts include a number of tests which validate the behaviour of the "git lfs pre-push" and "git push" commands when Git LFS objects are present, missing, or corrupt. While these tests are largely similar in their implementation, they vary in a number of formatting and implementation details unrelated to the specifics of the different conditions they simulate. These variations are artifacts of the evolution of our test suite over time; for example, the tests in the t/t-push-failures-local.sh script were refactored and collected from several earlier tests in commit 4d52e08 of PR git-lfs#3109, and the original versions of the tests in the t/t-pre-push.sh script were added incrementally in PRs git-lfs#447, git-lfs#2199, and git-lfs#2574. In a subsequent commit in this PR we expect to update the Git LFS client to remove some non-functional code which attempts to recreate missing Git LFS objects during push operations. In many cases this change will cause the client to report missing objects in an earlier stage of push operations than it does now. We also expect to reword the error message the client will output in such cases. Before we make these changes, we first revise the related tests in our test suite so they are as simple and similar as possible. This will ensure that when we update the Git LFS client we can clearly identify the changes that we need to make in our tests to accommodate the client's new behaviour. In previous commits in this PR we have revised and reformatted these tests to increase their consistency with each other. One additional adjustment we make now to further increase this consistency, and to provide greater clarity as to each test's purpose, is to rename both the tests and test repositories they create and clone. Note that since these tests typically only create a single missing or corrupt object, we now use the singular word "object" rather than "objects" in the test and repository names.
chrisd8088
added a commit
to chrisd8088/git-lfs
that referenced
this pull request
Mar 28, 2025
The original version of the ensureFile() method of the uploadContext structure in our "commands" package was first introduced in commit 5d239d6 of PR git-lfs#176 in 2015, and has been refactored a number of times since then, but continues to be called for each object a push operation intends to upload. The method is documented as a function that checks whether a Git LFS object file exists in the local .git/lfs/objects storage directories for a given pointer, and if it does not, tries to replace the missing object file by passing the contents of the corresponding file in the working tree through our "clean" filter. The description of PR git-lfs#176 explains the function's purpose as follows, using the original pre-release name for the Git LFS project: If the .git/hawser/objects directory gets into a weird state (for example, if the user manually removed some files in there), this attempts to re-clean the objects based on the git repository file path. The code comments preceding the ensureFile() method also describe it in the same way, as do the notes in later PRs, such as PR git-lfs#2574, in which the "lfs.allowIncompletePush" configuration option was introduced, and PR git-lfs#3398, which refined the error message the Git LFS client reports during a push operation when an object is missing locally and is also not present on the remote server. However, the ensureFile() method has never actually replaced missing object files under any circumstances. It does check whether an object file is missing from the local storage directories, and if not, tests whether a file exists in the current working tree at the path of the Git LFS pointer associated with the object. If such a file exists, the method proceeds to run the Clean() method of the GitFilter structure in our "lfs" package on the file's contents. The Clean() method calculates the SHA-256 hash value of the file's contents and creates a Pointer structure containing this hash value, and also writes a copy of the file's data into a temporary file in the .git/lfs/tmp directory. It is then the responsibility of the caller to determine whether or not this temporary file should be moved into place in the .git/lfs/objects directory hierarchy. The only other caller of the Clean() method, besides the ensureFile() method, is the clean() function in the "commands" package, which is used by multiple Git LFS commands including the "git lfs clean" and "git lfs filter-process" plumbing commands, as well as the "git lfs migrate import" command. The clean() function performs several tasks after invoking the Clean() method. First, it checks whether the file processed by the method was found to contain a Git LFS pointer; if so, no further action is taken as we assume the file in the working tree has not been passed through our "smudge" filter, and we do not want to create another pointer which simply hashes and references the existing one. Next, the clean() function checks whether a file already exists in the local .git/lfs/objects storage directories at the location into which the function would otherwise expect to move the temporary file created by the Clean() method. If a file does exist in this location and has the same size as the temporary file, no further action is taken, as we assume it contains the same contents and does not need to be updated. Assuming neither of these checks causes the clean() function to return early, the function moves the temporary file created by the Clean() method into the expected location within the .git/lfs/objects directory hierarchy. Unfortunately, because the ensureFile() method invokes the Clean() method of the GitFilter structure but never performs any of the subsequent steps taken by the clean() function, it does not ever actually recreate a missing Git LFS object from a file found in the working tree. This appears to have been the case at the time the ensureFile() method was introduced in PR git-lfs#176, and has remained so ever since. We could attempt to remedy this situation by altering the ensureFile() method so it calls the clean() function. To do so it would need to simulate the conditions under which the function usually runs, specifically within the "clean" filter context where the function is expected to transform an input data stream into an output data stream. We would likely use a Discard structure from the "io" package of the standard Go library to simply discard the output from the clean() function, as we do not need to send it back to Git in the way the "git lfs filter-process" or "git lfs clean" commands do. However, we would have to add logic to the clean() function guard against the case where the file in the working tree had different contents than those of the missing Git LFS object. Because the user may check out a Git reference in which a different file exists at the same path in the working tree, or may simply modify the file in the working tree independently, there is no guarantee that the file we pass through the Clean() method is identical to the one from which the missing Git LFS object was created. The original implementation of the ensureFile() function, although it did not fulfil its stated purpose, did include a check to verify that the SHA hash of the working tree file, as returned by the Clean() method, matched that of the missing object. This check was removed in commit 338ab40 of PR git-lfs#1812, which would likely have introduced a serious bug, except that the ensureFile() method never actually replaced any missing objects and so the removal of this check had no functional impact. While we could try to revise the ensureFile() method to operate as was originally intended, the advantages of such a change are relatively slim, and the disadvantages are several. Most obviously, it requires modifications to our clean() function to guard against the replacement of object files with incorrect data, something the other callers of the function do not need to be concerned about. The fact that this is a concern at all is in turn due to the reasonable chance that a file found in the current working tree at a given path does not contain the identical data as that of an Git LFS object generated from another file previously located at the same path. As well, the fact that the ensureFile() method has never worked as designed, despite being repeatedly refactored and enhanced over ten years, suggests that its purpose is somewhat obscure and that the requisite logic less intelligible than is ideal. Users and developers expect push operations to involve the transfer of data but not the creation (or re-creation) of local data files, so the use of our "clean" filter in such a context is not particularly intuitive. For all these reasons, we just remove the ensureFile() method entirely, which simplifies our handling of missing objects during upload transfer operations. Instead, we check for the presence of each object file we intend to push in the uploadTransfer() method of our uploadContext structure, and if a file is not found in the local storage directories, we flag it as missing, unless the "lfs.allowIncompletePush" configuration option is set to "true". One consequence of this change is that when objects to be uploaded are missing locally and are also not already present on the remote server, and when the "lfs.allowIncompletePush" Git configuration option is set to its default value of "false", we now always return a consistent error message. This occurs because, under these conditions, we now always abandon the upload operation after the Git LFS Batch API request in which we determine that an object which is missing locally is also not available on the remote server. Previously, we would sometimes continue past this stage in the upload process, and only report a failure when we tried to upload the object's data. This occurred if the ensureFile() method had found a file in the working tree at the same path as that of the missing object (regardless of that file's contents), and thus returned a "false" value to indicate that the object had been recreated and was no longer missing. Because the Git LFS client's behaviour is now more consistent when handling missing objects during a push operation, we update several tests in the t/t-pre-push.sh and t/t-push-failures-local.sh test scripts to reflect this change. In the "pre-push reject missing object" test in the t/t-pre-push.sh script and the "push reject missing object (lfs.allowincompletepush default)" test in the t/t-push-failures-local.sh script, we update the error messages expected by the tests because the client now reports an error as soon as it detects that the missing object is not present on the remote server. In the case of the latter test, this also means that we now expect that no objects are uploaded to the server, because the client abandons the upload process after the Batch API request from which it learns that the missing object is also not present on the server, rather than proceeding past this stage and uploading the object for which a local copy is available. In a previous commit in this PR we altered two other tests, in which the "lfs.allowIncompletePush" option is set to "true", because those tests only needed to remove a file from the local .git/lfs/objects storage directories to simulate the condition where a Git LFS object file is missing. These tests did not also need to remove the Git LFS pointer associated with the object from the repository's most recent commit, or remove the copy of the object file's contents from the current working tree. We can now make the same changes to the two remaining tests that perform these additional setup steps, namely the "pre-push reject missing object (lfs.allowincompletepush default)" test in the t/t-pre-push.sh script and the "push reject missing object (lfs.allowincompletepush false)" test in the t/t-push-failures-local.sh script. Because it is no longer necessary to remove a working tree file in order to avoid causing the ensureFile() method to report that the corresponding Git LFS object is actually missing, we can simplify the setup steps in these tests and bring them back into alignment with those of the related tests where the "lfs.allowIncompletePush" option is set to "true".
chrisd8088
added a commit
to chrisd8088/git-lfs
that referenced
this pull request
Mar 28, 2025
Our t/t-pre-push.sh and t/t-push-failures-local.sh test scripts include a number of tests which validate the behaviour of the "git lfs pre-push" and "git push" commands when Git LFS objects are missing or corrupt. While these tests are largely similar in their implementation, they vary in a number of formatting and implementation details unrelated to the specifics of the different conditions they simulate. These variations are artifacts of the evolution of our test suite over time; for example, the tests in the t/t-push-failures-local.sh script were refactored and collected from several earlier tests in commit 4d52e08 of PR git-lfs#3109, and the original versions of the tests in the t/t-pre-push.sh script were added incrementally in PRs git-lfs#447, git-lfs#2199, and git-lfs#2574. In a subsequent commit in this PR we expect to update the Git LFS client to remove some non-functional code which attempts to recreate missing Git LFS objects during push operations. In many cases this change will cause the client to report missing objects in an earlier stage of push operations than it does now. We also expect to reword the error message the client will output in such cases. Before we make these changes, we first revise the related tests in our test suite so they are as simple and similar as possible. This will ensure that when we update the Git LFS client we can clearly identify the changes that we need to make in our tests to accommodate the client's new behaviour. In the "pre-push with missing pointer not on server" test in the t/t-pre-push.sh script we remove the use of the "set +e" shell option, which is not necessary to avoid test failure. Although we expect the "git lfs pre-push" command to fail, its output is piped into the tee(1) command, and when the "set -e" option is in effect the shell will exit immediately if the last command in a pipeline returns a non-zero exit code, but not if other commands in the pipeline return such a code. We do, however, add a check that the "git lfs pre-push" command exits with a non-zero code by testing the appropriate PIPESTATUS array value, following the example of the other related tests. These other tests include the "pre-push with missing and present pointers (lfs.allowincompletepush true)" and "pre-push reject missing pointers (lfs.allowincompletepush default)" tests in the t/t-pre-push.sh script, plus the four tests in the t/t-push-failures-local.sh script. In all these tests we adjust the error messages generated in the case that a "git lfs pre-push" or "git push" command fails or succeeds unexpectedly. We rewrite these messages so they are consistent with each other and with those found in many of our other test scripts. Note that in the "push reject missing objects (lfs.allowincompletepush false)" test we also correct the message reported if the "git push" command were to succeed unexpectedly. At present, the message states that we expect the command to succeed, but we actually expect it to fail. Next, in the "push reject missing objects (lfs.allowincompletepush default)" and "push reject corrupt objects (lfs.allowincompletepush default)" tests, we update our checks of the "git push" command's exit code so that we confirm the command exits with a specific non-zero exit code of 1 rather than simply checking that its exit code is not zero. This change brings these checks into alignment with those made by the other related tests. Lastly, we remove and adjust some whitespace so that these tests are all formatted in a similar manner to each other.
chrisd8088
added a commit
to chrisd8088/git-lfs
that referenced
this pull request
Mar 28, 2025
Our t/t-pre-push.sh and t/t-push-failures-local.sh test scripts include a number of tests which validate the behaviour of the "git lfs pre-push" and "git push" commands when Git LFS objects are missing or corrupt. While these tests are largely similar in their implementation, they vary in a number of formatting and implementation details unrelated to the specifics of the different conditions they simulate. These variations are artifacts of the evolution of our test suite over time; for example, the tests in the t/t-push-failures-local.sh script were refactored and collected from several earlier tests in commit 4d52e08 of PR git-lfs#3109, and the original versions of the tests in the t/t-pre-push.sh script were added incrementally in PRs git-lfs#447, git-lfs#2199, and git-lfs#2574. In a subsequent commit in this PR we expect to update the Git LFS client to remove some non-functional code which attempts to recreate missing Git LFS objects during push operations. In many cases this change will cause the client to report missing objects in an earlier stage of push operations than it does now. We also expect to reword the error message the client will output in such cases. Before we make these changes, we first revise the related tests in our test suite so they are as simple and similar as possible. This will ensure that when we update the Git LFS client we can clearly identify the changes that we need to make in our tests to accommodate the client's new behaviour. In a pair of tests in t/t-pre-push.sh script, and in another pair of tests in the t/t-push-failures-local.sh script, the tests' initial setup code creates several Git LFS objects and then removes the object file in the .git/lfs/objects directory hierarchy for one of them. In each case, we can replace this code with a simple call to our delete_local_object() test helper function, which performs the equivalent action of removing an object's file from the internal Git LFS storage directories.
chrisd8088
added a commit
to chrisd8088/git-lfs
that referenced
this pull request
Mar 28, 2025
Our t/t-pre-push.sh test script includes a number of tests which validate the behaviour of the "git lfs pre-push" command when Git LFS objects are present, missing, or corrupt. While these tests are largely similar in their implementation, they vary in a number of formatting and implementation details unrelated to the specifics of the different conditions they simulate. These variations are artifacts of the evolution of our test suite over time; for example, the original versions of the tests in the t/t-pre-push.sh script were added incrementally in PRs git-lfs#447, git-lfs#2199, and git-lfs#2574. In a subsequent commit in this PR we expect to update the Git LFS client to remove some non-functional code which attempts to recreate missing Git LFS objects during push operations. In many cases this change will cause the client to report missing objects in an earlier stage of push operations than it does now. We also expect to reword the error message the client will output in such cases. Before we make these changes, we first revise the related tests in our test suite so they are as simple and similar as possible. This will ensure that when we update the Git LFS client we can clearly identify the changes that we need to make in our tests to accommodate the client's new behaviour. The "pre-push with existing file" and "pre-push with existing pointer" tests in the t/t-pre-push.sh script check similar conditions, with the only difference betwen that the latter test does not invoke the Git LFS client's "clean" filter and instead creates an object file in the .git/lfs/objects storage directories directly. Like the "pre-push with existing file" test, though, the "pre-push with existing pointer" test then pushes the new Git LFS object to the remote server and expect the object to be transferred successfully. However, unlike the "pre-push with existing file" test, the "pre-push with existing pointer" test does not confirm that the object exists in the remote server's storage,, so we add that confirmation step now using an assert_server_object() test helper function call identical to the one made at the end of the "pre-push with existing file" test.
chrisd8088
added a commit
to chrisd8088/git-lfs
that referenced
this pull request
Mar 28, 2025
Our t/t-pre-push.sh test script includes a number of tests which validate the behaviour of the "git lfs pre-push" command when Git LFS objects are present, missing, or corrupt. While these tests are largely similar in their implementation, they vary in a number of formatting and implementation details unrelated to the specifics of the different conditions they simulate. These variations are artifacts of the evolution of our test suite over time; for example, the original versions of the tests in the t/t-pre-push.sh script were added incrementally in PRs git-lfs#447, git-lfs#2199, and git-lfs#2574. In a subsequent commit in this PR we expect to update the Git LFS client to remove some non-functional code which attempts to recreate missing Git LFS objects during push operations. In many cases this change will cause the client to report missing objects in an earlier stage of push operations than it does now. We also expect to reword the error message the client will output in such cases. Before we make these changes, we first revise the related tests in our test suite so they are as simple and similar as possible. This will ensure that when we update the Git LFS client we can clearly identify the changes that we need to make in our tests to accommodate the client's new behaviour. Some of the tests in the t/t-pre-push.sh script which verify the behaviour of the "git lfs pre-push" command when Git LFS objects are present or missing use an older form of the common initial test repository setup steps, where the basename(1) command is called to generate a prefix from the script's name for each test repository's name on the remote server, but then a fixed string without that prefix is used when cloning the repository from the remote. This older approach to establishing a test repository is not uncommon in our test suite and is entirely functional. However, we expect to rename a number of these particular tests in the t/t-pre-push.sh script in a subsequent commit in this PR, and also revise the repository names used by the tests at the same time, to help make the tests as consistent with each other as possible. Therefore, before renaming these tests and their repositories, we adjust their initial setup steps to use follow the same pattern as, for example, the related tests in the same script and also those found in the t/t-push-failures-local.sh script.
chrisd8088
added a commit
to chrisd8088/git-lfs
that referenced
this pull request
Mar 28, 2025
Our t/t-pre-push.sh and t/t-push-failures-local.sh test scripts include a number of tests which validate the behaviour of the "git lfs pre-push" and "git push" commands when Git LFS objects are present, missing, or corrupt. While these tests are largely similar in their implementation, they vary in a number of formatting and implementation details unrelated to the specifics of the different conditions they simulate. These variations are artifacts of the evolution of our test suite over time; for example, the tests in the t/t-push-failures-local.sh script were refactored and collected from several earlier tests in commit 4d52e08 of PR git-lfs#3109, and the original versions of the tests in the t/t-pre-push.sh script were added incrementally in PRs git-lfs#447, git-lfs#2199, and git-lfs#2574. In a subsequent commit in this PR we expect to update the Git LFS client to remove some non-functional code which attempts to recreate missing Git LFS objects during push operations. In many cases this change will cause the client to report missing objects in an earlier stage of push operations than it does now. We also expect to reword the error message the client will output in such cases. Before we make these changes, we first revise the related tests in our test suite so they are as simple and similar as possible. This will ensure that when we update the Git LFS client we can clearly identify the changes that we need to make in our tests to accommodate the client's new behaviour. In previous commits in this PR we have revised and reformatted these tests to increase their consistency with each other. One additional adjustment we make now to further increase this consistency, and to provide greater clarity as to each test's purpose, is to rename both the tests and test repositories they create and clone. Note that since these tests typically only create a single missing or corrupt object, we now use the singular word "object" rather than "objects" in the test and repository names.
chrisd8088
added a commit
to chrisd8088/git-lfs
that referenced
this pull request
Mar 28, 2025
The original version of the ensureFile() method of the uploadContext structure in our "commands" package was first introduced in commit 5d239d6 of PR git-lfs#176 in 2015, and has been refactored a number of times since then, but continues to be called for each object a push operation intends to upload. The method is documented as a function that checks whether a Git LFS object file exists in the local .git/lfs/objects storage directories for a given pointer, and if it does not, tries to replace the missing object file by passing the contents of the corresponding file in the working tree through our "clean" filter. The description of PR git-lfs#176 explains the function's purpose as follows, using the original pre-release name for the Git LFS project: If the .git/hawser/objects directory gets into a weird state (for example, if the user manually removed some files in there), this attempts to re-clean the objects based on the git repository file path. The code comments preceding the ensureFile() method also describe it in the same way, as do the notes in later PRs, such as PR git-lfs#2574, in which the "lfs.allowIncompletePush" configuration option was introduced, and PR git-lfs#3398, which refined the error message the Git LFS client reports during a push operation when an object is missing locally and is also not present on the remote server. However, the ensureFile() method has never actually replaced missing object files under any circumstances. It does check whether an object file is missing from the local storage directories, and if not, tests whether a file exists in the current working tree at the path of the Git LFS pointer associated with the object. If such a file exists, the method proceeds to run the Clean() method of the GitFilter structure in our "lfs" package on the file's contents. The Clean() method calculates the SHA-256 hash value of the file's contents and creates a Pointer structure containing this hash value, and also writes a copy of the file's data into a temporary file in the .git/lfs/tmp directory. It is then the responsibility of the caller to determine whether or not this temporary file should be moved into place in the .git/lfs/objects directory hierarchy. The only other caller of the Clean() method, besides the ensureFile() method, is the clean() function in the "commands" package, which is used by multiple Git LFS commands including the "git lfs clean" and "git lfs filter-process" plumbing commands, as well as the "git lfs migrate import" command. The clean() function performs several tasks after invoking the Clean() method. First, it checks whether the file processed by the method was found to contain a Git LFS pointer; if so, no further action is taken as we assume the file in the working tree has not been passed through our "smudge" filter, and we do not want to create another pointer which simply hashes and references the existing one. Next, the clean() function checks whether a file already exists in the local .git/lfs/objects storage directories at the location into which the function would otherwise expect to move the temporary file created by the Clean() method. If a file does exist in this location and has the same size as the temporary file, no further action is taken, as we assume it contains the same contents and does not need to be updated. Assuming neither of these checks causes the clean() function to return early, the function moves the temporary file created by the Clean() method into the expected location within the .git/lfs/objects directory hierarchy. Unfortunately, because the ensureFile() method invokes the Clean() method of the GitFilter structure but never performs any of the subsequent steps taken by the clean() function, it does not ever actually recreate a missing Git LFS object from a file found in the working tree. This appears to have been the case at the time the ensureFile() method was introduced in PR git-lfs#176, and has remained so ever since. We could attempt to remedy this situation by altering the ensureFile() method so it calls the clean() function. To do so it would need to simulate the conditions under which the function usually runs, specifically within the "clean" filter context where the function is expected to transform an input data stream into an output data stream. We would likely use a Discard structure from the "io" package of the standard Go library to simply discard the output from the clean() function, as we do not need to send it back to Git in the way the "git lfs filter-process" or "git lfs clean" commands do. However, we would have to add logic to the clean() function guard against the case where the file in the working tree had different contents than those of the missing Git LFS object. Because the user may check out a Git reference in which a different file exists at the same path in the working tree, or may simply modify the file in the working tree independently, there is no guarantee that the file we pass through the Clean() method is identical to the one from which the missing Git LFS object was created. The original implementation of the ensureFile() function, although it did not fulfil its stated purpose, did include a check to verify that the SHA hash of the working tree file, as returned by the Clean() method, matched that of the missing object. This check was removed in commit 338ab40 of PR git-lfs#1812, which would likely have introduced a serious bug, except that the ensureFile() method never actually replaced any missing objects and so the removal of this check had no functional impact. While we could try to revise the ensureFile() method to operate as was originally intended, the advantages of such a change are relatively slim, and the disadvantages are several. Most obviously, it requires modifications to our clean() function to guard against the replacement of object files with incorrect data, something the other callers of the function do not need to be concerned about. The fact that this is a concern at all is in turn due to the reasonable chance that a file found in the current working tree at a given path does not contain the identical data as that of an Git LFS object generated from another file previously located at the same path. As well, the fact that the ensureFile() method has never worked as designed, despite being repeatedly refactored and enhanced over ten years, suggests that its purpose is somewhat obscure and that the requisite logic less intelligible than is ideal. Users and developers expect push operations to involve the transfer of data but not the creation (or re-creation) of local data files, so the use of our "clean" filter in such a context is not particularly intuitive. For all these reasons, we just remove the ensureFile() method entirely, which simplifies our handling of missing objects during upload transfer operations. Instead, we check for the presence of each object file we intend to push in the uploadTransfer() method of our uploadContext structure, and if a file is not found in the local storage directories, we flag it as missing, unless the "lfs.allowIncompletePush" configuration option is set to "true". One consequence of this change is that when objects to be uploaded are missing locally and are also not already present on the remote server, and when the "lfs.allowIncompletePush" Git configuration option is set to its default value of "false", we now always return a consistent error message. This occurs because, under these conditions, we now always abandon the upload operation after the Git LFS Batch API request in which we determine that an object which is missing locally is also not available on the remote server. Previously, we would sometimes continue past this stage in the upload process, and only report a failure when we tried to upload the object's data. This occurred if the ensureFile() method had found a file in the working tree at the same path as that of the missing object (regardless of that file's contents), and thus returned a "false" value to indicate that the object had been recreated and was no longer missing. Because the Git LFS client's behaviour is now more consistent when handling missing objects during a push operation, we update several tests in the t/t-pre-push.sh and t/t-push-failures-local.sh test scripts to reflect this change. In the "pre-push reject missing object" test in the t/t-pre-push.sh script and the "push reject missing object (lfs.allowincompletepush default)" test in the t/t-push-failures-local.sh script, we update the error messages expected by the tests because the client now reports an error as soon as it detects that the missing object is not present on the remote server. In the case of the latter test, this also means that we now expect that no objects are uploaded to the server, because the client abandons the upload process after the Batch API request from which it learns that the missing object is also not present on the server, rather than proceeding past this stage and uploading the object for which a local copy is available. In a previous commit in this PR we altered two other tests, in which the "lfs.allowIncompletePush" option is set to "true", because those tests only needed to remove a file from the local .git/lfs/objects storage directories to simulate the condition where a Git LFS object file is missing. These tests did not also need to remove the Git LFS pointer associated with the object from the repository's most recent commit, or remove the copy of the object file's contents from the current working tree. We can now make the same changes to the two remaining tests that perform these additional setup steps, namely the "pre-push reject missing object (lfs.allowincompletepush default)" test in the t/t-pre-push.sh script and the "push reject missing object (lfs.allowincompletepush false)" test in the t/t-push-failures-local.sh script. Because it is no longer necessary to remove a working tree file in order to avoid causing the ensureFile() method to report that the corresponding Git LFS object is actually missing, we can simplify the setup steps in these tests and bring them back into alignment with those of the related tests where the "lfs.allowIncompletePush" option is set to "true".
chrisd8088
added a commit
to chrisd8088/git-lfs
that referenced
this pull request
Apr 3, 2025
Our t/t-pre-push.sh and t/t-push-failures-local.sh test scripts include a number of tests which validate the behaviour of the "git lfs pre-push" and "git push" commands when Git LFS objects are missing or corrupt. While these tests are largely similar in their implementation, they vary in a number of formatting and implementation details unrelated to the specifics of the different conditions they simulate. These variations are artifacts of the evolution of our test suite over time; for example, the tests in the t/t-push-failures-local.sh script were refactored and collected from several earlier tests in commit 4d52e08 of PR git-lfs#3109, and the original versions of the tests in the t/t-pre-push.sh script were added incrementally in PRs git-lfs#447, git-lfs#2199, and git-lfs#2574. In a subsequent commit in this PR we expect to update the Git LFS client to remove some non-functional code which attempts to recreate missing Git LFS objects during push operations. In many cases this change will cause the client to report missing objects in an earlier stage of push operations than it does now. We also expect to reword the error message the client will output in such cases. Before we make these changes, we first revise the related tests in our test suite so they are as simple and similar as possible. This will ensure that when we update the Git LFS client we can clearly identify the changes that we need to make in our tests to accommodate the client's new behaviour. A pair of tests in the t/t-push-failures-local.sh script, specifically the "push reject missing objects (lfs.allowincompletepush default)" and "push reject corrupt objects (lfs.allowincompletepush default)" tests, perform their setup of Git LFS objects and Git commits in a different sequence than the other tests in the same script. In order to align the code in all these tests as closely as possible, we simply revise the setup steps of the last two tests in the t/t-push-failures-local.sh script so they follow the same pattern as those of the other tests. This change has no functional effect; the only notable detail is that the tests now create all their Git LFS objects in a single commit instead of using a separate commit to create each object. As well, we reorder the lists of files we pass to the "git add" command in the first two tests in the t/t-push-failures-local.sh script so they now follow the same pattern as those of the other tests in both that script and in the t/t-pre-push.sh script. We also adjust the commit message used when creating Git LFS objects in a pair of tests in the t/t-pre-push.sh script, specifically the "pre-push with missing and present pointers (lfs.allowincompletepush true)" and "pre-push reject missing pointers (lfs.allowincompletepush default)" tests, so they are equivalent to those used in the related tests in the t/t-push-failures-local.sh script.
chrisd8088
added a commit
to chrisd8088/git-lfs
that referenced
this pull request
Apr 3, 2025
Our t/t-pre-push.sh test script includes a number of tests which validate the behaviour of the "git lfs pre-push" command when Git LFS objects are present, missing, or corrupt. While these tests are largely similar in their implementation, they vary in a number of formatting and implementation details unrelated to the specifics of the different conditions they simulate. These variations are artifacts of the evolution of our test suite over time; for example, the original versions of the tests in the t/t-pre-push.sh script were added incrementally in PRs git-lfs#447, git-lfs#2199, and git-lfs#2574. In a subsequent commit in this PR we expect to update the Git LFS client to remove some non-functional code which attempts to recreate missing Git LFS objects during push operations. In many cases this change will cause the client to report missing objects in an earlier stage of push operations than it does now. We also expect to reword the error message the client will output in such cases. Before we make these changes, we first revise the related tests in our test suite so they are as simple and similar as possible. This will ensure that when we update the Git LFS client we can clearly identify the changes that we need to make in our tests to accommodate the client's new behaviour. The "pre-push with existing file" and "pre-push with existing pointer" tests in the t/t-pre-push.sh script check similar conditions, with the only difference betwen that the latter test does not invoke the Git LFS client's "clean" filter and instead creates an object file in the .git/lfs/objects storage directories directly. Like the "pre-push with existing file" test, though, the "pre-push with existing pointer" test then pushes the new Git LFS object to the remote server and expect the object to be transferred successfully. However, unlike the "pre-push with existing file" test, the "pre-push with existing pointer" test does not confirm that the object exists in the remote server's storage, so we add that confirmation step now using an assert_server_object() test helper function call identical to the one made at the end of the "pre-push with existing file" test. Another pair of tests, the "pre-push with missing pointer not on server" test in the t/t-pre-push.sh script and the "push reject missing objects (lfs.allowincompletepush default)" test in the t/t-push-failures-local.sh script, also check similar conditions, with the difference that the former test never creates an object file in the .git/lfs/objects storage directories, while the latter allows an object file to be created by the "clean" filter, and then deletes it. In both cases, though, they expect the absent object file to cause a failure to be reported by the "git lfs pre-push" or "git push" commands, respectively. However, the "pre-push with missing pointer not on server" test, unlike the "push reject missing objects (lfs.allowincompletepush default)" test, does not call the refute_server_object() helper function to confirm the that the object does not exist in the remote server's storage after the push operation runs, and also does not check for the message "LFS upload failed" in the output of the push operation, so we add those checks now.
chrisd8088
added a commit
to chrisd8088/git-lfs
that referenced
this pull request
Apr 3, 2025
Our t/t-pre-push.sh test script includes a number of tests which validate the behaviour of the "git lfs pre-push" command when Git LFS objects are present, missing, or corrupt. While these tests are largely similar in their implementation, they vary in a number of formatting and implementation details unrelated to the specifics of the different conditions they simulate. These variations are artifacts of the evolution of our test suite over time; for example, the original versions of the tests in the t/t-pre-push.sh script were added incrementally in PRs git-lfs#447, git-lfs#2199, and git-lfs#2574. In a subsequent commit in this PR we expect to update the Git LFS client to remove some non-functional code which attempts to recreate missing Git LFS objects during push operations. In many cases this change will cause the client to report missing objects in an earlier stage of push operations than it does now. We also expect to reword the error message the client will output in such cases. Before we make these changes, we first revise the related tests in our test suite so they are as simple and similar as possible. This will ensure that when we update the Git LFS client we can clearly identify the changes that we need to make in our tests to accommodate the client's new behaviour. Some of the tests in the t/t-pre-push.sh script which verify the behaviour of the "git lfs pre-push" command when Git LFS objects are present or missing use an older form of the common initial test repository setup steps, where the basename(1) command is called to generate a prefix from the script's name for each test repository's name on the remote server, but then a fixed string without that prefix is used when cloning the repository from the remote. This older approach to establishing a test repository is not uncommon in our test suite and is entirely functional. However, we expect to rename a number of these particular tests in the t/t-pre-push.sh script in a subsequent commit in this PR, and also revise the repository names used by the tests at the same time, to help make the tests as consistent with each other as possible. Therefore, before renaming these tests and their repositories, we adjust their initial setup steps to use follow the same pattern as, for example, the related tests in the same script and also those found in the t/t-push-failures-local.sh script.
chrisd8088
added a commit
to chrisd8088/git-lfs
that referenced
this pull request
Apr 3, 2025
Our t/t-pre-push.sh and t/t-push-failures-local.sh test scripts include a number of tests which validate the behaviour of the "git lfs pre-push" and "git push" commands when Git LFS objects are present, missing, or corrupt. While these tests are largely similar in their implementation, they vary in a number of formatting and implementation details unrelated to the specifics of the different conditions they simulate. These variations are artifacts of the evolution of our test suite over time; for example, the tests in the t/t-push-failures-local.sh script were refactored and collected from several earlier tests in commit 4d52e08 of PR git-lfs#3109, and the original versions of the tests in the t/t-pre-push.sh script were added incrementally in PRs git-lfs#447, git-lfs#2199, and git-lfs#2574. In a subsequent commit in this PR we expect to update the Git LFS client to remove some non-functional code which attempts to recreate missing Git LFS objects during push operations. In many cases this change will cause the client to report missing objects in an earlier stage of push operations than it does now. We also expect to reword the error message the client will output in such cases. Before we make these changes, we first revise the related tests in our test suite so they are as simple and similar as possible. This will ensure that when we update the Git LFS client we can clearly identify the changes that we need to make in our tests to accommodate the client's new behaviour. In previous commits in this PR we have revised and reformatted these tests to increase their consistency with each other. One additional adjustment we make now to further increase this consistency, and to provide greater clarity as to each test's purpose, is to rename both the tests and test repositories they create and clone. Note that since these tests typically only create a single missing or corrupt object, we now use the singular word "object" rather than "objects" in the test and repository names.
chrisd8088
added a commit
to chrisd8088/git-lfs
that referenced
this pull request
Apr 3, 2025
The original version of the ensureFile() method of the uploadContext structure in our "commands" package was first introduced in commit 5d239d6 of PR git-lfs#176 in 2015, and has been refactored a number of times since then, but continues to be called for each object a push operation intends to upload. The method is documented as a function that checks whether a Git LFS object file exists in the local .git/lfs/objects storage directories for a given pointer, and if it does not, tries to replace the missing object file by passing the contents of the corresponding file in the working tree through our "clean" filter. The description of PR git-lfs#176 explains the function's purpose as follows, using the original pre-release name for the Git LFS project: If the .git/hawser/objects directory gets into a weird state (for example, if the user manually removed some files in there), this attempts to re-clean the objects based on the git repository file path. The code comments preceding the ensureFile() method also describe it in the same way, as do the notes in later PRs, such as PR git-lfs#2574, in which the "lfs.allowIncompletePush" configuration option was introduced, and PR git-lfs#3398, which refined the error message the Git LFS client reports during a push operation when an object is missing locally and is also not present on the remote server. However, the ensureFile() method has never actually replaced missing object files under any circumstances. It does check whether an object file is missing from the local storage directories, and if not, tests whether a file exists in the current working tree at the path of the Git LFS pointer associated with the object. If such a file exists, the method proceeds to run the Clean() method of the GitFilter structure in our "lfs" package on the file's contents. The Clean() method calculates the SHA-256 hash value of the file's contents and creates a Pointer structure containing this hash value, and also writes a copy of the file's data into a temporary file in the .git/lfs/tmp directory. It is then the responsibility of the caller to determine whether or not this temporary file should be moved into place in the .git/lfs/objects directory hierarchy. The only other caller of the Clean() method, besides the ensureFile() method, is the clean() function in the "commands" package, which is used by multiple Git LFS commands including the "git lfs clean" and "git lfs filter-process" plumbing commands, as well as the "git lfs migrate import" command. The clean() function performs several tasks after invoking the Clean() method. First, it checks whether the file processed by the method was found to contain a Git LFS pointer; if so, no further action is taken as we assume the file in the working tree has not been passed through our "smudge" filter, and we do not want to create another pointer which simply hashes and references the existing one. Next, the clean() function checks whether a file already exists in the local .git/lfs/objects storage directories at the location into which the function would otherwise expect to move the temporary file created by the Clean() method. If a file does exist in this location and has the same size as the temporary file, no further action is taken, as we assume it contains the same contents and does not need to be updated. Assuming neither of these checks causes the clean() function to return early, the function moves the temporary file created by the Clean() method into the expected location within the .git/lfs/objects directory hierarchy. Unfortunately, because the ensureFile() method invokes the Clean() method of the GitFilter structure but never performs any of the subsequent steps taken by the clean() function, it never recreates a missing Git LFS object from a file found in the working tree. This appears to have been the case at the time the ensureFile() method was introduced in PR git-lfs#176, and has remained so ever since. We could attempt to remedy this situation by altering the ensureFile() method so it calls the clean() function. To do so it would need to simulate the conditions under which the function usually runs, specifically within the "clean" filter context where the function is expected to transform an input data stream into an output data stream. We would likely use a Discard structure from the "io" package of the standard Go library to simply discard the output from the clean() function, as we do not need to send it back to Git in the way the "git lfs filter-process" or "git lfs clean" commands do. However, we would have to add logic to the clean() function to guard against the case where the file in the working tree had different contents than those of the missing Git LFS object. Because the user may check out a Git reference in which a different file exists at the same path in the working tree, or may simply modify the file in the working tree independently, there is no guarantee that the file we pass through the Clean() method is identical to the one from which the missing Git LFS object was created. The original implementation of the ensureFile() function, although it did not fulfil its stated purpose, did include a check to verify that the SHA hash of the working tree file, as returned by the Clean() method, matched that of the missing object. This check was removed in commit 338ab40 of PR git-lfs#1812, which would likely have introduced a serious bug, except that the ensureFile() method never actually replaced any missing objects and so the removal of this check had no functional impact. While we could try to revise the ensureFile() method to operate as was originally intended, the advantages of such a change are relatively slim, and the disadvantages are several. Most obviously, it requires modifications to our clean() function to guard against the replacement of object files with incorrect data, something the other callers of the function do not need to be concerned about. That this is a concern at all is in turn due to the reasonable chance that a file found in the current working tree at a given path does not contain the identical data as that of an Git LFS object generated from another file previously located at the same path. As well, the fact that the ensureFile() method has never worked as designed, despite being repeatedly refactored and enhanced over ten years, suggests that its purpose is somewhat obscure and that the requisite logic less intelligible than is ideal. Users and developers expect push operations to involve the transfer of data but not the creation (or re-creation) of local data files, so the use of our "clean" filter in such a context is not particularly intuitive. For all these reasons, we just remove the ensureFile() method entirely, which simplifies our handling of missing objects during upload transfer operations. Instead, we check for the presence of each object file we intend to push in the uploadTransfer() method of our uploadContext structure, and if a file is not found in the local storage directories, we flag it as missing, unless the "lfs.allowIncompletePush" configuration option is set to "true". We also use the IsNotExist() function from the "os" package in the Go standard library to ascertain whether an object file is missing, or if some other type of error prevents us from reading its state. This mirrors the more detailed checks performed on each object file during a push operation by the partitionTransfers() method of the TransferQueue structure in the "tq" package. One consequence of this change is that when an object to be uploaded is missing locally and is also not already present on the remote server, and when the "lfs.allowIncompletePush" Git configuration option is set to its default value of "false", we now always abandon a push operation after the remote server indicates that it expects the client to upload the object. This means that in the "push reject missing object (lfs.allowincompletepush default)*" tests in our t/t-push-failures-local.sh test script, we should now expect to find a trace log message output by the client stating that the push operation's batch queue has been stopped because an object is missing on both the local system and the remote server. We added this trace log message in a prior commit in this PR, and were able to insert checks for it in several other tests in our test suite, but only because those tests either did not create a file in the working tree at all, as in the case of the "pre-push reject missing object" test in our t/t-pre-push.sh script, or removed the file in the working tree that corresponded to the object file they removed from the .git/lfs/objects storage directories. As a result of the changes in this commit, we can also now simplify the three tests that performed this extra setup step, where they used to remove the file in the working tree which corresponded to the object file they removed from the local Git LFS storage directories. This step is no longer necessary to cause the client to abandon the push operation after the server indicates that it requires an upload of an object the client has determined is missing from the local system. Therefore we can remove these extra setup steps from both of the "push reject missing object (lfs.allowincompletepush false)*" tests in the t/t-push-failures-local.sh script, and from the "pre-push reject missing object (lfs.allowincompletepush default)" test in the t/t-pre-push.sh script.
chrisd8088
added a commit
to chrisd8088/git-lfs
that referenced
this pull request
Apr 3, 2025
The original version of the ensureFile() method of the uploadContext structure in our "commands" package was first introduced in commit 5d239d6 of PR git-lfs#176 in 2015, and has been refactored a number of times since then, but continues to be called for each object a push operation intends to upload. The method is documented as a function that checks whether a Git LFS object file exists in the local .git/lfs/objects storage directories for a given pointer, and if it does not, tries to replace the missing object file by passing the contents of the corresponding file in the working tree through our "clean" filter. The description of PR git-lfs#176 explains the function's purpose as follows, using the original pre-release name for the Git LFS project: If the .git/hawser/objects directory gets into a weird state (for example, if the user manually removed some files in there), this attempts to re-clean the objects based on the git repository file path. The code comments preceding the ensureFile() method also describe it in the same way, as do the notes in later PRs, such as PR git-lfs#2574, in which the "lfs.allowIncompletePush" configuration option was introduced, and PR git-lfs#3398, which refined the error message the Git LFS client reports during a push operation when an object is missing locally and is also not present on the remote server. However, the ensureFile() method has never actually replaced missing object files under any circumstances. It does check whether an object file is missing from the local storage directories, and if not, tests whether a file exists in the current working tree at the path of the Git LFS pointer associated with the object. If such a file exists, the method proceeds to run the Clean() method of the GitFilter structure in our "lfs" package on the file's contents. The Clean() method calculates the SHA-256 hash value of the file's contents and creates a Pointer structure containing this hash value, and also writes a copy of the file's data into a temporary file in the .git/lfs/tmp directory. It is then the responsibility of the caller to determine whether or not this temporary file should be moved into place in the .git/lfs/objects directory hierarchy. The only other caller of the Clean() method, besides the ensureFile() method, is the clean() function in the "commands" package, which is used by multiple Git LFS commands including the "git lfs clean" and "git lfs filter-process" plumbing commands, as well as the "git lfs migrate import" command. The clean() function performs several tasks after invoking the Clean() method. First, it checks whether the file processed by the method was found to contain a Git LFS pointer; if so, no further action is taken as we assume the file in the working tree has not been passed through our "smudge" filter, and we do not want to create another pointer which simply hashes and references the existing one. Next, the clean() function checks whether a file already exists in the local .git/lfs/objects storage directories at the location into which the function would otherwise expect to move the temporary file created by the Clean() method. If a file does exist in this location and has the same size as the temporary file, no further action is taken, as we assume it contains the same contents and does not need to be updated. Assuming neither of these checks causes the clean() function to return early, the function moves the temporary file created by the Clean() method into the expected location within the .git/lfs/objects directory hierarchy. Unfortunately, because the ensureFile() method invokes the Clean() method of the GitFilter structure but never performs any of the subsequent steps taken by the clean() function, it never recreates a missing Git LFS object from a file found in the working tree. This appears to have been the case at the time the ensureFile() method was introduced in PR git-lfs#176, and has remained so ever since. We could attempt to remedy this situation by altering the ensureFile() method so it calls the clean() function. To do so it would need to simulate the conditions under which the function usually runs, specifically within the "clean" filter context where the function is expected to transform an input data stream into an output data stream. We would likely use a Discard structure from the "io" package of the standard Go library to simply discard the output from the clean() function, as we do not need to send it back to Git in the way the "git lfs filter-process" or "git lfs clean" commands do. However, we would have to add logic to the clean() function to guard against the case where the file in the working tree had different contents than those of the missing Git LFS object. Because the user may check out a Git reference in which a different file exists at the same path in the working tree, or may simply modify the file in the working tree independently, there is no guarantee that the file we pass through the Clean() method is identical to the one from which the missing Git LFS object was created. The original implementation of the ensureFile() function, although it did not fulfil its stated purpose, did include a check to verify that the SHA hash of the working tree file, as returned by the Clean() method, matched that of the missing object. This check was removed in commit 338ab40 of PR git-lfs#1812, which would likely have introduced a serious bug, except that the ensureFile() method never actually replaced any missing objects and so the removal of this check had no functional impact. While we could try to revise the ensureFile() method to operate as was originally intended, the advantages of such a change are relatively slim, and the disadvantages are several. Most obviously, it requires modifications to our clean() function to guard against the replacement of object files with incorrect data, something the other callers of the function do not need to be concerned about. That this is a concern at all is in turn due to the reasonable chance that a file found in the current working tree at a given path does not contain the identical data as that of an Git LFS object generated from another file previously located at the same path. As well, the fact that the ensureFile() method has never worked as designed, despite being repeatedly refactored and enhanced over ten years, suggests that its purpose is somewhat obscure and that the requisite logic is less intelligible than would be ideal. Users and developers expect push operations to involve the transfer of data but not the creation (or re-creation) of local data files, so the use of some of our "clean" filter code in such a context is not particularly intuitive. For all these reasons, we just remove the ensureFile() method entirely, which simplifies our handling of missing objects during upload transfer operations. Instead, we check for the presence of each object file we intend to push in the uploadTransfer() method of our uploadContext structure, and if a file is not found in the local storage directories, we flag it as missing, unless the "lfs.allowIncompletePush" configuration option is set to "true". We also use the IsNotExist() function from the "os" package in the Go standard library to ascertain whether an object file is missing, or if some other type of error prevents us from reading its state. This mirrors the more detailed checks performed on each object file during a push operation by the partitionTransfers() method of the TransferQueue structure in the "tq" package. One consequence of this change is that when an object to be uploaded is missing locally and is also not already present on the remote server, and when the "lfs.allowIncompletePush" Git configuration option is set to its default value of "false", we now always abandon a push operation after the remote server indicates that it expects the client to upload the object. This means that in the "push reject missing object (lfs.allowincompletepush default)*" tests in our t/t-push-failures-local.sh test script, we should now expect to find a trace log message output by the client stating that the push operation's batch queue has been stopped because an object is missing on both the local system and the remote server. We added this trace log message in a prior commit in this PR, and were able to insert checks for it in several other tests in our test suite, but only because those tests either did not create a file in the working tree at all, as in the case of the "pre-push reject missing object" test in our t/t-pre-push.sh script, or removed the file in the working tree that corresponded to the object file they removed from the .git/lfs/objects storage directories. As a result of the changes in this commit, we can also now simplify the three tests that performed this extra setup step, where they used to remove the file in the working tree which corresponded to the object file they removed from the local Git LFS storage directories. This step is no longer necessary to cause the client to abandon the push operation after the server indicates that it requires an upload of an object the client has determined is missing from the local system. Therefore we can remove these extra setup steps from both of the "push reject missing object (lfs.allowincompletepush false)*" tests in the t/t-push-failures-local.sh script, and from the "pre-push reject missing object (lfs.allowincompletepush default)" test in the t/t-pre-push.sh script.
chrisd8088
added a commit
to chrisd8088/git-lfs
that referenced
this pull request
Apr 3, 2025
The original version of the ensureFile() method of the uploadContext structure in our "commands" package was first introduced in commit 5d239d6 of PR git-lfs#176 in 2015, and has been refactored a number of times since then, but continues to be called for each object a push operation intends to upload. The method is documented as a function that checks whether a Git LFS object file exists in the local .git/lfs/objects storage directories for a given pointer, and if it does not, tries to replace the missing object file by passing the contents of the corresponding file in the working tree through our "clean" filter. The description of PR git-lfs#176 explains the function's purpose as follows, using the original pre-release name for the Git LFS project: If the .git/hawser/objects directory gets into a weird state (for example, if the user manually removed some files in there), this attempts to re-clean the objects based on the git repository file path. The code comments preceding the ensureFile() method also describe it in the same way, as do the notes in later PRs, such as PR git-lfs#2574, in which the "lfs.allowIncompletePush" configuration option was introduced, and PR git-lfs#3398, which refined the error message the Git LFS client reports during a push operation when an object is missing locally and is also not present on the remote server. However, the ensureFile() method has never actually replaced missing object files under any circumstances. It does check whether an object file is missing from the local storage directories, and if not, tests whether a file exists in the current working tree at the path of the Git LFS pointer associated with the object. If such a file exists, the method proceeds to run the Clean() method of the GitFilter structure in our "lfs" package on the file's contents. The Clean() method calculates the SHA-256 hash value of the file's contents and creates a Pointer structure containing this hash value, and also writes a copy of the file's data into a temporary file in the .git/lfs/tmp directory. It is then the responsibility of the caller to determine whether or not this temporary file should be moved into place in the .git/lfs/objects directory hierarchy. The only other caller of the Clean() method, besides the ensureFile() method, is the clean() function in the "commands" package, which is used by multiple Git LFS commands including the "git lfs clean" and "git lfs filter-process" plumbing commands, as well as the "git lfs migrate import" command. The clean() function performs several tasks after invoking the Clean() method. First, it checks whether the file processed by the method was found to contain a Git LFS pointer; if so, no further action is taken as we assume the file in the working tree has not been passed through our "smudge" filter, and we do not want to create another pointer which simply hashes and references the existing one. Next, the clean() function checks whether a file already exists in the local .git/lfs/objects storage directories at the location into which the function would otherwise expect to move the temporary file created by the Clean() method. If a file does exist in this location and has the same size as the temporary file, no further action is taken, as we assume it contains the same contents and does not need to be updated. Assuming neither of these checks causes the clean() function to return early, the function moves the temporary file created by the Clean() method into the expected location within the .git/lfs/objects directory hierarchy. Unfortunately, because the ensureFile() method invokes the Clean() method of the GitFilter structure but never performs any of the subsequent steps taken by the clean() function, it never recreates a missing Git LFS object from a file found in the working tree. This appears to have been the case at the time the ensureFile() method was introduced in PR git-lfs#176, and has remained so ever since. We could attempt to remedy this situation by altering the ensureFile() method so it calls the clean() function. To do so it would need to simulate the conditions under which the function usually runs, specifically within the "clean" filter context where the function is expected to transform an input data stream into an output data stream. We would likely use a Discard structure from the "io" package of the standard Go library to simply discard the output from the clean() function, as we do not need to send it back to Git in the way the "git lfs filter-process" or "git lfs clean" commands do. However, we would have to add logic to the clean() function to guard against the case where the file in the working tree had different contents than those of the missing Git LFS object. Because the user may check out a Git reference in which a different file exists at the same path in the working tree, or may simply modify the file in the working tree independently, there is no guarantee that the file we pass through the Clean() method is identical to the one from which the missing Git LFS object was created. The original implementation of the ensureFile() function, although it did not fulfil its stated purpose, did include a check to verify that the SHA hash of the working tree file, as returned by the Clean() method, matched that of the missing object. This check was removed in commit 338ab40 of PR git-lfs#1812, which would likely have introduced a serious bug, except that the ensureFile() method never actually replaced any missing objects and so the removal of this check had no functional impact. While we could try to revise the ensureFile() method to operate as was originally intended, the advantages of such a change are relatively slim, and the disadvantages are several. Most obviously, it requires modifications to our clean() function to guard against the replacement of object files with incorrect data, something the other callers of the function do not need to be concerned about. That this is a concern at all is in turn due to the reasonable chance that a file found in the current working tree at a given path does not contain the identical data as that of an Git LFS object generated from another file previously located at the same path. As well, the fact that the ensureFile() method has never worked as designed, despite being repeatedly refactored and enhanced over ten years, suggests that its purpose is somewhat obscure and that the requisite logic is less intelligible than would be ideal. Users and developers expect push operations to involve the transfer of data but not the creation (or re-creation) of local data files, so the use of some of our "clean" filter code in such a context is not particularly intuitive. For all these reasons, we just remove the ensureFile() method entirely, which simplifies our handling of missing objects during upload transfer operations. Instead, we check for the presence of each object file we intend to push in the uploadTransfer() method of our uploadContext structure, and if a file is not found in the local storage directories, we flag it as missing, unless the "lfs.allowIncompletePush" configuration option is set to "true". We also use the IsNotExist() function from the "os" package in the Go standard library to ascertain whether an object file is missing, or if some other type of error prevents us from reading its state. This mirrors the more detailed checks performed on each object file during a push operation by the partitionTransfers() method of the TransferQueue structure in the "tq" package. One consequence of this change is that when an object to be uploaded is missing locally and is also not already present on the remote server, and when the "lfs.allowIncompletePush" Git configuration option is set to its default value of "false", we now always abandon a push operation after the remote server indicates that it expects the client to upload the object. This means that in the "push reject missing object (lfs.allowincompletepush default)*" tests in our t/t-push-failures-local.sh test script, we should now expect to find a trace log message output by the client stating that the push operation's batch queue has been stopped because an object is missing on both the local system and the remote server. We added this trace log message in a prior commit in this PR, and were able to insert checks for it in several other tests in our test suite, but only because those tests either did not create a file in the working tree at all, as in the case of the "pre-push reject missing object" test in our t/t-pre-push.sh script, or removed the file in the working tree that corresponded to the object file they removed from the .git/lfs/objects storage directories. As a result of the changes in this commit, we can also now simplify the three tests that performed this extra setup step, where they used to remove the file in the working tree which corresponded to the object file they removed from the local Git LFS storage directories. This step is no longer necessary to cause the client to abandon the push operation after the server indicates that it requires an upload of an object the client has determined is missing from the local system. Therefore we can remove these extra setup steps from both of the "push reject missing object (lfs.allowincompletepush false)*" tests in the t/t-push-failures-local.sh script, and from the "pre-push reject missing object (lfs.allowincompletepush default)" test in the t/t-pre-push.sh script.
chrisd8088
added a commit
to chrisd8088/git-lfs
that referenced
this pull request
May 22, 2025
Our t/t-pre-push.sh and t/t-push-failures-local.sh test scripts include a number of tests which validate the behaviour of the "git lfs pre-push" and "git push" commands when Git LFS objects are missing or corrupt. While these tests are largely similar in their implementation, they vary in a number of formatting and implementation details unrelated to the specifics of the different conditions they simulate. These variations are artifacts of the evolution of our test suite over time; for example, the tests in the t/t-push-failures-local.sh script were refactored and collected from several earlier tests in commit 4d52e08 of PR git-lfs#3109, and the original versions of the tests in the t/t-pre-push.sh script were added incrementally in PRs git-lfs#447, git-lfs#2199, and git-lfs#2574. In a subsequent commit in this PR we expect to update the Git LFS client to remove some non-functional code which attempts to recreate missing Git LFS objects during push operations. In many cases this change will cause the client to report missing objects in an earlier stage of push operations than it does now. We also expect to reword the error message the client will output in such cases. Before we make these changes, we first revise the related tests in our test suite so they are as simple and similar as possible. This will ensure that when we update the Git LFS client we can clearly identify the changes that we need to make in our tests to accommodate the client's new behaviour. In the "pre-push with missing pointer not on server" test in the t/t-pre-push.sh script we remove the use of the "set +e" shell option, which is not necessary to avoid test failure. Although we expect the "git lfs pre-push" command to fail, its output is piped into the tee(1) command, and when the "set -e" option is in effect the shell will exit immediately if the last command in a pipeline returns a non-zero exit code, but not if other commands in the pipeline return such a code. We do, however, add a check that the "git lfs pre-push" command exits with a non-zero code by testing the appropriate PIPESTATUS array value, following the example of the other related tests. These other tests include the "pre-push with missing and present pointers (lfs.allowincompletepush true)" and "pre-push reject missing pointers (lfs.allowincompletepush default)" tests in the t/t-pre-push.sh script, plus the four tests in the t/t-push-failures-local.sh script. In all these tests we adjust the error messages generated in the case that a "git lfs pre-push" or "git push" command fails or succeeds unexpectedly. We rewrite these messages so they are consistent with each other and with those found in many of our other test scripts. Note that in the "push reject missing objects (lfs.allowincompletepush false)" test we also correct the message reported if the "git push" command were to succeed unexpectedly. At present, the message states that we expect the command to succeed, but we actually expect it to fail. Next, in the "push reject missing objects (lfs.allowincompletepush default)" and "push reject corrupt objects (lfs.allowincompletepush default)" tests, we update our checks of the "git push" command's exit code so that we confirm the command exits with a specific non-zero exit code of 1 rather than simply checking that its exit code is not zero. This change brings these checks into alignment with those made by the other related tests. Lastly, we remove and adjust some whitespace so that these tests are all formatted in a similar manner to each other.
chrisd8088
added a commit
to chrisd8088/git-lfs
that referenced
this pull request
May 22, 2025
Our t/t-pre-push.sh and t/t-push-failures-local.sh test scripts include a number of tests which validate the behaviour of the "git lfs pre-push" and "git push" commands when Git LFS objects are missing or corrupt. While these tests are largely similar in their implementation, they vary in a number of formatting and implementation details unrelated to the specifics of the different conditions they simulate. These variations are artifacts of the evolution of our test suite over time; for example, the tests in the t/t-push-failures-local.sh script were refactored and collected from several earlier tests in commit 4d52e08 of PR git-lfs#3109, and the original versions of the tests in the t/t-pre-push.sh script were added incrementally in PRs git-lfs#447, git-lfs#2199, and git-lfs#2574. In a subsequent commit in this PR we expect to update the Git LFS client to remove some non-functional code which attempts to recreate missing Git LFS objects during push operations. In many cases this change will cause the client to report missing objects in an earlier stage of push operations than it does now. We also expect to reword the error message the client will output in such cases. Before we make these changes, we first revise the related tests in our test suite so they are as simple and similar as possible. This will ensure that when we update the Git LFS client we can clearly identify the changes that we need to make in our tests to accommodate the client's new behaviour. In a pair of tests in t/t-pre-push.sh script, and in another pair of tests in the t/t-push-failures-local.sh script, the tests' initial setup code creates several Git LFS objects and then removes the object file in the .git/lfs/objects directory hierarchy for one of them. In each case, we can replace this code with a simple call to our delete_local_object() test helper function, which performs the equivalent action of removing an object's file from the internal Git LFS storage directories.
chrisd8088
added a commit
to chrisd8088/git-lfs
that referenced
this pull request
May 22, 2025
Our t/t-pre-push.sh and t/t-push-failures-local.sh test scripts include a number of tests which validate the behaviour of the "git lfs pre-push" and "git push" commands when Git LFS objects are missing or corrupt. While these tests are largely similar in their implementation, they vary in a number of formatting and implementation details unrelated to the specifics of the different conditions they simulate. These variations are artifacts of the evolution of our test suite over time; for example, the tests in the t/t-push-failures-local.sh script were refactored and collected from several earlier tests in commit 4d52e08 of PR git-lfs#3109, and the original versions of the tests in the t/t-pre-push.sh script were added incrementally in PRs git-lfs#447, git-lfs#2199, and git-lfs#2574. In a subsequent commit in this PR we expect to update the Git LFS client to remove some non-functional code which attempts to recreate missing Git LFS objects during push operations. In many cases this change will cause the client to report missing objects in an earlier stage of push operations than it does now. We also expect to reword the error message the client will output in such cases. Before we make these changes, we first revise the related tests in our test suite so they are as simple and similar as possible. This will ensure that when we update the Git LFS client we can clearly identify the changes that we need to make in our tests to accommodate the client's new behaviour. A pair of tests in the t/t-push-failures-local.sh script, specifically the "push reject missing objects (lfs.allowincompletepush default)" and "push reject corrupt objects (lfs.allowincompletepush default)" tests, perform their setup of Git LFS objects and Git commits in a different sequence than the other tests in the same script. In order to align the code in all these tests as closely as possible, we simply revise the setup steps of the last two tests in the t/t-push-failures-local.sh script so they follow the same pattern as those of the other tests. This change has no functional effect; the only notable detail is that the tests now create all their Git LFS objects in a single commit instead of using a separate commit to create each object. As well, we reorder the lists of files we pass to the "git add" command in the first two tests in the t/t-push-failures-local.sh script so they now follow the same pattern as those of the other tests in both that script and in the t/t-pre-push.sh script. We also adjust the commit message used when creating Git LFS objects in a pair of tests in the t/t-pre-push.sh script, specifically the "pre-push with missing and present pointers (lfs.allowincompletepush true)" and "pre-push reject missing pointers (lfs.allowincompletepush default)" tests, so they are equivalent to those used in the related tests in the t/t-push-failures-local.sh script.
chrisd8088
added a commit
to chrisd8088/git-lfs
that referenced
this pull request
May 22, 2025
Our t/t-pre-push.sh test script includes a number of tests which validate the behaviour of the "git lfs pre-push" command when Git LFS objects are present, missing, or corrupt. While these tests are largely similar in their implementation, they vary in a number of formatting and implementation details unrelated to the specifics of the different conditions they simulate. These variations are artifacts of the evolution of our test suite over time; for example, the original versions of the tests in the t/t-pre-push.sh script were added incrementally in PRs git-lfs#447, git-lfs#2199, and git-lfs#2574. In a subsequent commit in this PR we expect to update the Git LFS client to remove some non-functional code which attempts to recreate missing Git LFS objects during push operations. In many cases this change will cause the client to report missing objects in an earlier stage of push operations than it does now. We also expect to reword the error message the client will output in such cases. Before we make these changes, we first revise the related tests in our test suite so they are as simple and similar as possible. This will ensure that when we update the Git LFS client we can clearly identify the changes that we need to make in our tests to accommodate the client's new behaviour. The "pre-push with existing file" and "pre-push with existing pointer" tests in the t/t-pre-push.sh script check similar conditions, with the only difference betwen that the latter test does not invoke the Git LFS client's "clean" filter and instead creates an object file in the .git/lfs/objects storage directories directly. Like the "pre-push with existing file" test, though, the "pre-push with existing pointer" test then pushes the new Git LFS object to the remote server and expect the object to be transferred successfully. However, unlike the "pre-push with existing file" test, the "pre-push with existing pointer" test does not confirm that the object exists in the remote server's storage, so we add that confirmation step now using an assert_server_object() test helper function call identical to the one made at the end of the "pre-push with existing file" test. Another pair of tests, the "pre-push with missing pointer not on server" test in the t/t-pre-push.sh script and the "push reject missing objects (lfs.allowincompletepush default)" test in the t/t-push-failures-local.sh script, also check similar conditions, with the difference that the former test never creates an object file in the .git/lfs/objects storage directories, while the latter allows an object file to be created by the "clean" filter, and then deletes it. In both cases, though, they expect the absent object file to cause a failure to be reported by the "git lfs pre-push" or "git push" commands, respectively. However, the "pre-push with missing pointer not on server" test, unlike the "push reject missing objects (lfs.allowincompletepush default)" test, does not call the refute_server_object() helper function to confirm the that the object does not exist in the remote server's storage after the push operation runs, and also does not check for the message "LFS upload failed" in the output of the push operation, so we add those checks now.
chrisd8088
added a commit
to chrisd8088/git-lfs
that referenced
this pull request
May 22, 2025
Our t/t-pre-push.sh test script includes a number of tests which validate the behaviour of the "git lfs pre-push" command when Git LFS objects are present, missing, or corrupt. While these tests are largely similar in their implementation, they vary in a number of formatting and implementation details unrelated to the specifics of the different conditions they simulate. These variations are artifacts of the evolution of our test suite over time; for example, the original versions of the tests in the t/t-pre-push.sh script were added incrementally in PRs git-lfs#447, git-lfs#2199, and git-lfs#2574. In a subsequent commit in this PR we expect to update the Git LFS client to remove some non-functional code which attempts to recreate missing Git LFS objects during push operations. In many cases this change will cause the client to report missing objects in an earlier stage of push operations than it does now. We also expect to reword the error message the client will output in such cases. Before we make these changes, we first revise the related tests in our test suite so they are as simple and similar as possible. This will ensure that when we update the Git LFS client we can clearly identify the changes that we need to make in our tests to accommodate the client's new behaviour. Some of the tests in the t/t-pre-push.sh script which verify the behaviour of the "git lfs pre-push" command when Git LFS objects are present or missing use an older form of the common initial test repository setup steps, where the basename(1) command is called to generate a prefix from the script's name for each test repository's name on the remote server, but then a fixed string without that prefix is used when cloning the repository from the remote. This older approach to establishing a test repository is not uncommon in our test suite and is entirely functional. However, we expect to rename a number of these particular tests in the t/t-pre-push.sh script in a subsequent commit in this PR, and also revise the repository names used by the tests at the same time, to help make the tests as consistent with each other as possible. Therefore, before renaming these tests and their repositories, we adjust their initial setup steps to use follow the same pattern as, for example, the related tests in the same script and also those found in the t/t-push-failures-local.sh script.
chrisd8088
added a commit
to chrisd8088/git-lfs
that referenced
this pull request
May 22, 2025
Our t/t-pre-push.sh and t/t-push-failures-local.sh test scripts include a number of tests which validate the behaviour of the "git lfs pre-push" and "git push" commands when Git LFS objects are present, missing, or corrupt. While these tests are largely similar in their implementation, they vary in a number of formatting and implementation details unrelated to the specifics of the different conditions they simulate. These variations are artifacts of the evolution of our test suite over time; for example, the tests in the t/t-push-failures-local.sh script were refactored and collected from several earlier tests in commit 4d52e08 of PR git-lfs#3109, and the original versions of the tests in the t/t-pre-push.sh script were added incrementally in PRs git-lfs#447, git-lfs#2199, and git-lfs#2574. In a subsequent commit in this PR we expect to update the Git LFS client to remove some non-functional code which attempts to recreate missing Git LFS objects during push operations. In many cases this change will cause the client to report missing objects in an earlier stage of push operations than it does now. We also expect to reword the error message the client will output in such cases. Before we make these changes, we first revise the related tests in our test suite so they are as simple and similar as possible. This will ensure that when we update the Git LFS client we can clearly identify the changes that we need to make in our tests to accommodate the client's new behaviour. In previous commits in this PR we have revised and reformatted these tests to increase their consistency with each other. One additional adjustment we make now to further increase this consistency, and to provide greater clarity as to each test's purpose, is to rename both the tests and test repositories they create and clone. Note that since these tests typically only create a single missing or corrupt object, we now use the singular word "object" rather than "objects" in the test and repository names.
chrisd8088
added a commit
to chrisd8088/git-lfs
that referenced
this pull request
May 22, 2025
The original version of the ensureFile() method of the uploadContext structure in our "commands" package was first introduced in commit 5d239d6 of PR git-lfs#176 in 2015, and has been refactored a number of times since then, but continues to be called for each object a push operation intends to upload. The method is documented as a function that checks whether a Git LFS object file exists in the local .git/lfs/objects storage directories for a given pointer, and if it does not, tries to replace the missing object file by passing the contents of the corresponding file in the working tree through our "clean" filter. The description of PR git-lfs#176 explains the function's purpose as follows, using the original pre-release name for the Git LFS project: If the .git/hawser/objects directory gets into a weird state (for example, if the user manually removed some files in there), this attempts to re-clean the objects based on the git repository file path. The code comments preceding the ensureFile() method also describe it in the same way, as do the notes in later PRs, such as PR git-lfs#2574, in which the "lfs.allowIncompletePush" configuration option was introduced, and PR git-lfs#3398, which refined the error message the Git LFS client reports during a push operation when an object is missing locally and is also not present on the remote server. However, the ensureFile() method has never actually replaced missing object files under any circumstances. It does check whether an object file is missing from the local storage directories, and if not, tests whether a file exists in the current working tree at the path of the Git LFS pointer associated with the object. If such a file exists, the method proceeds to run the Clean() method of the GitFilter structure in our "lfs" package on the file's contents. The Clean() method calculates the SHA-256 hash value of the file's contents and creates a Pointer structure containing this hash value, and also writes a copy of the file's data into a temporary file in the .git/lfs/tmp directory. It is then the responsibility of the caller to determine whether or not this temporary file should be moved into place in the .git/lfs/objects directory hierarchy. The only other caller of the Clean() method, besides the ensureFile() method, is the clean() function in the "commands" package, which is used by multiple Git LFS commands including the "git lfs clean" and "git lfs filter-process" plumbing commands, as well as the "git lfs migrate import" command. The clean() function performs several tasks after invoking the Clean() method. First, it checks whether the file processed by the method was found to contain a Git LFS pointer; if so, no further action is taken as we assume the file in the working tree has not been passed through our "smudge" filter, and we do not want to create another pointer which simply hashes and references the existing one. Next, the clean() function checks whether a file already exists in the local .git/lfs/objects storage directories at the location into which the function would otherwise expect to move the temporary file created by the Clean() method. If a file does exist in this location and has the same size as the temporary file, no further action is taken, as we assume it contains the same contents and does not need to be updated. Assuming neither of these checks causes the clean() function to return early, the function moves the temporary file created by the Clean() method into the expected location within the .git/lfs/objects directory hierarchy. Unfortunately, because the ensureFile() method invokes the Clean() method of the GitFilter structure but never performs any of the subsequent steps taken by the clean() function, it never recreates a missing Git LFS object from a file found in the working tree. This appears to have been the case at the time the ensureFile() method was introduced in PR git-lfs#176, and has remained so ever since. We could attempt to remedy this situation by altering the ensureFile() method so it calls the clean() function. To do so it would need to simulate the conditions under which the function usually runs, specifically within the "clean" filter context where the function is expected to transform an input data stream into an output data stream. We would likely use a Discard structure from the "io" package of the standard Go library to simply discard the output from the clean() function, as we do not need to send it back to Git in the way the "git lfs filter-process" or "git lfs clean" commands do. However, we would have to add logic to the clean() function to guard against the case where the file in the working tree had different contents than those of the missing Git LFS object. Because the user may check out a Git reference in which a different file exists at the same path in the working tree, or may simply modify the file in the working tree independently, there is no guarantee that the file we pass through the Clean() method is identical to the one from which the missing Git LFS object was created. The original implementation of the ensureFile() function, although it did not fulfil its stated purpose, did include a check to verify that the SHA hash of the working tree file, as returned by the Clean() method, matched that of the missing object. This check was removed in commit 338ab40 of PR git-lfs#1812, which would likely have introduced a serious bug, except that the ensureFile() method never actually replaced any missing objects and so the removal of this check had no functional impact. While we could try to revise the ensureFile() method to operate as was originally intended, the advantages of such a change are relatively slim, and the disadvantages are several. Most obviously, it requires modifications to our clean() function to guard against the replacement of object files with incorrect data, something the other callers of the function do not need to be concerned about. That this is a concern at all is in turn due to the reasonable chance that a file found in the current working tree at a given path does not contain the identical data as that of an Git LFS object generated from another file previously located at the same path. As well, the fact that the ensureFile() method has never worked as designed, despite being repeatedly refactored and enhanced over ten years, suggests that its purpose is somewhat obscure and that the requisite logic is less intelligible than would be ideal. Users and developers expect push operations to involve the transfer of data but not the creation (or re-creation) of local data files, so the use of some of our "clean" filter code in such a context is not particularly intuitive. For all these reasons, we just remove the ensureFile() method entirely, which simplifies our handling of missing objects during upload transfer operations. Instead, we check for the presence of each object file we intend to push in the uploadTransfer() method of our uploadContext structure, and if a file is not found in the local storage directories, we flag it as missing, unless the "lfs.allowIncompletePush" configuration option is set to "true". We also use the IsNotExist() function from the "os" package in the Go standard library to ascertain whether an object file is missing, or if some other type of error prevents us from reading its state. This mirrors the more detailed checks performed on each object file during a push operation by the partitionTransfers() method of the TransferQueue structure in the "tq" package. One consequence of this change is that when an object to be uploaded is missing locally and is also not already present on the remote server, and when the "lfs.allowIncompletePush" Git configuration option is set to its default value of "false", we now always abandon a push operation after the remote server indicates that it expects the client to upload the object. This means that in the "push reject missing object (lfs.allowincompletepush default)*" tests in our t/t-push-failures-local.sh test script, we should now expect to find a trace log message output by the client stating that the push operation's batch queue has been stopped because an object is missing on both the local system and the remote server. We added this trace log message in a prior commit in this PR, and were able to insert checks for it in several other tests in our test suite, but only because those tests either did not create a file in the working tree at all, as in the case of the "pre-push reject missing object" test in our t/t-pre-push.sh script, or removed the file in the working tree that corresponded to the object file they removed from the .git/lfs/objects storage directories. As a result of the changes in this commit, we can also now simplify the three tests that performed this extra setup step, where they used to remove the file in the working tree which corresponded to the object file they removed from the local Git LFS storage directories. This step is no longer necessary to cause the client to abandon the push operation after the server indicates that it requires an upload of an object the client has determined is missing from the local system. Therefore we can remove these extra setup steps from both of the "push reject missing object (lfs.allowincompletepush false)*" tests in the t/t-push-failures-local.sh script, and from the "pre-push reject missing object (lfs.allowincompletepush default)" test in the t/t-pre-push.sh script.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This implements two fixes to make LFS pushes better:
lfs.allowincompletepushis now on by default. It's easier to push repositories when you're missing LFS content. It still warns about those missing files, but at least you're not blocked.ensureFile()helper wasn't obeying thelfs.allowincompletepush. That helper detects cases where a.git/lfs/objects/{oid}file is missing, and attempts to fill it in from the file in the working directory. Fails on any modified or deleted files in the current HEAD, of course.