Conversation
3dc6a37 to
24be525
Compare
775121e to
9114a57
Compare
2d5aa0b to
17befd5
Compare
0039c02 to
c99b843
Compare
chrisd8088
left a comment
There was a problem hiding this comment.
This is an amazing contribution, thank you so much! It will be greatly appreciated by a significant portion of the Git LFS community, and it will be a major new feature that distinguishes the 3.0 release. 🚀
I hope my comments and handful of suggestions are clear; please let me know if any are not!
Thank you again for working on this so thoroughly and diligently over many months. ❤️
|
Okay, I've gone through all of the comments, including the ones I didn't reply to directly, and I believe I've addressed all of the concerns unless I've specifically said I didn't think it needed to be addressed. For the moment, I've put in some squash and fixup patches to help you in a second round. The only changes to existing patches are those to resolve conflicts from the insertion of patches earlier into the series (which was required in order to maintain bisectable commits). Once you're happy with these changes as they stand, I'll just squash them down and then merge. |
chrisd8088
left a comment
There was a problem hiding this comment.
Whew! Thank you for the many, many revisions and responses! ❤️
I think I have marked everything that looked resolved as such, leaving only a couple of comments from my original review with additional notes or questions, and I had only a few extras turn up and they should be in the review now too.
I confess I did not go through the whole body of changes from top to bottom; I thought I'd wait on that and do it as part of a final pass.
Thanks again and please do let me know if anything of my comments are unclear, or just plain nonsensical or wrong. 😄
9c5f49c to
b7ad928
Compare
|
I think I've addressed all of your suggestions; I took all but one, where I think one of us may have a misunderstanding, which could very well be me (and if so, please say so). I've rebased onto the main branch, but left my changes as fixup commits at the end for easy reviewing. As mentioned before, once we're happy with where things are, I'll squash them down before merging. |
The findStandaloneTransfer() function of the "tq" package was updated in commit 28f9b73 of PR git-lfs#3905 to determine whether to use the built-in file standalone transfer adapter based on the actual endpoint URL, taking into account settings like "lfs.url" which override the endpoint URL we would otherwise construct from the current remote's URL. However, the code still checked the "lfs.<url>.standaloneTransferAgent" setting using the URL constructed from the remote's URL, and so called both the Endpoint() and RemoteEndpoint() methods of the EndpointFinder interface. Then in commit 594f8e3 of PR git-lfs#4446, when the pure SSH transfer protocol was introduced, the call to the RemoteEndpoint() method was replaced with Endpoint(), which was done with the intent that it would honour "lfs.url" settings as well. However, this resulted in duplicate calls to Endpoint(), so we can remove one of them now. Note that the change in commit 594f8e3 of PR git-lfs#4446 implies that the URL used when looking for "lfs.<url>.standaloneTransferAgent" settings is the URL as fully determined by settings such as "lfs.url" that override what would be otherwise constructed based on the remote's URL. We expect to add documentation to this effect in a subsequent PR.
|
Is there any example usage of LFS download/upload only via SSH? And do GitHub, GitLab and other mainstream Git services support it? |
|
How do I set this up? Just add |
@shamefulCake1 -- I would recommend opening a new discussion if you're interested in this topic, as this PR has been merged and closed for a few years now. Or you might find the answers you're looking for in an existing discussion like #5877. In brief, the Git LFS client as of v3.0.0 doesn't need additional tools to perform the client-side portions of the pure SSH-based Git LFS transfer protocol. However, not all Git LFS hosting services offer support for the server-side portions of the protocol yet, so the answer to your question depends a lot on what service you are using. GitLab supports the protocol now, and I believe Gitea has added support as well, but GitHub has not. If you are setting up your own Git LFS hosting service, you could use the |
|
Thank you, it seems that #5877 is indeed what I need. I recognise that this PR is several years old, but it's still the first one to come up on Google and DDG.
I'm not doing anything even remotely so complicated. I'm just trying to clone my a git repo from my workstation to my my laptop over ssh. Git is a distributed VCS, and it doesn't need a server. |
With PR git-lfs#4446 we defined a new SSHTransfer structure and set of methods on that structure in our ssh/connection.go source file, and used the name "tr" for the receiver variables of those methods. Later, in commit 5dbbf13 of PR git-lfs#5674, we imported our "tr" message translation package into the same source file, in order to format and output an error message from the startConnection() function. As it happens this function is not one of the methods of the SSHTransfer structure, and so there was no conflict with the methods' "tr" receiver variables. However, in subsequent commits in this PR we expect to revise one of the SSHTransfer structure's methods to output an error message, and will want to use the Get() method of the "tr" package's global Tr variable, which we can not do with the current name of the "tr" receiver variable, as it masks the "tr" package name within the method's scope. Therefore we now rename all our "tr" receiver variables to "st" in the ssh/connection.go source file, so as to avoid any namespace conflicts with our "tr" package.
Since PR git-lfs#4446 we have output several trace log messages when attempting to instantiate a new SSHTransfer structure in the SSHTransfer() method of the Client structure in our "lfsapi" package. This method takes "operation" and "remote" string arguments, and the returned SSHTransfer structure is specific to those values. As we may potentially call this method several times in a single process, with different arguments, we add the "operation" and "remote" variables to our trace log messages, which will help us when we analyze logs and want to distinguish the different contexts in which the SSHTransfer() method was called.
In commit 691de51 of PR git-lfs#4446 we added the "batch transfers with ssh endpoint (git-lfs-transfer)" test to our t/t-batch-transfer.sh test script in order to validate that the new SSH object transfer protocol for Git LFS was operating as expected. This test only creates a single test object, and so does not cause the Git LFS client to create multiple SSH connections during the batch object transfer phase. For this reason, no SSH connections are created with the ControlMaster option set to "no". As described in a prior commit in this PR, our lfs-ssh-echo test utility program had a bug which prevented it from accepting SSH connections with the ControlMaster option set to "no", so the "batch transfers with ssh endpoint (git-lfs-transfer)" test would have failed if it tried to push more than a single object. Such a test would also fail if it tried to fetch more than a single object, due to the separate bug described in git-lfs#5880, where we returned a nil from the Connection() method of the SSHTransfer structure in our "ssh" package when we had closed all the SSH connections, but then tried to reference that nil pointer in the batchInternal() method of the SSHBatchClient structure in the "tq" package, causing a Go panic condition. In prior commits in this PR we resolved the first problem with the lfs-ssh-echo test utility, and adjusted the Connection() method of the SSHTransfer structure to return a non-nil error when the requested connection number exceeds the maximum number of permitted SSH connections, which covers the case where all the connections have already been closed. Therefore, we can now add additional tests of the SSH object transfer protocol which push multiple objects and then fetch them. In these tests, we use the new assert_remote_object() assertion function that we added in a prior commit in this PR to confirm that the objects we push are all written to the "remote" repository. As well, we check that the expected number of SSH connections are reported in the trace logs from both the push and clone operations, and that some connections are made with the ControlMaster option set to "no" as well as "yes". One of our new tests leaves the maximum number of concurrent object transfers set to the default value of 8, so that all three of the objects we create in the test are pushed or fetched in the same batch. This requires that the Git LFS client establish three separate SSH connections, two of which should have the ControlMaster option set to "no", and the first of which should have the option set to "yes". The second of our new tests sets the maximum number of concurrent transfers to two, so we expect to see only two SSH connections established, just the first of which will have the ControlMaster option set to "yes". Since there are only two concurrent connections, we can assume the third object was transferred in a second batch. (Note that the Git LFS client, at present, always creates an extra SSH connection with the ControlMaster option set to "yes" when pushing objects over the SSH object transfer protocol. It uses this extra connection exclusively to perform locking commands. Our tests therefore expect this extra connection to be reported in the push trace logs.) Next, we expand the existing "batch transfers with ssh endpoint (git-lfs-transfer)" test to perform the same set of checks as our new tests, bringing the tests into close alignment with each other. We again use our new assert_remote_object() function to confirm that the single object pushed in this test is written to the "remote" repository. We also check that only a single SSH connection is reported in the trace logs (plus the extra one used for locking commands during the push operation). Finally, we make some small adjustments to other tests in the same test script, so as to remove unnecessary redirections of stderr, and to add one extra check to our "batch transfers with ssh endpoint (git-lfs-authenticate)" test, just to confirm that an object was successfully pushed over HTTP to our lfstest-gitserver test helper program following the initial authorization handshake over SSH.
With PR git-lfs#4446 we defined a new SSHTransfer structure and set of methods on that structure in our ssh/connection.go source file, and used the name "tr" for the receiver variables of those methods. Later, in commit 5dbbf13 of PR git-lfs#5674, we imported our "tr" message translation package into the same source file, in order to format and output an error message from the startConnection() function. As it happens this function is not one of the methods of the SSHTransfer structure, and so there was no conflict with the methods' "tr" receiver variables. However, in subsequent commits in this PR we expect to revise one of the SSHTransfer structure's methods to output an error message, and will want to use the Get() method of the "tr" package's global Tr variable, which we can not do with the current name of the "tr" receiver variable, as it masks the "tr" package name within the method's scope. Therefore we now rename all our "tr" receiver variables to "st" in the ssh/connection.go source file, so as to avoid any namespace conflicts with our "tr" package.
Since PR git-lfs#4446 we have output several trace log messages when attempting to instantiate a new SSHTransfer structure in the SSHTransfer() method of the Client structure in our "lfsapi" package. This method takes "operation" and "remote" string arguments, and the returned SSHTransfer structure is specific to those values. As we may potentially call this method several times in a single process, with different arguments, we add the "operation" and "remote" variables to our trace log messages, which will help us when we analyze logs and want to distinguish the different contexts in which the SSHTransfer() method was called.
In commit 691de51 of PR git-lfs#4446 we added the "batch transfers with ssh endpoint (git-lfs-transfer)" test to our t/t-batch-transfer.sh test script in order to validate that the new SSH object transfer protocol for Git LFS was operating as expected. This test only creates a single test object, and so does not cause the Git LFS client to establish multiple SSH connections during the batch object transfer phase. For this reason, no SSH connections are started with the ControlMaster option set to "no". As described in a prior commit in this PR, our lfs-ssh-echo test utility program had a bug which prevented it from accepting SSH connections with the ControlMaster option set to "no", so the "batch transfers with ssh endpoint (git-lfs-transfer)" test would have failed if it tried to push more than a single object. Such a test would also fail if it tried to fetch more than a single object, due to the bug described in git-lfs#5880. Specifically, we returned a nil from the Connection() method of the SSHTransfer structure in our "ssh" package when we had closed all the SSH connections, but then sometimes tried to reference that nil pointer in the batchInternal() method of the SSHBatchClient structure in the "tq" package, causing a Go panic condition. In prior commits in this PR we resolved these problems, both the bug in the lfs-ssh-echo test utility and the bad return value from the Connection() method of the SSHTransfer structure in the Git LFS client code. In the case of the latter issue, we adjusted the Connection() method to return a non-nil error when the requested connection number exceeds the maximum number of permitted SSH connections, which also covers the case where all the connections have already been closed. Therefore we can now add additional tests of the SSH object transfer protocol which push multiple objects and then fetch them. In these tests we use the new assert_remote_object() assertion function that we added in a prior commit in this PR to confirm that the objects we push are all written to the "remote" repository. As well, we check that the expected number of SSH connections are reported in the trace logs from both the push and clone operations, and that some connections are made with the ControlMaster option set to "no" as well as "yes". The first of our new tests leaves the maximum number of concurrent object transfers set to the default value of 8, so that all three of the objects we create in the test are pushed or fetched in the same batch. This requires that the Git LFS client establish three separate SSH connections, two of which should have the ControlMaster option set to "no", and the first of which should have the option set to "yes". The second of our new tests sets the maximum number of concurrent transfers to two, so we expect to see only two SSH connections established, the first of which should have the ControlMaster option set to "yes". Since there are only two concurrent connections, we can assume the third object was transferred in a second batch. (Note that the Git LFS client, at present, always creates an extra SSH connection with the ControlMaster option set to "yes" when pushing objects over the SSH object transfer protocol. It uses this extra connection exclusively to perform locking commands. Our tests therefore expect this extra connection to be reported in the push trace logs, and we include comments to explain why the expected connection totals are one higher than might otherwise be the case.) Next, we expand the existing "batch transfers with ssh endpoint (git-lfs-transfer)" test to perform the same set of checks as our new tests, bringing the tests into close alignment with each other. We again use the assert_remote_object() function to confirm that the single object pushed in this test is written to the "remote" repository. We also check that only a single SSH connection is reported in the trace logs (plus the extra one used for locking commands during the push operation). Finally, we make a few small adjustments to other tests in the same test script, removing some unnecessary redirections of stderr, and adding one extra check to our "batch transfers with ssh endpoint (git-lfs-authenticate)" test which just confirms that an object was successfully pushed over HTTP to our lfstest-gitserver test helper program, following an initial authorization handshake over SSH.
In commit 691de51 of PR git-lfs#4446 we added the "batch transfers with ssh endpoint (git-lfs-transfer)" test to our t/t-batch-transfer.sh test script in order to validate that the new SSH object transfer protocol for Git LFS was operating as expected. This test only creates a single test object, and so does not cause the Git LFS client to establish multiple SSH connections during the batch object transfer phase. For this reason, no SSH connections are started with the ControlMaster option set to "no". As described in a prior commit in this PR, our lfs-ssh-echo test utility program had a bug which prevented it from accepting SSH connections with the ControlMaster option set to "no", so the "batch transfers with ssh endpoint (git-lfs-transfer)" test would have failed if it tried to push more than a single object. Such a test would also fail if it tried to fetch more than a single object, due to the bug described in git-lfs#5880. Specifically, we returned a nil from the Connection() method of the SSHTransfer structure in our "ssh" package when we had closed all the SSH connections, but then sometimes tried to reference that nil pointer in the batchInternal() method of the SSHBatchClient structure in the "tq" package, causing a Go panic condition. In prior commits in this PR we resolved these problems, both the bug in the lfs-ssh-echo test utility and the bad return value from the Connection() method of the SSHTransfer structure in the Git LFS client code. In the case of the latter issue, we adjusted the Connection() method to return a non-nil error when the requested connection number exceeds the maximum number of permitted SSH connections, which also covers the case where all the connections have already been closed. Therefore we can now add additional tests of the SSH object transfer protocol which push multiple objects and then fetch them. In these tests we use the new assert_remote_object() assertion function that we added in a prior commit in this PR to confirm that the objects we push are all written to the "remote" repository. As well, we check that the expected number of SSH connections are reported in the trace logs from both the push and clone operations, and that some connections are made with the ControlMaster option set to "no" as well as "yes". The first of our new tests leaves the maximum number of concurrent object transfers set to the default value of 8, so that all three of the objects we create in the test are pushed or fetched in the same batch. This requires that the Git LFS client establish three separate SSH connections, two of which should have the ControlMaster option set to "no", and the first of which should have the option set to "yes". The second of our new tests sets the maximum number of concurrent transfers to two, so we expect to see only two SSH connections established, the first of which should have the ControlMaster option set to "yes". Since there are only two concurrent connections, we can assume the third object was transferred in a second batch. (Note that the Git LFS client, at present, always creates an extra SSH connection with the ControlMaster option set to "yes" when pushing objects over the SSH object transfer protocol. It uses this extra connection exclusively to perform locking commands. Our tests therefore expect this extra connection to be reported in the push trace logs, and we include comments to explain why the expected connection totals are one higher than might otherwise be the case.) Next, we expand the existing "batch transfers with ssh endpoint (git-lfs-transfer)" test to perform the same set of checks as our new tests, bringing the tests into close alignment with each other. We again use the assert_remote_object() function to confirm that the single object pushed in this test is written to the "remote" repository. We also check that only a single SSH connection is reported in the trace logs (plus the extra one used for locking commands during the push operation). Finally, we make a few small adjustments to other tests in the same test script, removing some unnecessary redirections of stderr, and adding one extra check to our "batch transfers with ssh endpoint (git-lfs-authenticate)" test which just confirms that an object was successfully pushed over HTTP to our lfstest-gitserver test helper program, following an initial authorization handshake over SSH.
With PR git-lfs#4446 we defined a new SSHTransfer structure and set of methods on that structure in our ssh/connection.go source file, and used the name "tr" for the receiver variables of those methods. Later, in commit 5dbbf13 of PR git-lfs#5674, we imported our "tr" message translation package into the same source file, in order to format and output an error message from the startConnection() function. As it happens this function is not one of the methods of the SSHTransfer structure, and so there was no conflict with the methods' "tr" receiver variables. However, in subsequent commits in this PR we expect to revise one of the SSHTransfer structure's methods to output an error message, and will want to use the Get() method of the "tr" package's global Tr variable, which we can not do with the current name of the "tr" receiver variable, as it masks the "tr" package name within the method's scope. Therefore we now rename all our "tr" receiver variables to "st" in the ssh/connection.go source file, so as to avoid any namespace conflicts with our "tr" package.
Since PR git-lfs#4446 we have output several trace log messages when attempting to instantiate a new SSHTransfer structure in the SSHTransfer() method of the Client structure in our "lfsapi" package. This method takes "operation" and "remote" string arguments, and the returned SSHTransfer structure is specific to those values. As we may potentially call this method several times in a single process, with different arguments, we add the "operation" and "remote" variables to our trace log messages, which will help us when we analyze logs and want to distinguish the different contexts in which the SSHTransfer() method was called.
In commit 691de51 of PR git-lfs#4446 we added the "batch transfers with ssh endpoint (git-lfs-transfer)" test to our t/t-batch-transfer.sh test script in order to validate that the new SSH object transfer protocol for Git LFS was operating as expected. This test only creates a single test object, and so does not cause the Git LFS client to establish multiple SSH connections during the batch object transfer phase. For this reason, no SSH connections are started with the ControlMaster option set to "no". As described in a prior commit in this PR, our lfs-ssh-echo test utility program had a bug which prevented it from accepting SSH connections with the ControlMaster option set to "no", so the "batch transfers with ssh endpoint (git-lfs-transfer)" test would have failed if it tried to push more than a single object. Such a test would also fail if it tried to fetch more than a single object, due to the bug described in git-lfs#5880. Specifically, we returned a nil from the Connection() method of the SSHTransfer structure in our "ssh" package when we had closed all the SSH connections, but then sometimes tried to reference that nil pointer in the batchInternal() method of the SSHBatchClient structure in the "tq" package, causing a Go panic condition. In prior commits in this PR we resolved these problems, both the bug in the lfs-ssh-echo test utility and the bad return value from the Connection() method of the SSHTransfer structure in the Git LFS client code. In the case of the latter issue, we adjusted the Connection() method to return a non-nil error when the requested connection number exceeds the maximum number of permitted SSH connections, which also covers the case where all the connections have already been closed. Therefore we can now add additional tests of the SSH object transfer protocol which push multiple objects and then fetch them. In these tests we use the new assert_remote_object() assertion function that we added in a prior commit in this PR to confirm that the objects we push are all written to the "remote" repository. As well, we check that the expected number of SSH connections are reported in the trace logs from both the push and clone operations, and that some connections are made with the ControlMaster option set to "no" as well as "yes". The first of our new tests leaves the maximum number of concurrent object transfers set to the default value of 8, so that all three of the objects we create in the test are pushed or fetched in the same batch. This requires that the Git LFS client establish three separate SSH connections, two of which should have the ControlMaster option set to "no", and the first of which should have the option set to "yes". The second of our new tests sets the maximum number of concurrent transfers to two, so we expect to see only two SSH connections established, the first of which should have the ControlMaster option set to "yes". Since there are only two concurrent connections, we can assume the third object was transferred in a second batch. (Note that the Git LFS client, at present, always creates an extra SSH connection with the ControlMaster option set to "yes" when pushing objects over the SSH object transfer protocol. It uses this extra connection exclusively to perform locking commands. Our tests therefore expect this extra connection to be reported in the push trace logs, and we include comments to explain why the expected connection totals are one higher than might otherwise be the case.) Next, we expand the existing "batch transfers with ssh endpoint (git-lfs-transfer)" test to perform the same set of checks as our new tests, bringing the tests into close alignment with each other. We again use the assert_remote_object() function to confirm that the single object pushed in this test is written to the "remote" repository. We also check that only a single SSH connection is reported in the trace logs (plus the extra one used for locking commands during the push operation). Finally, we make a few small adjustments to other tests in the same test script, removing some unnecessary redirections of stderr, and adding one extra check to our "batch transfers with ssh endpoint (git-lfs-authenticate)" test which just confirms that an object was successfully pushed over HTTP to our lfstest-gitserver test helper program, following an initial authorization handshake over SSH.
With PR git-lfs#4446 we defined a new SSHTransfer structure and set of methods on that structure in our ssh/connection.go source file, and used the name "tr" for the receiver variables of those methods. Later, in commit 5dbbf13 of PR git-lfs#5674, we imported our "tr" message translation package into the same source file, in order to format and output an error message from the startConnection() function. As it happens this function is not one of the methods of the SSHTransfer structure, and so there was no conflict with the methods' "tr" receiver variables. However, in subsequent commits in this PR we expect to revise one of the SSHTransfer structure's methods to output an error message, and will want to use the Get() method of the "tr" package's global Tr variable, which we can not do with the current name of the "tr" receiver variable, as it masks the "tr" package name within the method's scope. Therefore we now rename all our "tr" receiver variables to "st" in the ssh/connection.go source file, so as to avoid any namespace conflicts with our "tr" package.
Since PR git-lfs#4446 we have output several trace log messages when attempting to instantiate a new SSHTransfer structure in the SSHTransfer() method of the Client structure in our "lfsapi" package. This method takes "operation" and "remote" string arguments, and the returned SSHTransfer structure is specific to those values. As we may potentially call this method several times in a single process, with different arguments, we add the "operation" and "remote" variables to our trace log messages, which will help us when we analyze logs and want to distinguish the different contexts in which the SSHTransfer() method was called.
In commit 691de51 of PR git-lfs#4446 we added the "batch transfers with ssh endpoint (git-lfs-transfer)" test to our t/t-batch-transfer.sh test script in order to validate that the new SSH object transfer protocol for Git LFS was operating as expected. This test only creates a single test object, and so does not cause the Git LFS client to establish multiple SSH connections during the batch object transfer phase. For this reason, no SSH connections are started with the ControlMaster option set to "no". As described in a prior commit in this PR, our lfs-ssh-echo test utility program had a bug which prevented it from accepting SSH connections with the ControlMaster option set to "no", so the "batch transfers with ssh endpoint (git-lfs-transfer)" test would have failed if it tried to push more than a single object. Such a test would also fail if it tried to fetch more than a single object, due to the bug described in git-lfs#5880. Specifically, we returned a nil from the Connection() method of the SSHTransfer structure in our "ssh" package when we had closed all the SSH connections, but then sometimes tried to reference that nil pointer in the batchInternal() method of the SSHBatchClient structure in the "tq" package, causing a Go panic condition. In prior commits in this PR we resolved these problems, both the bug in the lfs-ssh-echo test utility and the bad return value from the Connection() method of the SSHTransfer structure in the Git LFS client code. In the case of the latter issue, we adjusted the Connection() method to return a non-nil error when the requested connection number exceeds the maximum number of permitted SSH connections, which also covers the case where all the connections have already been closed. Therefore we can now add additional tests of the SSH object transfer protocol which push multiple objects and then fetch them. In these tests we use the new assert_remote_object() assertion function that we added in a prior commit in this PR to confirm that the objects we push are all written to the "remote" repository. As well, we check that the expected number of SSH connections are reported in the trace logs from both the push and clone operations, and that some connections are made with the ControlMaster option set to "no" as well as "yes". The first of our new tests leaves the maximum number of concurrent object transfers set to the default value of 8, so that all three of the objects we create in the test are pushed or fetched in the same batch. This requires that the Git LFS client establish three separate SSH connections, two of which should have the ControlMaster option set to "no", and the first of which should have the option set to "yes". The second of our new tests sets the maximum number of concurrent transfers to two, so we expect to see only two SSH connections established, the first of which should have the ControlMaster option set to "yes". Since there are only two concurrent connections, we can assume the third object was transferred in a second batch. (Note that the Git LFS client, at present, always creates an extra SSH connection with the ControlMaster option set to "yes" when pushing objects over the SSH object transfer protocol. It uses this extra connection exclusively to perform locking commands. Our tests therefore expect this extra connection to be reported in the push trace logs, and we include comments to explain why the expected connection totals are one higher than might otherwise be the case.) Next, we expand the existing "batch transfers with ssh endpoint (git-lfs-transfer)" test to perform the same set of checks as our new tests, bringing the tests into close alignment with each other. We again use the assert_remote_object() function to confirm that the single object pushed in this test is written to the "remote" repository. We also check that only a single SSH connection is reported in the trace logs (plus the extra one used for locking commands during the push operation). Finally, we make a few small adjustments to other tests in the same test script, removing some unnecessary redirections of stderr, and adding one extra check to our "batch transfers with ssh endpoint (git-lfs-authenticate)" test which just confirms that an object was successfully pushed over HTTP to our lfstest-gitserver test helper program, following an initial authorization handshake over SSH.
Since PR git-lfs#4446 we have output several trace log messages when attempting to instantiate a new SSHTransfer structure in the SSHTransfer() method of the Client structure in our "lfsapi" package. This method takes "operation" and "remote" string arguments, and the returned SSHTransfer structure is specific to those values. As we may potentially call this method several times in a single process, with different arguments, we add the "operation" and "remote" variables to our trace log messages, which will help us when we analyze logs and want to distinguish the different contexts in which the SSHTransfer() method was called.
In commit 691de51 of PR git-lfs#4446 we added the "batch transfers with ssh endpoint (git-lfs-transfer)" test to our t/t-batch-transfer.sh test script in order to validate that the new SSH object transfer protocol for Git LFS was operating as expected. This test only creates a single test object, and so does not cause the Git LFS client to establish multiple SSH connections during the batch object transfer phase. For this reason, no SSH connections are started with the ControlMaster option set to "no". As described in a prior commit in this PR, our lfs-ssh-echo test utility program had a bug which prevented it from accepting SSH connections with the ControlMaster option set to "no", so the "batch transfers with ssh endpoint (git-lfs-transfer)" test would have failed if it tried to push more than a single object. Such a test would also fail if it tried to fetch more than a single object, due to the bug described in git-lfs#5880. Specifically, we returned a nil from the Connection() method of the SSHTransfer structure in our "ssh" package when we had closed all the SSH connections, but then sometimes tried to reference that nil pointer in the batchInternal() method of the SSHBatchClient structure in the "tq" package, causing a Go panic condition. In prior commits in this PR we resolved these problems, both the bug in the lfs-ssh-echo test utility and the bad return value from the Connection() method of the SSHTransfer structure in the Git LFS client code. In the case of the latter issue, we adjusted the Connection() method to return a non-nil error when the requested connection number exceeds the maximum number of permitted SSH connections, which also covers the case where all the connections have already been closed. Therefore we can now add additional tests of the SSH object transfer protocol which push multiple objects and then fetch them. In these tests we use the new assert_remote_object() assertion function that we added in a prior commit in this PR to confirm that the objects we push are all written to the "remote" repository. As well, we check that the expected number of SSH connections are reported in the trace logs from both the push and clone operations, and that some connections are made with the ControlMaster option set to "no" as well as "yes". The first of our new tests leaves the maximum number of concurrent object transfers set to the default value of 8, so that all three of the objects we create in the test are pushed or fetched in the same batch. This requires that the Git LFS client establish three separate SSH connections, two of which should have the ControlMaster option set to "no", and the first of which should have the option set to "yes". The second of our new tests sets the maximum number of concurrent transfers to two, so we expect to see only two SSH connections established, the first of which should have the ControlMaster option set to "yes". Since there are only two concurrent connections, we can assume the third object was transferred in a second batch. (Note that the Git LFS client, at present, always creates an extra SSH connection with the ControlMaster option set to "yes" when pushing objects over the SSH object transfer protocol. It uses this extra connection exclusively to perform locking commands. Our tests therefore expect this extra connection to be reported in the push trace logs, and we include comments to explain why the expected connection totals are one higher than might otherwise be the case.) Next, we expand the existing "batch transfers with ssh endpoint (git-lfs-transfer)" test to perform the same set of checks as our new tests, bringing the tests into close alignment with each other. We again use the assert_remote_object() function to confirm that the single object pushed in this test is written to the "remote" repository. We also check that only a single SSH connection is reported in the trace logs (plus the extra one used for locking commands during the push operation). Finally, we make a few small adjustments to other tests in the same test script, removing some unnecessary redirections of stderr, and adding one extra check to our "batch transfers with ssh endpoint (git-lfs-authenticate)" test which just confirms that an object was successfully pushed over HTTP to our lfstest-gitserver test helper program, following an initial authorization handshake over SSH.
Since PR git-lfs#4446 we have output several trace log messages when attempting to instantiate a new SSHTransfer structure in the SSHTransfer() method of the Client structure in our "lfsapi" package. This method takes "operation" and "remote" string arguments, and the returned SSHTransfer structure is specific to those values. As we may potentially call this method several times in a single process, with different arguments, we add the "operation" and "remote" variables to our trace log messages, which will help us when we analyze logs and want to distinguish the different contexts in which the SSHTransfer() method was called.
With PR git-lfs#4446 we defined a new SSHTransfer structure and set of methods on that structure in our ssh/connection.go source file, and used the name "tr" for the receiver variables of those methods. Later, in commit 5dbbf13 of PR git-lfs#5674, we imported our "tr" message translation package into the same source file, in order to format and output an error message from the startConnection() function. As it happens this function is not one of the methods of the SSHTransfer structure, and so there was no conflict with the methods' "tr" receiver variables. However, in subsequent commits in this PR we expect to revise one of the SSHTransfer structure's methods to output an error message, and will want to use the Get() method of the "tr" package's global Tr variable, which we can not do with the current name of the "tr" receiver variable, as it masks the "tr" package name within the method's scope. Therefore we now rename all our "tr" receiver variables to "st" in the ssh/connection.go source file, so as to avoid any namespace conflicts with our "tr" package.
In commit 691de51 of PR git-lfs#4446 we added the "batch transfers with ssh endpoint (git-lfs-transfer)" test to our t/t-batch-transfer.sh test script in order to validate that the new SSH object transfer protocol for Git LFS was operating as expected. This test only creates a single test object, and so does not cause the Git LFS client to establish multiple SSH connections during the batch object transfer phase. For this reason, no SSH connections are started with the ControlMaster option set to "no". As described in a prior commit in this PR, our lfs-ssh-echo test utility program had a bug which prevented it from accepting SSH connections with the ControlMaster option set to "no", so the "batch transfers with ssh endpoint (git-lfs-transfer)" test would have failed if it tried to push more than a single object. Such a test would also fail if it tried to fetch more than a single object, due to the bug described in git-lfs#5880. Specifically, we returned a nil from the Connection() method of the SSHTransfer structure in our "ssh" package when we had closed all the SSH connections, but then sometimes tried to reference that nil pointer in the batchInternal() method of the SSHBatchClient structure in the "tq" package, causing a Go panic condition. In prior commits in this PR we resolved these problems, both the bug in the lfs-ssh-echo test utility and the bad return value from the Connection() method of the SSHTransfer structure in the Git LFS client code. In the case of the latter issue, we adjusted the Connection() method to return a non-nil error when the requested connection number exceeds the maximum number of permitted SSH connections, which also covers the case where all the connections have already been closed. Therefore we can now add additional tests of the SSH object transfer protocol which push multiple objects and then fetch them. In these tests we use the new assert_remote_object() assertion function that we added in a prior commit in this PR to confirm that the objects we push are all written to the "remote" repository. As well, we check that the expected number of SSH connections are reported in the trace logs from both the push and clone operations, and that some connections are made with the ControlMaster option set to "no" as well as "yes". The first of our new tests leaves the maximum number of concurrent object transfers set to the default value of 8, so that all three of the objects we create in the test are pushed or fetched in the same batch. This requires that the Git LFS client establish three separate SSH connections, two of which should have the ControlMaster option set to "no", and the first of which should have the option set to "yes". The second of our new tests sets the maximum number of concurrent transfers to two, so we expect to see only two SSH connections established, the first of which should have the ControlMaster option set to "yes". Since there are only two concurrent connections, we can assume the third object was transferred in a second batch. (Note that the Git LFS client, at present, always creates an extra SSH connection with the ControlMaster option set to "yes" when pushing objects over the SSH object transfer protocol. It uses this extra connection exclusively to perform locking commands. Our tests therefore expect this extra connection to be reported in the push trace logs, and we include comments to explain why the expected connection totals are one higher than might otherwise be the case.) Next, we expand the existing "batch transfers with ssh endpoint (git-lfs-transfer)" test to perform the same set of checks as our new tests, bringing the tests into close alignment with each other. We again use the assert_remote_object() function to confirm that the single object pushed in this test is written to the "remote" repository. We also check that only a single SSH connection is reported in the trace logs (plus the extra one used for locking commands during the push operation). Finally, we make a few small adjustments to other tests in the same test script, removing some unnecessary redirections of stderr, and adding one extra check to our "batch transfers with ssh endpoint (git-lfs-authenticate)" test which just confirms that an object was successfully pushed over HTTP to our lfstest-gitserver test helper program, following an initial authorization handshake over SSH.
In commit 691de51 of PR git-lfs#4446 we added the "batch transfers with ssh endpoint (git-lfs-transfer)" test to our t/t-batch-transfer.sh test script in order to validate that the new SSH object transfer protocol for Git LFS was operating as expected. This test only creates a single test object, and so does not cause the Git LFS client to establish multiple SSH connections during the batch object transfer phase. For this reason, no SSH connections are started with the ControlMaster option set to "no". As described in a prior commit in this PR, our lfs-ssh-echo test utility program had a bug which prevented it from accepting SSH connections with the ControlMaster option set to "no", so the "batch transfers with ssh endpoint (git-lfs-transfer)" test would have failed if it tried to push more than a single object. Such a test would also fail if it tried to fetch more than a single object, due to the bug described in git-lfs#5880. Specifically, we returned a nil from the Connection() method of the SSHTransfer structure in our "ssh" package when we had closed all the SSH connections, but then sometimes tried to reference that nil pointer in the batchInternal() method of the SSHBatchClient structure in the "tq" package, causing a Go panic condition. In prior commits in this PR we resolved these problems, both the bug in the lfs-ssh-echo test utility and the bad return value from the Connection() method of the SSHTransfer structure in the Git LFS client code. In the case of the latter issue, we adjusted the Connection() method to return a non-nil error when the requested connection number exceeds the maximum number of permitted SSH connections, which also covers the case where all the connections have already been closed. Therefore we can now add additional tests of the SSH object transfer protocol which push multiple objects and then fetch them. In these tests we use the new assert_remote_object() assertion function that we added in a prior commit in this PR to confirm that the objects we push are all written to the "remote" repository. As well, we check that the expected number of SSH connections are reported in the trace logs from both the push and clone operations, and that some connections are made with the ControlMaster option set to "no" as well as "yes". To perform these SSH connection count checks we add two dedicated assertion functions, assert_ssh_transfer_connections() and assert_ssh_transfer_connection_counts(), with the former calling the latter several times. These functions are unlikely to be useful outside of the context of our tests of the SSH object transfer protocol, so we define them in the t/t-batch-transfer.sh test script rather than in our generic t/testhelpers.sh library. The first of our new tests leaves the maximum number of concurrent object transfers set to the default value of 8, so that all three of the objects we create in the test are pushed or fetched in the same batch. This requires that the Git LFS client establish as many as three separate SSH connections, of which only the first should create a control socket. The second of our new tests sets the maximum number of concurrent transfers to two, so we expect to see a maximum of only two SSH connections established, of which only the first should create the control socket. We have to adjust these basic expectations to account for two different issues, however. One issue pertains to push operations, and the other to fetches. When pushing using the SSH object transfer protocol, the Git LFS client currently always creates an extra socket control SSH connection, which it uses exclusively to perform locking commands, and does not terminate cleanly. And as a separate issue, if Git is older than version 2.11.0, it does not support the "process" filter and so Git LFS is instead invoked via the "smudge" filter once for each object. This means a unique socket control SSH connection will be created and used to fetch each object. Our new assert_ssh_transfer_connections() function adjusts its expected connection counts as necessary based on whether either of these two conditions applies. In addition to adding our two tests, we also expand our existing "batch transfers with ssh endpoint (git-lfs-transfer)" test to perform the same set of checks as the new tests, which brings all the SSH object transfer tests into alignment with each other. We again use the our assert_remote_object() function to confirm that the single object pushed in the existing test is written to the "remote" repository. We also use our assert_ssh_transfer_connections() function to check that only a single SSH connection is reported in the trace logs (plus the extra one used for locking commands during the push operation). Finally, we make a few small adjustments to other tests in the same test script, removing some unnecessary redirections of stderr, and adding two extra checks to our "batch transfers with ssh endpoint (git-lfs-authenticate)" test. We confirm that an authorization handshake was performed over SSH, and validate that an object was successfully pushed over HTTP to our lfstest-gitserver test helper program following the handshake.
In subsequent commits in this PR we expect to resolve a pair of issues which prevent us from testing the SSH object transfer protocol with more than a single object. This transfer protocol was introduced in PR git-lfs#4446, and in commit 691de51 of the PR, the "batch transfers with ssh endpoint (git-lfs-transfer)" test in our t/t-batch-transfer.sh test script was added to validate that the new protocol worked as expected. This test pushes and then fetches a single object, so the issues which arise when handling multiple objects in a batch do not cause the test to fail. Nevertheless, before addressing those issues, we first expand the existing test so it checks that the git-lfs-transfer command is seen in the trace log messages during a push operation. The test already checks that this command is used during a clone operation. As well, we enhance the "batch transfers with ssh endpoint (git-lfs-authenticate)" test, which checks that when the SSH transfer protocol is not available but Git is configured to use SSH, the Git LFS client performs an authorization handshake over SSH prior to using HTTP for its object transfers. We now confirm that the git-lfs-authenticate command is seen in the trace log messages generated during a push operation, and that the object was successfully pushed and has been stored by the remote server. We also make a small revision to the "batch transfers succeed with an empty hash algorithm" test to remove an unnecessary file redirection.
In subsequent commits in this PR we expect to resolve a pair of issues which prevent us from testing the SSH object transfer protocol with more than a single object. This transfer protocol was introduced in PR git-lfs#4446, and in commit 691de51 of the PR, the "batch transfers with ssh endpoint (git-lfs-transfer)" test in our t/t-batch-transfer.sh test script was added to validate that the new protocol worked as expected. This test pushes and then fetches a single object, so the issues which arise when handling multiple objects in a batch do not cause the test to fail. Hence we intend to add two other tests to accompany the existing one so as to validate the SSH transfer protocol with multiple objects. As these tests will all perform object transfers only over SSH, our HTTP-based lfstest-gitserver test helper program will not be used in any of them. That program retains a copy of each object it receives in memory to simulate a remote storage server. As a consequence, many of our tests use the assert_server_object() function defined in our t/testhelpers.sh library to confirm that an object has been received by the remote server; this function makes an HTTP batch request to the lfstest-gitserver program and checks the JSON response. In our tests of the SSH transfer protocol, by contrast, object data will be proxied by the lfs-ssh-echo helper program to the git-lfs-transfer command, which will write the data into a separate bare Git repository in the location provided as command argument. In fact this is how the existing test already operates; it uses the ssh_remote() function from our t/testhelpers.sh library to establish the SSH URL for its remote repository. The URLs returned by this function always include the directory path from our REMOTEDIR variable. We expect to also use the ssh_remote() function in our new tests to establish their remote repositories. In both our existing test and the ones we will add in a subsequent commit in this PR, we would like to confirm that the Git LFS objects we push have been successfully written into the "lfs/objects" cache in the remote repositories. To simplify such checks, we define a new assert_remote_object() assertion function in our shell test library. Unlike the assert_server_object() function, which makes an HTTP request, our new assertion acts in a similar manner to the existing assert_local_object() function by simply validating the size and existence of an object file in the appropriate subdirectory of the "lfs/objects" cache hierarchy of a repository. However, our new assert_remote_object() constructs the path to the repository using the REMOTEDIR variable, rather than checking the object's presence in the current local repository as the assert_local_object() function does. With the assert_remote_object() function defined, we then update our existing "batch transfers with ssh endpoint (git-lfs-transfer)" test to make use of it after pushing an object over the SSH transfer protocol, and we will use it for the same purpose in the additional tests we will introduce in a later commit in this PR.
This is an implementation of a pure SSH-based protocol. It uses the proposal outlined in the
docs/proposalsdirectory, with a few changes, which are added to the documentation.The implementation here is complete. It should be fully functional.
This implementation is being tested against Scutiger's
git-lfs-transferbinary, which is our reference implementation. That binary will be used as part of our CI jobs in the future to validate that we continue to work correctly with it. It is built from the latest crate version now that it is published on crates.io.The goal is to have the code fall back to
git-lfs-authenticateand use the HTTP-based protocol with SSH authentication if thegit-lfs-transferbinary fails for whatever reason.Fixes #1044