transform DoWithAuth from recurrent to loop approach to prevent endless authentication attempts#6018
Conversation
|
Hey, thanks very much for this bug report and PR! This looks like a plausible fix for the problem, but I've only taken a brief look so far. Unfortunately we have a small queue of PRs awaiting review, so it might take me a while before I can give this one its proper due. In the meantime I've started our CI suite, and it looks like there are some Go tests that are failing and so those will need to be reconsidered, at a minimum. (The Go 1.22 job is also failing right now for an unrelated reason, so don't worry about that one.) In addition to the Go tests, it would be great if we can add some shell tests to whichever of our As a start in that direction, would you be able to provide a reproduction test case here in a comment, at least? That would really help us understand how we can reproduce the problem in our test suite, so that we can in turn demonstrate that this PR's changes have the intended effect of resolving the issue. Thank you again for taking the time to contribute to our project and helping to improve Git LFS for all our users! |
|
Hello. Thank you for your attention. After that, I switched the lfs executable to having my change, removed the cloned repository, and tried again. We can see that the command was terminated after 3 failed attempts to fetch each file. |
|
Thank you very much for the additional trace logs and the reproduction case and script; those will help a lot! We've got a few PRs to review at the moment, each of which happens to involve somewhat non-trivial changes (including this PR), so I apologize again that it may take us a little while longer to give this PR a careful look. Thank you again for investigating this problem and for writing up this PR, and for your continued patience! |
This comment was marked as spam.
This comment was marked as spam.
|
Thanks again for reporting this bug and providing the logs and a PR! I spent some time looking at the issues here and we appear to have a number of similar kinds of potential problems, and a lack of tests that would expose them. The first thing I notice is that your Nexus server is presumably replying with a 401 status code when it receives invalid credentials, rather than a 403, and that's not a behaviour we handle well because the Git LFS client just keeps sending the same credentials again. Arguably that's an issue with the Nexus server, but for our purposes, it reveals a bug on our side that we need to fix. Looking back over the history of this code, it seems possible to me that we've never had adequate protections against a server which continued to return 401 status codes. Replication in TestsWe can simulate the problem you've encountered in a various ways in our test suite. In the Or if we change our A more subtle kind of reproduction case occurs in our @@ -328,7 +328,7 @@ begin_test "credentials can authenticate with multistage auth"
reponame="auth-multistage-token"
setup_remote_repo "$reponame"
- printf 'Multistage::cred2:state1:state2:\nMultistage::cred1::state1:true' > "$CREDSDIR/127.0.0.1--$reponame"
+ printf 'Multistage::cred1:state1:state1:\nMultistage::cred1::state1:true' > "$CREDSDIR/127.0.0.1--$reponame"
clone_repo "$reponame" "$reponame"
git checkout -b new-branchYet another, different reproduction case can be triggered by making the following change to our @@ -8,10 +8,12 @@ ensure_git_version_isnt $VERSION_LOWER "2.3.0"
begin_test "download authenticated object"
(
set -e
- reponame="$(basename "$0" ".sh")"
+ reponame="requirecreds-$(basename "$0" ".sh")"
setup_remote_repo "$reponame"
clone_repo "$reponame" without-creds
+ printf ":requirecreds:pass" > "$CREDSDIR/127.0.0.1--$reponame"
+
git lfs track "*.dat"
printf "object-authenticated" > hi.dat
git add hi.datRecursion in HTTP HandlingThe penultimate test mentioned above is interesting because it dates from PR #5803, and in commit 6783078 of that PR we added the (UPDATE: The description above is largely correct, but omits the detail that the On this point, we have two levels of potential recursion: There's also a very similar recursion in the What's interesting about both of these implementations which handle 3xx redirections through recursion is that they suffer from effectively the same problem you reported, where the Git LFS client doesn't give up after a certain number of HTTP requests. If a server or servers send a chain of 3xx responses which result in an endless loop of redirections, the client will follow them without ceasing, and authentication doesn't even need to be part of the picture in this case. As for the final test mentioned above, that exercises yet a fourth recursive method, the Recursion LimitsIntriguingly, we appear to have once have had functioning protection against following an endless loop of HTTP redirection, with some logic originally added in PR #1791 and then simplified in PR #1839. The In PR #3028 this logic was updated so as to distinguish requests where the response was a 401 status code and not include those in count of redirections, and also avoid sending the In the latter situation the However, when I believe the changes in PR #3244, when the Unfortunately, in both cases, although the Hence the intended limit of at most three HTTP redirections is not respected. Meanwhile, as noted above, a different attempt at limiting the number of multi-stage authentication requests was added in PR #5803, but that doesn't seem to be as effective as we might like. It's definitely possible to set up multi-stage authorization which exceeds the limit of three authorization attempts. That might be acceptable, of course, but presumably we do want to limit the number of times we send identical Comprehensive MitigationOne concern which all of foregoing raises, I think, is how we should handle these various different situations, as well as a few others which occurred to me. For one thing, the code comment in the We've discussed a few potential infinite loop conditions above, including:
Another, more challenging condition to consider might be if a series of servers send a mixed series of 3xx and 401 responses which ultimately cause a loop. For instance, server A first replies with a 401, causing the client to look up credentials and send them, at which point the server sends with a 307 sending the client to server B. Server B then requests credentials with a 401, after which it replies with a 307 sending the client back to server A. Ideally, the client should cache the credentials it retrieves for both servers and send them without being prompted by a 401, at which point the servers' responses should devolve into a chain of redirections, which is just the first condition in the list above. I think we'll have to address these problems in a series of PRs, rather than just one or even two. As for this PR, tackling the problem of repeated 401 responses and maybe also making the multi-stage authentication limits function as they were intended to is lots for one PR! We do, however, need to add some fairly extensive tests in both our Go and shell test suites to prove the fixes are effective. As well, since we may not want to limit valid multi-stage authentication to three stages, we probably either need to make the limit configurable, or find a way to detect when we're resending the same Finally, thank you again very much for the bug report and proposed code changes! Analyzing this problem has certainly produced some interesting findings. |
|
As an addendum to my comments above regarding the code from PR #5803, I should note that it is possible for the Moreover, because the increment occurs after I believe that check was intended to guard against a different form of recursive call, where To exercise the check as it stands now, we can simulate the following sequence of requests and responses. First, a server replies to an unauthenticated request with a 401, which causes a single recursion through the This causes the client to make a recursive call back to The client therefore makes another recursive call to Eventually, once the server returns a non-redirection status code, the final While interesting, I don't think that behaviour is what was intended when the multi-stage authentication support was added, so we'll have to consider what the behaviour should be instead as part of this PR, and write some tests to verify that whatever revisions we make perform as we expect. For my own future reference, in our shell test suite we can establish a sufficiently long multi-stage authentication sequence on the client side using test credential record files with content like the following: We then need to create appropriate copies for each of the hostnames involved, so that our |
There was a problem hiding this comment.
Thanks very much again for identifying this problem and tackling a potential solution!
While doing my own investigation and code analysis I've turned up a few related issues, and used this PR to make a bunch of notes for the future—my apologies for that.
I've written up a first set of review comments and suggestions, which I think should further simplify our code and also resolve the immediate Go test failures seen in the CI jobs for this PR at the moment. For clarity, I'll include my final suggested versions of the three key methods below.
We will, though, still need to work on adding some Go and shell tests which exercise the authorization request limit, as we don't have anything right now. If we did, we'd have noticed the potential for endless recursion! So we'll have to collaborate on a set of tests to demonstrate the effectiveness of the new code.
(And as a note to myself, we should also address the basicUploadAdapter structure's recursive makeRequest() method in the tq package, since it suffers from the same problem as DoWithAuth(). It's a bit harder to trigger, though, as it requires a server to send authenticated: true for objects, then demand authorization using local credentials, and then reject those and continue to respond with 401s.)
Again, thank you for reporting this issue and finding a solution!
Here is my suggested version of the DoWithAuth() method:
func (c *Client) DoWithAuth(remote string, access creds.Access, req *http.Request) (*http.Response, error) {
for range defaultMaxAuthAttempts {
res, err := c.doWithAuth(remote, access, req, nil)
if err == nil || !errors.IsAuthError(err) {
return res, err
}
// We expect this condition should never occur because
// doWithAuth() will delete any Authorization header
// after a 401 status code is received.
if len(req.Header.Get("Authorization")) != 0 {
return res, err
}
// This case represents a rejected request that
// should have been authenticated but wasn't.
access = c.Endpoints.AccessFor(access.URL())
tracerx.Printf("api: http response indicates %q authentication. Resubmitting...", access.Mode())
}
return nil, fmt.Errorf("too many authentication attempts")
}Next, we can remove count from the call to doWithAuth() in the DoWithAuthNoRetry() method:
func (c *Client) DoWithAuthNoRetry(remote string, access creds.Access, req *http.Request) (*http.Response, error) {
return c.doWithAuth(remote, access, req, nil)
}We can remove the count variable from the doWithAuth() method, and also eliminate one level of indentation by combining two conditions that go well together:
unc (c *Client) doWithAuth(remote string, access creds.Access, req *http.Request, via []*http.Request) (*http.Response, error) {
req.Header = c.client.ExtraHeadersFor(req)
credWrapper, err := c.getCreds(remote, access, req)
if err != nil {
return nil, err
}
c.credContext.SetStateFields(credWrapper.Creds["state[]"])
res, err := c.doWithCreds(req, credWrapper, access, via)
if err != nil && errors.IsAuthError(err) {
multistage := credWrapper.Creds.IsMultistage()
newMode, newModes, headers := getAuthAccess(res, access.Mode(), c.access, multistage)
newAccess := access.Upgrade(newMode)
if newAccess.Mode() != access.Mode() {
c.Endpoints.SetAccess(newAccess)
c.access = newModes
}
if credWrapper.Creds != nil {
req.Header.Del("Authorization")
if !multistage {
credWrapper.CredentialHelper.Reject(credWrapper.Creds)
}
}
c.credContext.SetWWWAuthHeaders(headers)
}
if res != nil && res.StatusCode < 300 && res.StatusCode > 199 {
credWrapper.CredentialHelper.Approve(credWrapper.Creds)
}
return res, err
}Lastly, the count variable can also be dropped from the doWithCreds() method:
func (c *Client) doWithCreds(req *http.Request, credWrapper creds.CredentialHelperWrapper, access creds.Access, via []*http.Request) (*http.Response, error) {
if access.Mode() == creds.NegotiateAccess {
return c.doWithNegotiate(req, credWrapper)
}
req.Header.Set("User-Agent", lfshttp.UserAgent)
client, err := c.client.HttpClient(req.URL, access.Mode())
if err != nil {
return nil, err
}
redirectedReq, res, err := c.client.DoWithRedirect(client, req, "", via)
if err != nil || res != nil {
return res, err
}
if redirectedReq == nil {
return res, errors.New(tr.Tr.Get("failed to redirect request"))
}
return c.doWithAuth("", access, redirectedReq, via)
}|
Hi @chrisd8088, thank you for your help. Do you have any updates related to this issue? |
|
Thank you again for the PR and for identifying this issue!
The main first steps, I think, are the ones I identified in my code review. Would you have time to look at those and adjust the PR accordingly? I put the revised versions of the key methods into this comment above, if that helps. Once we've made those changes to this PR, we'll also need to expand the test suite, as I noted in that comment. But as a first step, adopting the revisions from my PR review should at least allow the CI suite to pass. If you don't have time to look at my suggestions, that's fine; I expect to circle back to this PR and several others in the not-too-distant future, and can draft my own version of this PR which would then replace yours. If you're able to help in the meantime, though, that would of course be appreciated. Either way, though, thank you again for pinpointing the problem and drafting a first potential solution! |
|
Hi @chrisd8088!
Of course, we will take care of adjusting the PR. Also, thank you so much for the detailed comments and explanations above, they are extremely helpful and very much appreciated! |
|
Thank you very much @tamaracvjetkovic! Just let me know if anything in my review comments and suggested revisions is unclear! |
97c7058 to
e87ac8f
Compare
… 'defaultMaxAuthAttempts' repeated 401 responses
…tistage helper in auth_test.go
|
Hi @chrisd8088! In the previous commits, I included the changes you suggested, thank you very much for the helpful comments! I tried running the GitHub Actions workflow in a fork of git-lfs, and two jobs failed on both the PR branch and the latest main, so I don't believe the failures are related to these changes. I also added some Go tests that should target the 401 retry loop you mentioned. |
|
Also, regarding this reproduction case - it makes sense that the issue would happen if the client keeps sending the same credentials for the second stage of the multi-stage auth process. However, when I ran the test with the changes you mentioned (on the latest main branch), I didn't see the behavior you described, and the test passed? I also tried to reproduce the same behavior by modifying the shell test on my own, but I couldn't get the expected failure. Related to that, I wrote a Go test that should simulate a similar scenario by using a custom helper which always returns the same credentials, causing the "stuck" effect. When I run this test in the main branch, it does get stuck, on the PR branch, it passes, so I believe that the test is configured as described and exercises the correct behavior. Please let me know if I'm misunderstanding anything. |
|
One more thing, while writing the tests and thinking about the 401 loop scenario, I was wondering what is the expected behavior if the server returns 401 without any authentication challenge headers (no Lfs-Authenticate and/or WWW-Authenticate)? In that case, is the behavior to still retry |
|
Thank you very much for all the work on this PR, @tamaracvjetkovic! I will take a close look as soon as I can. As a first step, I've kicked off the CI jobs as you requested, so we can see if those pass. Thank you again so much for picking up this PR and working on the complicated set of issues it exposed! |
|
@tamaracvjetkovic — My sincere apologies for my delayed response to your work on this PR, and your questions. (Getting the v3.7.1 release written and tested consumed more or less all of my available time for the last few months.) A few quick responses to your questions:
Ah, that's because I made a slight typo, which I've now corrected. The patch required to make the --- a/t/t-credentials.sh
+++ b/t/t-credentials.sh
@@ -328,7 +328,7 @@ begin_test "credentials can authenticate with multistage auth"
reponame="auth-multistage-token"
setup_remote_repo "$reponame"
- printf 'Multistage::cred2:state1:state2:\nMultistage::cred1::state1:true' > "$CREDSDIR/127.0.0.1--$reponame"
+ printf 'Multistage::cred1:state1:state1:\nMultistage::cred1::state1:true' > "$CREDSDIR/127.0.0.1--$reponame"
clone_repo "$reponame" "$reponame"
git checkout -b new-branch
That's great! I don't see anything missing there at all.
RFC 7235 states: "The server generating a 401 response MUST send a So I think it would be entirely appropriate if we simply stopped retrying after a 401 response without a However, since our Either way, it would be ideal if we have a test to check our handling of that case too ... thank you for the thoughtful question! |
chrisd8088
left a comment
There was a problem hiding this comment.
Thank you so much for these revisions!
I'm going to approve this PR for now, even though we may want to expand it a bit further with some additional tests, and the one code comment I suggested regarding the new TestDoWithAuthNoRetryOn401WhenAuthHeaderPresent test.
I'm also going to merge in the latest changes from main and let the CI jobs run again.
|
@tamaracvjetkovic — Thanks again for all your work on this PR! I'd love to merge these changes, as I've left this PR to linger for quite some time (for which I sincerely apologize!) If by chance you're not free right now to work on it further, I fully understand. If you're willing to invite me to collaborate on your branch, I can add a few shell tests and the one code comment I suggested in my review. Otherwise, I'll just leave the PR as it stands for now and circle back to it again once I've reviewed some of the other PRs we have pending. Thank you again very much for helping to improve Git LFS! |
|
Hello @chrisd8088, I am sorry for the late response. I have added you to the list of collaborators. Thank you! |
Thank you very much, @manturovDan! And I'm the one who should apologize for a late response, not you! With the v3.7.1 security patch release complete, I'm glad to have some time to circle back to important pending PRs like this one. Thanks to both you and @tamaracvjetkovic for all the help with this PR! |
The DoWithAuth() method of the Client structure in our "lfsapi" package calls the doWithAuth() method in a loop, and each time it passes its "req" parameter, which is a Request structure from the "http" package of the Go standard library. The doWithAuth() method then passes this Request structure to other methods, which may modify it to add or remove an Authorization header. In a prior commit in this PR we added a comment to the DoWithAuth() method, following a suggestion from the initial review of this PR: git-lfs#6018 (review) The comment states that the doWithAuth() method will always remove the Authorization header from the Request structure if a 401 Unauthorized status code was returned in response to the HTTP request, and so the DoWithAuth() method should never find a non-empty Authorization header after the doWithAuth() method has returned. This is incorrect, however. Further, it contradicts a second comment that we proposed during another review of this PR. In that review we suggested this other comment should be included in the TestDoWithAuthNoRetryOn401WhenAuthHeaderPresent() test function, since we introduced that function in a previous commit in this PR: git-lfs#6018 (comment) To resolve these problems we now add a lightly revised version of the second comment to the TestDoWithAuthNoRetryOn401WhenAuthHeaderPresent() test function, and correct the comment in the DoWithAuth() method so that comment better reflects the actual expected behaviour of the doWithAuth() method, which we describe in more detail below. The doWithAuth() method invokes the getCreds() method, which uses the requestHasAuth() function to check whether the Request structure already has a defined Authorization method, and if so, returns a CredentialHelperWrapper structure from our "creds" package with a nil value in its Creds element. The doWithAuth() method then invokes the doWithCreds() method to actually perform the HTTP request. If that method returns an error, the doWithAuth() method checks whether it was due to a 401 status code in the response by calling the IsAuthError() function from our "errors" package. Should that evaluate to "true", the doWithAuth() method then checks whether the Creds element of the CredentialHelperWrapper structure returned by the getCreds() method is nil, and only removes the Authorization header from the Request structure if the Creds element is nil. Therefore if the DoWithAuth() method is passed a Request structure with a defined Authorization header, that header will not be removed by the doWithAuth() method even if a 401 status code is received, and so the DoWithAuth() method may find a non-empty Authorization header following its call to the doWithAuth() method.
In prior commits in this PR we added several new test functions to our "lfsapi" package to verify the behaviour of the DoWithAuth() method of that package's Client structure. These new functions include the TestDoWithAuthRetryLimitExceeded() function and the TestDoWithAuthMultistageRetryLimitExceeded() function, both of which check that the DoWithAuth() method now limits the number of HTTP requests it will make if each request receives a 401 Unauthorized status code in its response. Both test functions follow a similar design as the existing TestDoWithAuthApprove() function. That older test function ends with a check of the access mode which has been cached for the endpoint URL associated with the HTTP requests made by the DoWithAuth() method, as we expect that after the first 401 status code response is received, this mode will be set to the BasicAccess mode from our "creds" package to reflect that HTTP Basic Authentication should be used with all subsequent requests. Although our new TestDoWithAuthRetryLimitExceeded() function also ends with a check of the cached access mode, this is not yet the case for the TestDoWithAuthMultistageRetryLimitExceeded() function, so we add that check now. We also take the opportunity to simplify all of the access mode assertion checks we make, which at present explicitly pass structure pointers as the receiver parameters of the Mode() method of the Access structure from our "creds" package. While this method is defined with a pointer receiver, it is not necessary to explicitly generate these pointers because Go automatically does so as a convenience feature of the language: https://go.dev/tour/methods/6 The use of explicit pointer receivers for the Mode() method calls in our test functions dates from commit 6d29072 in PR git-lfs#3941, when the Access structure was first moved into our "creds" package.
In subsequent commits in this PR we intend to add new tests to validate the changes we made to the Git LFS client in prior commits of this PR. We plan to add these tests to both the "t/t-askpass.sh" and the "t/t-credentials.sh" scripts in our shell test suite. Before we add our new tests, we first update the existing "credentials with useHttpPath, with wrong password" test in the "t/t-credentials.sh" script, and add a similar test to the "t/t-askpass.sh" script, which at present does not contain a test with an invalid set of credentials. We will then be able to use these tests as templates for the additional tests we expect to add in future commits. First, we revise the "credentials with useHttpPath, with wrong password" test to correct a check which at present will always trivially succeed. After the test runs the "git push" command, it attempts to confirm that the command failed to push a Git LFS object by checking that the command did not output a final progress message stating that the object was uploaded. To verify the lack of this message, the test uses the -c option of the grep(1) command and then checks that the returned value is zero. When the original version of this test was first introduced in commit 69aee53 of PR git-lfs#561, the "grep" command searched for the pattern "(1 of 1 files)" in the output of the "git push" command, which was a valid check given the format of the Git LFS client's progress messages at the time. We then revised the format of the client's progress messages in commit db4e3b2 of PR git-lfs#2811, and updated many of the "grep" commands in our test suite to reflect the new format at the same time. Most of these alterations were made correctly, but in the "credentials with useHttpPath, with wrong password" test the "grep" command's pattern was changed to "Uploading LFS objects: 100% (1/1), 0 B". Even if the "git push" command in the test were to unexpectedly succeed, the trailing "0 B" portion of this pattern would still not match any lines in the command's output, since the size of the test Git LFS object to be pushed is actually one byte. Thus this check will always pass, regardless of the success or failure of the "git push" command. To address this problem we simply remove the trailing portion of the pattern, as we only need to confirm that the "git push" command's output does not match the pattern "100% (1/1)". We then add another, final check to the test which uses a different technique to verify that the "git push" command did not upload a Git LFS object. Since we have already computed the OID of the object, we can call our refute_server_object() test assertion function to confirm that the object does not exist on the Git LFS remote. This is a more direct and less fragile method of checking the behaviour of the "git push" command than parsing its output messages, and should help avoid any future regressions of the type we have fixed in this commit. As well as making these changes to the "credentials with useHttpPath, with wrong password" test, we also add several new checks which use "grep" commands to ascertain that HTTP authorization errors occurred during the "git push" command. Specifically, we check that the client output two error messages, one after its request to the Git LFS lock verification API endpoint received a 403 status code response, and another after the client's request to the Git LFS Batch API endpoint received a 403 status code response. While adding these extra checks we also adjust the whitespace in the test, and we drop an "echo" statement whose value is minimal, even for debugging purposes. Finally, we replicate portions of the design of this test to create a new "askpass: push with core.askPass and wrong password" test in our "t/t-askpass.sh" script. This test establishes the same conditions as those from the existing "askpass: push with core.askPass" test, except that we set the LFS_ASKPASS_PASSWORD environment variable to contain an invalid password. As in our revised "credentials with useHttpPath, with wrong password" test, our new "askpass: push with core.askPass and wrong password" test then verifies the "git push" command made with the invalid credentials did not upload a Git LFS object, and that two 403 status code responses were received, one from the lock verification API endpoint and one from the Batch API endpoint.
In prior commits in this PR we revised the DoWithAuth() method of the Client structure in our "lfsapi" package so that it should no longer enter an infinite loop if a Git LFS API implementation repeatedly responds to requests with a 401 Unauthorized status code. To help verify that these changes are effective, we now add a pair of tests to our shell test suite, one in the "t/t-askpass.sh" script and one in the "t/t-credentials.sh" script. We base the design of these tests on the "askpass: push with core.askPass and wrong password" test, which we added in a previous commit in this PR, and the "credentials with useHttpPath, with wrong password" test, which we updated in the same previous commit. To simulate a Git LFS API implementation which always responds with a 401 status code, we make a small enhancement to the skipIfBadAuth() function in our "lfstest-gitserver" test utility. This function validates the credentials provided by the client in its request headers and returns "false" if they match the expected values for the current test repository, which allows the request to proceed. Otherwise, if the credentials provided by the client are not the expected ones, then the skipIfBadAuth() function returns "true" after first setting the response's status code to 403. When the anonymous function which handles the default HTTP route receives the "true" return value from the skipIfBadAuth() function it then curtails all subsequent request handling. Because we require that the "lfstest-gitserver" utility simulate a server which incorrectly returns a 401 status code instead of a 403 status code when presented with invalid credentials, we alter the skipIfBadAuth() function so the status code it sets in the HTTP response depends on the name of the test repository in the request URL. If the repository name ends with the suffix "-401-unauth", the response to a request with invalid credentials will contain a 401 status code instead of a 403 status code. We then add "askpass: push with core.askPass and wrong password and 401 response" and "credentials with useHttpPath, with wrong password and 401 response" tests to the "t/t-askpass.sh" and "t/t-credentials.sh" scripts, respectively. These tests closely resemble the tests we added to the same scripts in a previous commit in this PR. However, they append the suffix "-401-unauth" to the names of their repositories, and then verify that an appropriate number of requests are recorded in the trace log of a "git push" command. The tests also check that the command outputs an error message stating that the limit on attempts to request authentication was reached. Further, these tests implicitly confirm that our changes to the "lfsapi" package prevent the DoWithAuth() method from entering an endless loop, since if that were to occur, the tests would never complete. We include comments in both tests which explain why we check that the Git LFS client gathered the necessary credentials from the local configuration only five times, rather than either three or six times, as we might expect given that the current value of the defaultMaxAuthAttempts variable in the "lfsapi" package is three. First, Git LFS client should make two sets of requests during the "git push" command, one set to the Locking API and another to the Batch API. Only the latter request must succeed in order for the client to upload Git LFS objects; the former request is allowed to fail. The "lfstest-gitserver" utility will return 401 status codes to both types of request, so a total of six requests are made, three to each API. Second, the client will cache the access mode required by the Git LFS remote endpoint after the first request is made to the Locking API, and this cached data applies to the Batch API requests as well. (Specifically, we retain the access mode in the urlAccess map element of an endpointGitFinder structure, which is stored in the Endpoints element of the Client structure of our "lfsapi" package.) The initial request to the Locking API is made without an Authorization header, and so no credentials are retrieved from the local configuration for this request. In our current version of the DoWithAuth() method we count this initial request toward the total number of requests, so only two requests with Authorization headers are made to the Locking API. All requests to the Batch API, however, are made with Authorization headers, because the initial request reads the cached access mode for the Git LFS endpoint (which comprises both APIs) and therefore retrieves credentials from the local configuration, as do both of the subsequent requests. This explains why our new tests check that the "git push" command retrieved credentials from the local configuration five times and not six. Note that we anticipate that we will further refine the behaviour of the DoWithAuth() method in a subsequent commit in this PR so that requests without an Authorization header are not counted towards the total number of attempts to request authentication.
In prior commits in this PR we revised the DoWithAuth() method of the Client structure in our "lfsapi" package so that it should no longer enter an infinite loop if a Git LFS API implementation repeatedly responds to requests with a 401 Unauthorized status code. To help verify that these changes are effective, we now add a test to our "t/t-credentials.sh" shell test script. This test is similar to the existing "credentials can authenticate with multistage auth" test, which was introduced along with our support for multi-stage authentication schemes in PR git-lfs#5803. Both the existing test and our new test make use of the example "Multistage" authentication scheme used in our shell test suite, which also first appeared in PR git-lfs#5803. Our new test simulates a Git credential helper that is misconfigured or broken in such a way that after a certain stage, it begins to always return the same credentials and state. To establish this intentional misconfiguration we simply create a credential record file for our "git-credential-lfstest" utility that contains two entries, one to define the transition from the initial stage to the "state1" stage, and another to define a self-referential transition from the "state1" stage to the same "state1" stage. Note that the "git-credential-lfstest" utility searches for a matching entry in its credential record files in a sequential fashion, and that entries with an empty fourth field (the "MATCH" field) are considered to match against all states. As a result, we must place the entry for the transition from the initial stage to the "state1" stage at the end of the credential record file, or else it would always match the current stage. To help clarify this detail of our test's setup steps we include a comment which explains why the entry for the first transition must appear last in the file. We also take the opportunity to add similar comments to the two existing tests in the "t/t-credential.sh" script that create credential record files for multi-stage authentication, as they likewise place the entry for their first transitions at the end of these files. (As well, we slightly adjust the whitespace formatting of the existing "credentials can authenticate with multistage auth" test so that it more closely resembles our new "credentials with multistage auth loop fails" test, as well as that of another multi-stage authentication test we expect to introduce in a subsequent commit in this PR.) After our new test establishes its misconfigured credential record file, it performs a "git push" command and then performs the same set of checks as are found in the "credentials with useHttpPath, with wrong password and 401 response" test that we added in a prior commit in this PR, plus one additional check specific to multi-stage authentication. As we described in that previous commit, we expect the Git LFS client to make two sets of requests during the "git push" command, one set to the Locking API and another to the Batch API. The initial request to the Locking API is made without an Authorization header, and so no credentials are retrieved from the local configuration for this request. Because a 401 status code is received after the initial request, the client then retrieves the first stage's credential and state from the "git-credential-lfstest" helper and makes a second request, which also receives a 401 status code response. The client then retrieves the next stage's credential and state from the helper, but due to our intentional misconfiguration, these are the same as for the first state. Our "lfstest-gitserver" test utility's skipIfBadAuth() function, which determines whether to accept or reject the credentials presented in a request's Authorization header, handles our example "Multistage" authentication scheme by checking whether the credentials match either the value "cred1" or the value "cred2". The "cred2" credential value will result in the function returning "false" and accepting the credential, while the "cred1" value causes the function to return "true" and the utility to respond with a 401 status code. Since the client's second request to the Locking API also contains the "cred1" credential, it again receives a 401 status code, which due to the revisions we made to the DoWithAuth() method in prior commits in this PR should cause the client to exit from the method and report that the Locking API is not supported by the Git LFS endpoint. We then expect the client to make an initial request to the Batch API, but unlike the first request to the Locking API, an Authorization header will be included because the access mode required by the Git LFS endpoint has been cached by the requests to the Locking API. When this request receives a 401 status code response, it should then be retried two more times before the DoWithAuth() method reports an error and exits. This expected sequence of events explains why our new test checks that the "git push" command retrieved credentials five times rather than six, and why the test checks for five instances of the "cred1" credential in an Authorization header and not six. Note that we anticipate that we will further refine the behaviour of the DoWithAuth() method in a subsequent commit in this PR so that requests without an Authorization header are not counted towards the total number of attempts to request authentication.
In prior commits in this PR we revised the DoWithAuth() method of the Client structure in our "lfsapi" package so that it should no longer enter an infinite loop if a Git LFS API implementation repeatedly responds to requests with a 401 Unauthorized status code. The method now returns an error with the message "too many authentication attempts" after a maximum of three requests receive 401 status code responses. This error message is not always reported to the user, however, since certain requests the client makes are considered to be non-critical. In particular, various Git LFS commands start by making requests to the Locking API of the remote Git LFS endpoint, but Git LFS services are not required to implement that API. If these requests fail, the client just reports that the remote does not support the Locking API and then proceeds to perform the given command. In previous commits in this PR we have added several new tests to our shell test suite which exercise the DoWithAuth() method under conditions in which it now returns after making the maximum allowed number of requests. These tests can directly verify that the "too many authentication attempts" error is returned by the DoWithAuth() method when all requests to the Batch API receive 401 status code responses, because the client reports the error's message to the user. However, our tests can at present only indirectly verify that the same error is returned by the DoWithAuth() method when all requests to the Locking API receive 401 status code responses, since the error's message is never reported to the user. Instead, our tests count the number of times certain trace log messages are output by the client and confirm that these totals match what is expected if both requests to the Locking API and requests to the Batch API repeatedly receive 401 status code responses. To allow our tests to more easily check that requests to both APIs are treated equally under these conditions, we revise the DoWithAuth() method so that when the GIT_TRACE environment variable is set to a value considered "true", the method now outputs a "too many authentication attempts" trace log message just before it returns an error with that message. We then update the relevant tests in our shell test suite to check for two appearances of this message in the trace logs from the "git push" commands they execute, since each such command should make requests to both the Locking API and the Batch API, which in both cases should result in the same error condition and message.
In prior commits in this PR we revised the DoWithAuth() method of the Client structure in our "lfsapi" package so that it should no longer enter an infinite loop if a Git LFS API implementation repeatedly responds to requests with a 401 Unauthorized status code. We have also added tests which demonstrate that our changes to the DoWithAuth() method are effective, including when a server only returns 401 status codes. However, a misbehaving server is not the only condition which might cause the DoWithAuth() method to perform the maximum allowed number of requests without successfully authenticating or being rejected. In PR git-lfs#5803 we introduced support for multi-stage authentication, in which a Git credential helper returns a series of credentials and states, and Git LFS client makes a request for each state using the supplied credentials. The final state is expected to complete the authentication sequence and either be accepted or rejected. We have already added tests in previous commits in this PR which simulate a broken or misconfigured credential helper that never returns the final state in a multi-stage authentication sequence, and instead keeps looping through the intermediate states. Both the new "credentials with multistage auth loop fails" test in our "t/t-credentials.sh" shell script and the new TestDoWithAuthMultistageRetryLimitExceeded() function in our Go test suite simulate a credential helper which never advances past an intermediate authentication state. These tests demonstrate that our changes to the DoWithAuth() method ensure it will return after making the maximum allowed number of requests, even if a multi-stage authentication sequence is not complete. This was the intended behaviour of the DoWithAuth() method and the other Client structure methods at the time when PR git-lfs#5803 was developed. In commit 6783078 of that PR we introduced a "count" variable to the DoWithAuth() method, which the method then passed to the doWithAuth() method as a pointer. The doWithAuth() method incremented the "count" variable after making a multi-stage authentication request, unless the maximum number of permitted requests had been made, in which case it was supposed to report that the request was rejected. The stated intent of this design was "to avoid a credential helper getting stuck in an infinite loop if it keeps handing back the same credentials", per the description in commit 6783078. Unfortunately, this implementation did not work as expected because the DoWithAuth() method called itself recusively if the current request received a 401 status code response, and because each invocation of the method would instantiate a new "count" variable with a zero value. Thus every time the doWithAuth() method was called it could only ever increase the variable's value to one, and so it would never find that the maximum number of allowed requests had been made. Technically, as we noted in a comment on this PR, a rare combination of HTTP redirections and 401 status code responses could result in the "count" variable being incremented past one. With a sufficient number of redirections the "count" variable could reach the maximum request limit and the doWithAuth() method would then inform the credential helper that the current credentials should be rejected. However, because the doWithAuth() method still returned an error indicating that the response to the most-recent request contained a 401 status code, the DoWithAuth() method would call itself recursively and the Git LFS client would enter an infinite loop: git-lfs#6018 (comment) The revisions we have made to the DoWithAuth() and doWithAuth() methods in this PR should resolve the problems described above and ensure that misconfigured multi-stage authentication sequences will not cause the Git LFS client to repeat the same requests without ceasing. Despite these changes, the client may still exhibit unintended and unexpected behaviour if a credential helper is configured with a valid multi-stage authentication sequence, but the number of stages in that sequence exceeds the maximum number of requests we allow during an authentication sequence. In such a case, if the client makes several distinct types of requests to the same Git LFS endpoint, then the client may improperly continue the incomplete authentication sequence across the different types of requests. In particular, various Git LFS commands start by making requests to the Locking API of the remote Git LFS endpoint, but Git LFS services are not required to implement that API. If these requests fail, the client just reports that the remote does not support the Locking API and then proceeds to perform the given command, which may require one or more requests to the Batch API of the endpoint. The client may therefore proceed through the initial stages of a long multi-stage authentication sequence, making requests to the Locking API of an endpoint, but reach the maximum number of requests we allow before being either accepted or rejected because the sequence is not yet complete. However, if the client then begins to make requests to the Batch API of the endpoint, it will continue where it left off in the sequence, and may be able to successfully authenticate. While that may appear to be beneficial, this behaviour was not our intention when we added multi-stage authentication support in PR git-lfs#5803. If we permit the client to continue multi-stage authentication across different types of requests, the client may behave inconsistently for users who have identical configurations except for the number of stages in their authentication sequences. We therefore update the DoWithAuth() method so that when it returns after making the maximum allowed number of requests without successfully authenticating or being rejected, it first resets the current state of any multi-stage authentication sequence so that any subsequent requests will restart the sequence from its first stage. To reset the current state we simply pass a "nil" parameter to the SetStateFields() method of the CredentialHelperContext structure from our "creds" package, which sets the structure's "state" field back to its post-initialization value of "nil". To confirm that our changes are effective we then add a new "credentials with multistage auth above limit fails and resets" test to our "t/t-credentials.sh" shell test script. This test establishes a four-stage authentication sequence by creating a credential record file for our "git-credential-lfstest" utility that contains four entries, each of which defines a transition from one stage of four to the next. For this four-stage authentication sequence to be fully supported by our "lfstest-gitserver" test utility, we have to modify its skipIfBadAuth() function, which determines whether to accept or reject the credentials presented in a request's Authorization header. Since commit a3429c3 of PR git-lfs#5803, the skipIfBadAuth() function handles our example "Multistage" authentication scheme by checking whether the credentials match either the value "cred1" or the value "cred2". The "cred2" credential value will result in the function returning "false" and accepting the credential, while the "cred1" value causes the function to return "true" and the utility to respond with a 401 status code. Several of the existing tests in our "t/t-credentials.sh" script depend on the "lfstest-gitserver" utility accepting the "cred2" value in an Authorization header to complete a multi-stage authentication sequence. As our new test requires the utility to not accept the first three of four credentials in a sequence, we alter the skipIfBadAuth() function so that it expects the values in an Authorization header with a "Multistage" scheme to match the pattern "cred<m>of<n>". If a "Multistage" scheme credential matches this pattern and the "<m>" and "<n>" values are equal, the skipIfBadAuth() function will return "false", indicating that the credentials are valid; otherwise, it will return "true" and cause the "lfstest-gitserver" utility to respond to the request with a 401 status code. We then update our existing tests in our "t/t-credentials.sh" script so that they create credential record files whose entries follow the revised pattern, e.g., "cred1of2" and "cred2of2". Our new "credentials with multistage auth above limit fails and resets" test, however, creates entries in its credential record file of the form "cred1of4", "cred2of4", etc. Because the defaultMaxAuthAttempts variable in the "lfsapi" package is assigned a value of three, we allow at most three requests during an authentication sequence. By defining a four-stage authentication sequence for our new test, we guarantee that each distinct type of request made by a Git LFS command should reach the three-request limit without successfully authenticating. Like the other tests we have introduced in previous commits in this PR, our new test establishes its credential record file and then runs a "git push" command. As we described in those commits, we expect the Git LFS client to make two sets of requests during the "git push" command, one set to the Locking API and another to the Batch API. Because the initial request to the Locking API is made without an Authorization header, we expect that no credentials will be retrieved from the local configuration for this request. The client should then make two more requests to the Locking API, both with credentials, before reaching the per-sequence limit of three requests. The client should then proceed to make three requests to the Batch API, all with credentials, before again reaching the per-sequence limit. This expected sequence of events explains why our new test checks that the "git push" command retrieved credentials five times rather than six, and why the test checks for two instances each of the "cred1of4" and "cred2of4" credentials in Authorization headers, but for only one instance of the "cred3of4" credential and none of the "cred4of4" credential. Note that we anticipate that we will further refine the behaviour of the DoWithAuth() method in a subsequent commit in this PR so that requests without an Authorization header are not counted towards the total number of attempts to request authentication. Further, note that our new test would fail without our change to the DoWithAuth() method in this commit, because the Git LFS client would successfully authenticate on its second request to the Batch API. The client would send the "cred3of4" credential on its first request to that API, and then send the "cred4of4" credential on its next request.
In prior commits in this PR we revised the DoWithAuth() method of the Client structure in our "lfsapi" package so that it should no longer enter an infinite loop if a Git LFS API implementation repeatedly responds to requests with a 401 Unauthorized status code. The method now returns an error with the message "too many authentication attempts" after a maximum of three requests receive 401 status code responses. Because the defaultMaxAuthAttempts variable in the "lfsapi" package is assigned a value of three, we currently allow at most three requests during an authentication sequence. As we have described in several of our previous commits, the Git LFS client may make multiple sets of requests during a single command, such as requests to the Locking API and then to the Batch API of a given endpoint. When the Git LFS client makes the first request to an endpoint's Locking API, no access mode for the endpoint has yet been determined (unless the user has explicitly configured an "lfs.<url>.access" option). The client therefore sends a request without an Authorization header, in case the endpoint does not require any credentials at all. If the endpoint does require credentials, it will respond with a 401 Unauthorized status code. The client will now cache the access mode required for for the endpoint, and this cached data applies to the Batch API requests as well. (Specifically, we retain the access mode in the urlAccess map element of an endpointGitFinder structure, which is stored in the Endpoints element of the Client structure of our "lfsapi" package.) In our current version of the DoWithAuth() method we count the initial request made without an Authorization header toward the total number of attempts to request authentication. This can result in inconsistent behaviour if the endpoint requires a multi-stage authentication sequence with the same number of stages as the maximum number of authentication requests that we allow. For example, if a three-stage authentication sequence is configured, then the client will reach the request limit without completing the sequence when making requests to the Locking API, but will complete the sequence when making requests to the Batch API. To avoid this inconsistency between types of requests, we revise the DoWithAuth() method so that if no access mode has yet been determined for the endpoint, the method increases the maximum number of requests it will allow by one. We then update some of our Go and shell tests to reflect this change by incrementing the numbers of requests the tests expect to have been performed by the client when it reaches the maximum number of allowed requests following an initial request without an Authorization header. We also adjust the comments in these shell tests to explain that the initial request without an Authorization header is not counted towards the maximum number of allowed requests for authentication, and concomitantly, we increment the number of times the tests expect the Git LFS client to retrieve credentials in order to make its requests.
f7c88ba to
c323f76
Compare
|
I've added a few commits to the For my own future reference, in the "Comprehensive Mitigation" section of my comment above I made a list of a few potential infinite loop conditions. This PR now specifically addresses two of them:
That leaves two others to be tackled in separate, future PRs:
|
In case of wrong credentials, the method DoWithAuth is executed recursively without termination. This is not such a big problem when users interact with LFS via the command line, but it may cause the process to hang when the command is called automatically (e.g. by CI tool) with credential.helper and core.askpass arguments