[TINKERPOP-3061] fix: failing authentication when multiple initially requests are executed concurrently#2525
Conversation
fa43e9a to
b65004e
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## 3.7-dev #2525 +/- ##
=============================================
+ Coverage 76.14% 76.46% +0.32%
- Complexity 13152 13173 +21
=============================================
Files 1084 1059 -25
Lines 65160 61305 -3855
Branches 7285 7303 +18
=============================================
- Hits 49616 46877 -2739
+ Misses 12839 11907 -932
+ Partials 2705 2521 -184 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
b65004e to
c003169
Compare
|
I think this is a good solution for concurrent auth issue. Good to have test for js driver to be 100% sure. |
c003169 to
053d329
Compare
|
Have added 1 changelog entry & a test to the JS driver 🙏 |
Thank you @tien! VOTE +1 |
|
Thanks @tien, looks great. VOTE +1 |
|
Could you also add a test where the authentication fails with multiple pending requests and you check that all requests get the proper exception in that case? |
|
I agree with @kenhuuu that an additional test is warranted here to ensure that the server will always send a response to every request. We are now entering code freeze week in preparation for the 3.6.7 and 3.7.2 releases. I believe it is fair to grant an exception for a few days to give some time for such a test to be implemented and to ensure this PR can be included in the release. |
053d329 to
b66bbe9
Compare
|
I've added one test for the unhappy path. |
…uted concurrently
b66bbe9 to
321c327
Compare
FlorianHockmann
left a comment
There was a problem hiding this comment.
Just reviewed the changes. I like the general approach taken here! It means that we don't have to implement workarounds in all GLVs for this like I initially did in #2522. Thanks a lot for tackling this, @tien!
I added some inline comments, but they are all basically only about code style issues.
I also tested this fix with the test that I wrote for the .NET driver in #2522 and it passed! :-)
|
|
||
| import java.util.ArrayList; | ||
| import java.util.List; | ||
| import java.util.*; |
There was a problem hiding this comment.
Please revert this change. Our dev docs explicitly mention that TinkerPop doesn't use wildcard imports.
|
|
||
| static class CallbackResponseHandler extends SimpleChannelInboundHandler<ResponseMessage> { | ||
| public Consumer<ResponseMessage> callback; | ||
| public Map<UUID, Consumer<ResponseMessage>> callback = new HashMap<>(); |
There was a problem hiding this comment.
(nitpick) I think we should rename this now that it's no longer just a callback. Maybe something like callbackByRequestId?
| } | ||
|
|
||
| // If authentication negotiation is pending, store subsequent non-authentication requests for later processing | ||
| if (negotiator.get() != null && !requestMessage.getOp().equals(Tokens.OPS_AUTHENTICATION)) { |
There was a problem hiding this comment.
(nitpick) Isn't negotiator.get() != null duplicate code since we have an if (negotiator.get() == null) right before that returns ?
| if (deferredDuration.compareTo(MAX_REQUEST_DEFERRABLE_DURATION) > 0) { | ||
| respondWithError( | ||
| requestMessage, | ||
| builder -> builder.statusMessage("Too many unauthenticated requests").code(ResponseStatusCode.TOO_MANY_REQUESTS), |
There was a problem hiding this comment.
Is the problem here really that there are too many unauthenticated requests? Isn't the problem that authentication took longer than MAX_REQUEST_DEFERRABLE_DURATION?
I think as a user it might be good to know whether I simply submitted too many requests or whether authentication is just too slow.
There was a problem hiding this comment.
I'm stumped on this 2, do you have a recommendation on what status code & message would make sense here?
There was a problem hiding this comment.
Wouldn't UNAUTHORIZED be correct in this case? The description starts with:
The server could not authenticate the request
which makes sense in my opinion if the max duration was passed for the authentication to happen.
And the status message can then contain more detailed information about the duration that was passed, maybe something like: "authentication did not finish in the allowed duration (" + MAX_REQUEST_DEFERRABLE_DURATION + " s)"?
| @@ -218,9 +266,19 @@ public void shouldAuthenticateAndWorkWithVariablesOverGraphSONV1Serialization() | |||
|
|
|||
| private static void assertConnection(final Cluster cluster, final Client client) throws InterruptedException, ExecutionException { | |||
There was a problem hiding this comment.
This method is used in 4 different tests, such as shouldAuthenticateWithPlainText. These 4 tests will now fail if submitting multiple requests initially in parallel isn't working.
I think it would be good if we could keep these tests as simple as possible so they don't include parallelization of initial requests. A test like shouldAuthenticateWithPlainText should really only fail if authenticate with plain text isn't working, not if submitting multiple requests in parallel isn't working.
Long story short, I think it would be good if you could revert the changes to this method and instead write a new test specifically for the parallelization issue.
| responses.addAll(future4.join()); | ||
|
|
||
| for (ResponseMessage response : responses) { | ||
| if (response.getStatus().getCode() != ResponseStatusCode.AUTHENTICATE) { |
There was a problem hiding this comment.
I think it's a bit hard to understand what the expected outcome is here / what this is really asserting since it's just iterating over all received responses and then either accepting them if their status code is AUTHENTICATE or if it's TOO_MANY_REQUESTS.
Can't we explicitly assert which request should get which response status code? I guess future4 should get TOO_MANY_REQUESTS and the other 3 should get AUTHENTICATE (?).
Not that important, but I think it would also improve readability of this test if these weren't named future1 - future4, but maybe something like futureOfRequestWithinAuthDuration vs futureOfRequestSubmittedTooLate or something like that.
| } | ||
|
|
||
| @Test | ||
| public void shouldFailAuthenticateWithUnAuthenticatedRequestAfterMaxDeferrableDuration() throws Exception { |
There was a problem hiding this comment.
Hi Tien, I want to be absolutely certain that we aren’t going to lose any of these deferred requests in the case of errors. If the server fails to send a response the drivers will just be left hanging indefinitely. If I’m understanding this test right, the first 3 requests are all expected to succeed, and then after a delay the final request is submitted and fails. Would it also be possible to setup a test such that there are multiple pending requests in the server’s deferred requests queue at the time that auth fails, and then we can verify that the correct error message gets sent to all currently pending requests?
|
Sorry, I'm away this week and don't have access to my work laptop. Will take a look at all the pending comments & resolve them next week 🙏 |
No worries, thank you for all the contributions! Just a quick note. Not sure if you have gotten a chance to start looking at the comments, as we'd like to release this with 3.7.2 this week, we will likely be cherry-picking your changes into another PR for the release branch today. If we do proceed with that we'll be closing this PR, and you shouldn't need to do any further work. Now there might still be functionality improvements we miss, so please feel free to add additional changes once the branches re-open. |
|
Closing this PR as all changes are merged via 22db8cf for the release. Please feel free to re-open if you find additional improvements and/or updates needed. Thanks! |
|
@xiazcy nah I just got back today, thanks for making this available in the next release 💪 |
This solution try to resolve the concurrent initial unauthenticated requests problem described in TINKERPOP-3063, TINKERPOP-2132 & TINKERPOP-3061 by batching them for later processing when authentication handshake is in progress.