[TINKERPOP-3061] fix: failing authentication when multiple initially requests are executed concurrently by tien · Pull Request #2525 · apache/tinkerpop

tien · 2024-03-16T05:59:29Z

This solution tries to resolve the problem of concurrent initial unauthenticated requests described in TINKERPOP-3063, TINKERPOP-2132 & TINKERPOP-3061 by batching them for later processing while the authentication handshake is in progress.
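The batching idea can be sketched roughly as below. This is a minimal illustration, not the code in this PR: `DeferringAuthGate`, its `State` enum, and the use of plain strings as request ids are all hypothetical stand-ins, and the assumption that the first unauthenticated request is itself queued is made only for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for per-channel handler state; it does not mirror
// the actual SaslAuthenticationHandler changed in this PR.
class DeferringAuthGate {
    enum State { UNAUTHENTICATED, HANDSHAKE_IN_PROGRESS, AUTHENTICATED }

    private State state = State.UNAUTHENTICATED;
    private final Queue<String> deferred = new ArrayDeque<>();
    private final List<String> processed = new ArrayList<>();

    // Called for every non-authentication request arriving on the channel.
    void onRequest(final String requestId) {
        switch (state) {
            case UNAUTHENTICATED:
                // Assumption: the first request starts the handshake and waits too.
                state = State.HANDSHAKE_IN_PROGRESS;
                deferred.add(requestId);
                break;
            case HANDSHAKE_IN_PROGRESS:
                // Handshake not finished yet: hold the request instead of rejecting it.
                deferred.add(requestId);
                break;
            case AUTHENTICATED:
                processed.add(requestId);
                break;
        }
    }

    // Called once the handshake succeeds: replay held requests in arrival order.
    void onAuthenticationSuccess() {
        state = State.AUTHENTICATED;
        while (!deferred.isEmpty()) {
            processed.add(deferred.poll());
        }
    }

    List<String> processed() { return processed; }

    int pending() { return deferred.size(); }
}
```

Requests that arrive while the handshake is running are replayed in order once it completes, so concurrent initial requests no longer race against the authentication challenge.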

codecov-commenter · 2024-03-16T10:48:40Z

Codecov Report

❌ Patch coverage is 35.55556% with 58 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.46%. Comparing base (9b46b67) to head (321c327).
⚠️ Report is 453 commits behind head on 3.7-dev.

Files with missing lines	Patch %	Lines
...mlin/server/handler/SaslAuthenticationHandler.java	36.90%	42 Missing and 11 partials ⚠️
...inkerpop/gremlin/driver/simple/AbstractClient.java	0.00%	5 Missing ⚠️

Additional details and impacted files

@@              Coverage Diff              @@
##             3.7-dev    #2525      +/-   ##
=============================================
+ Coverage      76.14%   76.46%   +0.32%     
- Complexity     13152    13173      +21     
=============================================
  Files           1084     1059      -25     
  Lines          65160    61305    -3855     
  Branches        7285     7303      +18     
=============================================
- Hits           49616    46877    -2739     
+ Misses         12839    11907     -932     
+ Partials        2705     2521     -184


vkagamlyk · 2024-03-19T02:40:32Z

I think this is a good solution for the concurrent auth issue.

It would be good to have a test for the JS driver to be 100% sure.
Also, the changelog entry is missing.

tien · 2024-03-20T22:06:24Z

Have added 1 changelog entry & a test to the JS driver 🙏

vkagamlyk · 2024-03-20T22:43:57Z

Have added 1 changelog entry & a test to the JS driver 🙏

Thank you @tien!

VOTE +1

Cole-Greer · 2024-03-25T19:54:23Z

Thanks @tien, looks great. VOTE +1

kenhuuu · 2024-03-25T21:52:56Z

Could you also add a test where the authentication fails with multiple pending requests and you check that all requests get the proper exception in that case?

Cole-Greer · 2024-03-26T00:27:35Z

I agree with @kenhuuu that an additional test is warranted here to ensure that the server will always send a response to every request. We are now entering code freeze week in preparation for the 3.6.7 and 3.7.2 releases. I believe it is fair to grant an exception for a few days to give some time for such a test to be implemented and to ensure this PR can be included in the release.

tien · 2024-03-26T14:34:08Z

I've added one test for the unhappy path.


FlorianHockmann

Just reviewed the changes. I like the general approach taken here! It means that we don't have to implement workarounds in all GLVs for this like I initially did in #2522. Thanks a lot for tackling this, @tien!

I added some inline comments, but they are all basically only about code style issues.
I also tested this fix with the test that I wrote for the .NET driver in #2522 and it passed! :-)

FlorianHockmann · 2024-03-27T10:17:21Z


-import java.util.ArrayList;
-import java.util.List;
+import java.util.*;


Please revert this change. Our dev docs explicitly mention that TinkerPop doesn't use wildcard imports.

FlorianHockmann · 2024-03-27T10:45:00Z


    static class CallbackResponseHandler extends SimpleChannelInboundHandler<ResponseMessage> {
-        public Consumer<ResponseMessage> callback;
+        public Map<UUID, Consumer<ResponseMessage>> callback = new HashMap<>();


(nitpick) I think we should rename this now that it's no longer just a callback. Maybe something like callbackByRequestId?
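A rough sketch of what a map keyed by request id looks like in use, with the suggested name. `CallbackRegistry` is a hypothetical class, responses are modelled as plain strings instead of `ResponseMessage`, and a `ConcurrentHashMap` is swapped in for the `HashMap` in the diff. Removing the callback on the first response is an assumption made for brevity; a request that yields several partial responses would need to keep it registered longer.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical illustration of routing responses to per-request callbacks.
class CallbackRegistry {
    private final Map<UUID, Consumer<String>> callbackByRequestId = new ConcurrentHashMap<>();

    void register(final UUID requestId, final Consumer<String> callback) {
        callbackByRequestId.put(requestId, callback);
    }

    // Returns true if a callback was registered for this request id and was invoked.
    boolean dispatch(final UUID requestId, final String response) {
        // Assumption: one response per request, so the callback is removed on dispatch.
        final Consumer<String> callback = callbackByRequestId.remove(requestId);
        if (callback == null) {
            return false;
        }
        callback.accept(response);
        return true;
    }
}
```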

FlorianHockmann · 2024-03-27T11:01:30Z

        }
+
+        // If authentication negotiation is pending, store subsequent non-authentication requests for later processing
+        if (negotiator.get() != null && !requestMessage.getOp().equals(Tokens.OPS_AUTHENTICATION)) {


(nitpick) Isn't the negotiator.get() != null check redundant, since the if (negotiator.get() == null) block right before it returns?

FlorianHockmann · 2024-03-27T11:06:13Z

+            if (deferredDuration.compareTo(MAX_REQUEST_DEFERRABLE_DURATION) > 0) {
+                respondWithError(
+                    requestMessage,
+                    builder -> builder.statusMessage("Too many unauthenticated requests").code(ResponseStatusCode.TOO_MANY_REQUESTS),


Is the problem here really that there are too many unauthenticated requests? Isn't the problem that authentication took longer than MAX_REQUEST_DEFERRABLE_DURATION?
I think as a user it might be good to know whether I simply submitted too many requests or whether authentication is just too slow.

I'm stumped on this too. Do you have a recommendation on what status code & message would make sense here?

Wouldn't UNAUTHORIZED be correct in this case? The description starts with:

The server could not authenticate the request

which makes sense in my opinion if the max duration was passed for the authentication to happen.

And the status message can then contain more detailed information about the duration that was passed, maybe something like: "authentication did not finish in the allowed duration (" + MAX_REQUEST_DEFERRABLE_DURATION + " s)"?
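The suggested message could be assembled along these lines. This is only a sketch: `AuthTimeoutMessage` is a hypothetical helper, and the 5 second value for `MAX_REQUEST_DEFERRABLE_DURATION` is an assumption for illustration, since the thread does not state the actual value.

```java
import java.time.Duration;

// Hypothetical helper; the constant's value is assumed, not taken from the PR.
class AuthTimeoutMessage {
    static final Duration MAX_REQUEST_DEFERRABLE_DURATION = Duration.ofSeconds(5);

    // Same comparison as in the diff: only strictly longer waits count as exceeded.
    static boolean exceeded(final Duration deferredDuration) {
        return deferredDuration.compareTo(MAX_REQUEST_DEFERRABLE_DURATION) > 0;
    }

    // Tells the user that authentication was too slow, not that they sent too many requests.
    static String message() {
        return "authentication did not finish in the allowed duration ("
                + MAX_REQUEST_DEFERRABLE_DURATION.getSeconds() + " s)";
    }
}
```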

FlorianHockmann · 2024-03-27T11:20:37Z

@@ -218,9 +266,19 @@ public void shouldAuthenticateAndWorkWithVariablesOverGraphSONV1Serialization()

    private static void assertConnection(final Cluster cluster, final Client client) throws InterruptedException, ExecutionException {


This method is used in 4 different tests, such as shouldAuthenticateWithPlainText. These 4 tests will now fail if submitting multiple requests initially in parallel isn't working.
I think it would be good if we could keep these tests as simple as possible so they don't include parallelization of initial requests. A test like shouldAuthenticateWithPlainText should really only fail if authenticating with plain text isn't working, not if submitting multiple requests in parallel isn't working.

Long story short, I think it would be good if you could revert the changes to this method and instead write a new test specifically for the parallelization issue.

FlorianHockmann · 2024-03-27T11:44:50Z

+            responses.addAll(future4.join());
+
+            for (ResponseMessage response : responses) {
+                if (response.getStatus().getCode() != ResponseStatusCode.AUTHENTICATE) {


I think it's a bit hard to understand what the expected outcome is here / what this is really asserting since it's just iterating over all received responses and then either accepting them if their status code is AUTHENTICATE or if it's TOO_MANY_REQUESTS.
Can't we explicitly assert which request should get which response status code? I guess future4 should get TOO_MANY_REQUESTS and the other 3 should get AUTHENTICATE (?).

Not that important, but I think it would also improve readability of this test if these weren't named future1 - future4, but maybe something like futureOfRequestWithinAuthDuration vs futureOfRequestSubmittedTooLate or something like that.
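One way the explicit assertions and descriptive names might look. This is a sketch under assumptions: `ExplicitStatusAssertions` and its local `Status` enum are hypothetical stand-ins for the test class and `ResponseStatusCode`, and the expected codes follow the reviewer's guess above, not a confirmed outcome.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch; Status stands in for the real response status codes.
class ExplicitStatusAssertions {
    enum Status { AUTHENTICATE, TOO_MANY_REQUESTS }

    // Fails with a message naming the request, so a failure points at one expectation.
    static void assertStatus(final String name, final CompletableFuture<Status> future, final Status expected) {
        final Status actual = future.join();
        if (actual != expected) {
            throw new AssertionError(name + ": expected " + expected + " but got " + actual);
        }
    }

    // Each future is checked against its own expected status instead of looping over all responses.
    static void check(final CompletableFuture<Status> futureOfRequestWithinAuthDuration,
                      final CompletableFuture<Status> futureOfRequestSubmittedTooLate) {
        assertStatus("request within auth duration", futureOfRequestWithinAuthDuration, Status.AUTHENTICATE);
        assertStatus("request submitted too late", futureOfRequestSubmittedTooLate, Status.TOO_MANY_REQUESTS);
    }
}
```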

Cole-Greer · 2024-03-28T01:13:26Z

    }

+    @Test
+    public void shouldFailAuthenticateWithUnAuthenticatedRequestAfterMaxDeferrableDuration() throws Exception {


Hi Tien, I want to be absolutely certain that we aren’t going to lose any of these deferred requests in the case of errors. If the server fails to send a response the drivers will just be left hanging indefinitely. If I’m understanding this test right, the first 3 requests are all expected to succeed, and then after a delay the final request is submitted and fails. Would it also be possible to set up a test such that there are multiple pending requests in the server’s deferred requests queue at the time that auth fails, and then we can verify that the correct error message gets sent to all currently pending requests?
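The property being asked for, that every deferred request receives an error when authentication fails, can be sketched as follows. `PendingRequestFailer` is hypothetical, and request ids and error responses are modelled as plain strings; it does not reflect how the server in this PR stores or answers deferred requests.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical sketch: drain the deferred queue and answer every entry with the error.
class PendingRequestFailer {
    static Map<String, String> failAll(final Queue<String> deferredRequestIds, final String errorMessage) {
        final Map<String, String> responseByRequestId = new LinkedHashMap<>();
        String requestId;
        // poll() returns null once the queue is empty, so no request is left behind.
        while ((requestId = deferredRequestIds.poll()) != null) {
            responseByRequestId.put(requestId, errorMessage);
        }
        return responseByRequestId;
    }
}
```

A test for the unhappy path would then check both that the queue is empty afterwards and that the number of error responses equals the number of requests that were pending.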

tien · 2024-04-04T01:41:31Z

Sorry, I'm away this week and don't have access to my work laptop. Will take a look at all the pending comments & resolve them next week 🙏

xiazcy · 2024-04-08T22:40:22Z

Sorry, I'm away this week and don't have access to my work laptop. Will take a look at all the pending comments & resolve them next week 🙏

No worries, thank you for all the contributions!

Just a quick note: I'm not sure whether you have had a chance to start looking at the comments, but since we'd like to release this with 3.7.2 this week, we will likely be cherry-picking your changes into another PR for the release branch today. If we do proceed with that, we'll be closing this PR, and you shouldn't need to do any further work.

Now there might still be functionality improvements we miss, so please feel free to add additional changes once the branches re-open.

xiazcy · 2024-04-09T02:02:16Z

Closing this PR as all changes are merged via 22db8cf for the release. Please feel free to re-open if you find additional improvements and/or updates needed. Thanks!

tien · 2024-04-09T02:07:33Z

@xiazcy nah I just got back today, thanks for making this available in the next release 💪

tien force-pushed the fix/failing-initial-authentication branch from fa43e9a to b65004e Compare March 16, 2024 10:26

tien force-pushed the fix/failing-initial-authentication branch from b65004e to c003169 Compare March 16, 2024 11:00

tien force-pushed the fix/failing-initial-authentication branch from c003169 to 053d329 Compare March 20, 2024 22:05

Cole-Greer mentioned this pull request Mar 26, 2024

TINKERPOP-3063 Fix bug in Gremlin.Net authentication for parallel requests #2522

Closed

tien force-pushed the fix/failing-initial-authentication branch from 053d329 to b66bbe9 Compare March 26, 2024 14:33

fix: failing authentication when multiple initially requests are exec…

321c327

…uted concurrently

tien force-pushed the fix/failing-initial-authentication branch from b66bbe9 to 321c327 Compare March 27, 2024 03:45

FlorianHockmann reviewed Mar 27, 2024


Cole-Greer reviewed Mar 28, 2024


FlorianHockmann mentioned this pull request Mar 28, 2024

When use gremlin session client will cause Failed to authenticate JanusGraph/janusgraph#4206

Open

kenhuuu mentioned this pull request Apr 9, 2024

Parallel Authentication Fix #2551

Merged

xiazcy closed this Apr 9, 2024



7 participants
