ESQL: Fix flaky EmployeeFlightServerTests shutdown race #143657
Merged
costin merged 2 commits into elastic:main on Mar 5, 2026
Pinging @elastic/es-analytical-engine (Team:Analytics)
Use FlightServer.close() for robust gRPC shutdown and add a brief sleep to let Netty event loop callbacks drain before the test ends. Closes elastic#143636
e96c3fa to 82b3bfc
spinscale pushed a commit to spinscale/elasticsearch that referenced this pull request on Mar 6, 2026
Fixes a flaky test failure in `EmployeeFlightServerTests.testMultiEndpointReturnsCorrectEndpointCount` caused by a race condition during gRPC/Netty server shutdown. When the test's `FlightClient` closes, gRPC's `NettyServerHandler` processes the disconnect asynchronously on Netty's event loop. A `ChannelFutureListener` callback (`closeStreamWhenDone`) can throw if the HTTP/2 stream was already closed by the time it fires. Netty's `DefaultPromise` logs this as a WARN, which `ESTestCase.checkStaticState()` captures and treats as a test failure. The fix replaces the manual `shutdown()` + `awaitTermination()` with `FlightServer.close()` which provides a more robust shutdown sequence (graceful shutdown → `awaitTermination` → `shutdownNow` fallback), and adds a brief sleep to let any remaining Netty event loop callbacks drain before the test framework checks for logged warnings. Closes elastic#143636
sidosera pushed a commit to sidosera/elasticsearch that referenced this pull request on Mar 6, 2026
dnhatn pushed a commit that referenced this pull request on Mar 8, 2026
Fixes a flaky test failure in `EmployeeFlightServerTests.testMultiEndpointReturnsCorrectEndpointCount` caused by a race condition during gRPC/Netty server shutdown.

When the test's `FlightClient` closes, gRPC's `NettyServerHandler` processes the disconnect asynchronously on Netty's event loop. A `ChannelFutureListener` callback (`closeStreamWhenDone`) can throw if the HTTP/2 stream was already closed by the time it fires. Netty's `DefaultPromise` logs this as a WARN, which `ESTestCase.checkStaticState()` captures and treats as a test failure.

The fix replaces the manual `shutdown()` + `awaitTermination()` with `FlightServer.close()` which provides a more robust shutdown sequence (graceful shutdown → `awaitTermination` → `shutdownNow` fallback), and adds a brief sleep to let any remaining Netty event loop callbacks drain before the test framework checks for logged warnings.

Closes #143636
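
For readers unfamiliar with the shutdown sequence named above (graceful shutdown, then `awaitTermination`, then a `shutdownNow` fallback), here is a minimal sketch of that shape. It is not the code from this PR and not Arrow Flight's implementation: a plain `java.util.concurrent.ExecutorService` stands in for the server, and the class name `ShutdownSketch`, the method `closeRobustly`, and the grace period values are hypothetical, chosen only for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Illustrative stand-in only; an ExecutorService plays the role of the server. */
final class ShutdownSketch {

    /**
     * Graceful shutdown, bounded wait, then forced shutdown as a fallback.
     * Returns true if the executor reached the terminated state.
     */
    static boolean closeRobustly(ExecutorService server, long graceMillis) {
        // Step 1: stop accepting new work, but let already-queued work finish.
        server.shutdown();
        try {
            // Step 2: wait a bounded time for in-flight work to complete.
            if (server.awaitTermination(graceMillis, TimeUnit.MILLISECONDS)) {
                return true;
            }
            // Step 3: fallback. Interrupt running tasks and discard queued ones.
            server.shutdownNow();
            return server.awaitTermination(graceMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // The caller was interrupted while waiting: force shutdown and
            // restore the interrupt flag so the caller can observe it.
            server.shutdownNow();
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService server = Executors.newSingleThreadExecutor();
        server.execute(() -> System.out.println("handling last request"));
        System.out.println("terminated: " + closeRobustly(server, 1000));
    }
}
```

The point of the fallback step is that a manual `shutdown()` plus `awaitTermination()` simply gives up when the wait times out, whereas this shape escalates to a forced shutdown and waits again.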