Fix flaky test RemoteClusterStateServiceTests.testReadClusterStateInParallel_ExceptionDuringRead #19437

Merged
cwperks merged 1 commit into opensearch-project:main from cwperks:fix-cluster-state-test on Sep 26, 2025

Conversation

@cwperks
Member

@cwperks cwperks commented Sep 26, 2025

Description

This PR makes a simple change: the test now throws a new IOException instance every time container.readBlob is invoked instead of reusing the same instance.

Not reusing the same exception object each time an exception is thrown is good hygiene in its own right, but the underlying issue in apache/logging-log4j2#3940 and apache/logging-log4j2#3933 should still be fixed.
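
A minimal, self-contained sketch of the difference between the two stubbing styles follows. It is not the test code from this PR: `BlobReader`, `sharedStub`, `freshStub`, and `collect` are hypothetical names standing in for the mocked `container.readBlob` call, and plain Java is used here instead of a mocking framework.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class FreshExceptionSketch {
    /** Hypothetical stand-in for the mocked container; not an OpenSearch type. */
    interface BlobReader {
        byte[] readBlob(String name) throws IOException;
    }

    /** Before: every call throws the one instance created when the stub is built. */
    static BlobReader sharedStub() {
        final IOException shared = new IOException("mock failure");
        return name -> { throw shared; };
    }

    /** After: every call constructs and throws its own instance. */
    static BlobReader freshStub() {
        return name -> { throw new IOException("mock failure"); };
    }

    /** Calls readBlob the given number of times and collects what was thrown. */
    static List<IOException> collect(BlobReader reader, int calls) {
        List<IOException> thrown = new ArrayList<>();
        for (int i = 0; i < calls; i++) {
            try {
                reader.readBlob("blob-" + i);
            } catch (IOException e) {
                thrown.add(e);
            }
        }
        return thrown;
    }

    public static void main(String[] args) {
        List<IOException> shared = collect(sharedStub(), 2);
        List<IOException> fresh = collect(freshStub(), 2);
        System.out.println("shared stub, same instance: " + (shared.get(0) == shared.get(1)));
        System.out.println("fresh stub, same instance: " + (fresh.get(0) == fresh.get(1)));
    }
}
```

With the shared stub, every caller receives the same object, so anything one caller does to that exception is visible to all the others. With the fresh stub, each caller gets an independent instance.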

This PR is not meant as a replacement for #19430. I'm opening this up to share the reason for the test failure in this particular test case.

The anomaly-detection repo is also seeing this failure when running ./gradlew test --tests MultiEntityResultTests.testQueryErrorEndRunNotNow -i

Related Issues

Resolves #19325

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

…arallel_ExceptionDuringRead

Signed-off-by: Craig Perkins <cwperx@amazon.com>
@github-actions github-actions bot added the labels >test-failure (Test failure from CI, local build, etc.), autocut, ClusterManager:RemoteState, and flaky-test (Random test failure that succeeds on second run) on Sep 26, 2025
Member

@andrross andrross left a comment

Agreed this should be merged. Exception instances are mutable and generally should not be reused.
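
As a small illustration of the mutability point, the sketch below shows one way state can leak between handlers that share an exception instance: `Throwable.addSuppressed` appends to the instance it is called on. The class and method names are hypothetical, and this shows the general hazard only; it does not attempt to reproduce the log4j issues referenced in the description.

```java
import java.io.IOException;

public class MutableExceptionSketch {
    /** Simulates a handler that records a cleanup failure on the exception it caught. */
    static void handle(IOException caught, int handlerId) {
        caught.addSuppressed(new IllegalStateException("cleanup failed in handler " + handlerId));
    }

    /** Two handlers receive the same instance; returns how many suppressed entries it ends up with. */
    static int suppressedCountWhenShared() {
        IOException shared = new IOException("mock failure");
        handle(shared, 1);
        handle(shared, 2);
        return shared.getSuppressed().length;
    }

    /** Each handler receives its own instance; returns the second instance's suppressed count. */
    static int suppressedCountWhenFresh() {
        handle(new IOException("mock failure"), 1);
        IOException second = new IOException("mock failure");
        handle(second, 2);
        return second.getSuppressed().length;
    }

    public static void main(String[] args) {
        System.out.println("shared instance: " + suppressedCountWhenShared() + " suppressed");
        System.out.println("fresh instance: " + suppressedCountWhenFresh() + " suppressed");
    }
}
```

The shared instance ends up carrying the suppressed exception from both handlers, while the fresh instance carries only its own.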

@cwperks cwperks marked this pull request as ready for review September 26, 2025 15:36
@cwperks cwperks requested a review from a team as a code owner September 26, 2025 15:36
@github-actions
Contributor

❕ Gradle check result for 3402fa9: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@codecov

codecov bot commented Sep 26, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 72.97%. Comparing base (e5d01b5) to head (3402fa9).
⚠️ Report is 11 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #19437      +/-   ##
============================================
- Coverage     73.03%   72.97%   -0.07%     
+ Complexity    69997    69911      -86     
============================================
  Files          5676     5676              
  Lines        320923   320923              
  Branches      46392    46392              
============================================
- Hits         234396   234180     -216     
- Misses        67610    67766     +156     
- Partials      18917    18977      +60     

@cwperks cwperks merged commit f67ecbf into opensearch-project:main Sep 26, 2025
41 of 42 checks passed
karenyrx pushed a commit to karenyrx/OpenSearch that referenced this pull request Sep 29, 2025
…arallel_ExceptionDuringRead (opensearch-project#19437)

Signed-off-by: Craig Perkins <cwperx@amazon.com>
peteralfonsi pushed a commit to peteralfonsi/OpenSearch that referenced this pull request Oct 15, 2025
…arallel_ExceptionDuringRead (opensearch-project#19437)

Signed-off-by: Craig Perkins <cwperx@amazon.com>
cwperks added a commit that referenced this pull request Dec 27, 2025
* Bump log4j from 2.21.0 to 2.25.3 (#20308)

* Bump log4j from 2.21.0 to 2.25.3

Signed-off-by: Craig Perkins <cwperx@amazon.com>

* Add CHANGELOG entry

Signed-off-by: Craig Perkins <cwperx@amazon.com>

* Filter out known messages

Signed-off-by: Craig Perkins <cwperx@amazon.com>

* Fix arg count

Signed-off-by: Craig Perkins <cwperx@amazon.com>

---------

Signed-off-by: Craig Perkins <cwperx@amazon.com>
(cherry picked from commit e047211)

* Update plugins/transport-grpc/build.gradle

Co-authored-by: Andriy Redko <andriy.redko@aiven.io>
Signed-off-by: Craig Perkins <craig5008@gmail.com>

* Remove unused files

Signed-off-by: Craig Perkins <cwperx@amazon.com>

* Broaden perm

Signed-off-by: Craig Perkins <cwperx@amazon.com>

* Fix flaky test RemoteClusterStateServiceTests.testReadClusterStateInParallel_ExceptionDuringRead (#19437)

Signed-off-by: Craig Perkins <cwperx@amazon.com>

---------

Signed-off-by: Craig Perkins <cwperx@amazon.com>
Signed-off-by: Craig Perkins <craig5008@gmail.com>
Co-authored-by: Andriy Redko <andriy.redko@aiven.io>

Labels

autocut, ClusterManager:RemoteState, flaky-test (Random test failure that succeeds on second run), skip-changelog, >test-failure (Test failure from CI, local build, etc.)

Development

Successfully merging this pull request may close these issues.

[AUTOCUT] Gradle Check Flaky Test Report for RemoteClusterStateServiceTests

2 participants