Fix testRecoveryIsCancelledAfterDeletingTheIndex #76644

Merged
henningandersen merged 1 commit into elastic:master from henningandersen:test_fix_cancel_snapshot_recovery
Aug 23, 2021

Conversation

@henningandersen
Contributor

Closes #76560

@henningandersen added the >test-failure, :Distributed/Recovery, v8.0.0 and v7.15.0 labels Aug 18, 2021
@elasticmachine added the Team:Distributed label Aug 18, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

Member

@DaveCTurner DaveCTurner left a comment

LGTM

@henningandersen henningandersen merged commit 179a3fc into elastic:master Aug 23, 2021
}
);

assertAcked(
Contributor

Is the difference between these two snippets that we have to register a send behaviour before updating the index settings?

Contributor Author

Yes. If we update the settings first, we risk the recovery starting or completing before we add the send behavior. In that case the recoverSnapshotFileRequestReceived.await() a few lines down never completes, failing the test.

Member

We had an inadvertent race between the call to addRequestHandlingBehavior and the end of the recovery triggered by the update settings call - if the recovery completed before we got around to adding the request handler then we'd never count down the recoverSnapshotFileRequestReceived latch. The fix is to hook into the transport service before we even start the recovery.
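
To make the ordering issue concrete, here is a minimal, self-contained sketch of the same pattern. It does not use the Elasticsearch test framework: `FakeTransport`, `addRequestHook` and `runRecovery` are hypothetical stand-ins invented for illustration, and the "hook last" branch simply models the unlucky interleaving in which the recovery finishes before the hook is installed.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for a transport layer; NOT the Elasticsearch API.
class FakeTransport {
    private final AtomicReference<Runnable> hook = new AtomicReference<>();

    // Installs a callback that runs when a request passes through.
    void addRequestHook(Runnable callback) {
        hook.set(callback);
    }

    // Simulates a recovery that sends exactly one request and then completes.
    void runRecovery() {
        Runnable callback = hook.get();
        if (callback != null) {
            callback.run();
        }
    }
}

class HookOrderingSketch {
    // Returns true if the latch was counted down, false if the wait timed out.
    static boolean latchReleased(boolean hookFirst) {
        FakeTransport transport = new FakeTransport();
        CountDownLatch requestReceived = new CountDownLatch(1);

        if (hookFirst) {
            // Fixed ordering: hook in before anything can trigger the recovery.
            transport.addRequestHook(requestReceived::countDown);
            transport.runRecovery();
        } else {
            // Racy ordering, worst case: the recovery is already done
            // by the time the hook is installed.
            transport.runRecovery();
            transport.addRequestHook(requestReceived::countDown);
        }

        try {
            // A bounded wait so the sketch terminates; an unbounded await() would hang.
            return requestReceived.await(100, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("hook first: " + latchReleased(true));
        System.out.println("hook last: " + latchReleased(false));
    }
}
```

With the hook installed first the latch is always released. With the hook installed last the latch is never counted down in this model, which mirrors the await() that never returned in the failing test.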

wjp719 added a commit to wjp719/elasticsearch that referenced this pull request Aug 24, 2021
* master: (21 commits)
  [Test] More robust assertions for sorting and pagination (elastic#76654)
  [Test] Fix filename check on Windows (elastic#76807)
  Upgrade build scan plugin to 3.6.4 (elastic#76784)
  Remove keystore initial_md5sum (elastic#76835)
  Don't export docker images on assemble (elastic#76817)
  Fix testMasterStatsOnSuccessfulUpdate (elastic#76844)
  AwaitsFix for elastic#76840
  Make Releasing Aggregation Buffers Safer (elastic#76741)
  Re-enable BWC tests after backport of elastic#76771 (elastic#76839)
  Dispatch large bulk requests to write thread  (elastic#76736)
  Disable BWC tests for elastic#76771
  Pull down beats artifacts when performing release tests
  Add timing stats to publication process (elastic#76771)
  Fix BanFailureLoggingTests some more (elastic#76668)
  Mention "warn threshold" in master service slowlog (elastic#76815)
  Fix DockerTests.test010Install
  Re-enable tests affected by elastic#75097 (elastic#76814)
  Fix testRecoveryIsCancelledAfterDeletingTheIndex (elastic#76644)
  Test fix -WildcardFieldMapperTests bad test data. (elastic#76819)
  Updating supported version after backporting the feature (elastic#76794)
  ...

# Conflicts:
#	server/src/main/java/org/elasticsearch/action/bulk/TransportBulkAction.java

Labels

:Distributed/Recovery: Anything around constructing a new shard, either from a local or a remote source.
Team:Distributed: Meta label for distributed team.
>test-failure: Triaged test failures from CI
v7.15.1
v7.16.0
v8.0.0-alpha2

Development

Successfully merging this pull request may close these issues.

[CI] SnapshotBasedIndexRecoveryIT testRecoveryIsCancelledAfterDeletingTheIndex failing

6 participants