Shorten time that snapshot finalization blocks repo #89572

original-brownbear merged 1 commit into elastic:main from original-brownbear:faster-snapshot-finalization
Conversation
For large snapshot clusters, the final cleanup step that removes previous shard metadata might have to delete tens of thousands of blobs. This can take many seconds, during which the repository is needlessly blocked from running subsequent finalizations or deletes. Similar to how the final cleanup step for deletes was made async and taken out of the finalization loop in #86514, this PR makes the final cleanup step for finalization async as well.
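The shape of the change can be sketched in plain Java (a sketch only, not the actual Elasticsearch code: the pool, method, and blob names below are hypothetical stand-ins). Offloading the stale-blob deletion to a snapshot thread pool lets finalization return to the caller immediately instead of blocking on the deletes:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch of offloading the final cleanup of stale shard-generation blobs
// to a dedicated thread pool, so the caller can start the next repository
// operation right away instead of waiting for the deletes to finish.
public class AsyncCleanupSketch {

    // Stand-in for issuing repository blob deletes in the background;
    // resolves to how many blobs were "deleted" once the task completes.
    static Future<Integer> cleanUpAsync(ExecutorService snapshotPool, List<String> toDelete) {
        if (toDelete.isEmpty()) {
            return CompletableFuture.completedFuture(0);
        }
        return snapshotPool.submit(() -> {
            int deleted = 0;
            for (String blob : toDelete) {
                // a real implementation would delete the blob from the repository here
                deleted++;
            }
            return deleted;
        });
    }

    public static void main(String[] args) throws Exception {
        ExecutorService snapshotPool = Executors.newFixedThreadPool(2);
        // hypothetical stale blob names left over from the previous shard generation
        Future<Integer> result = cleanUpAsync(snapshotPool, List.of("index-1", "index-2"));
        System.out.println(result.get()); // prints 2
        snapshotPool.shutdown();
    }
}
```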
Pinging @elastic/es-distributed (Team:Distributed)
fcofdez
left a comment
Looks good to me, but I left a question, just to ensure that I understand the change correctly.
```java
} catch (Exception e) {
    logger.warn("Failed to clean up old shard generation blobs", e);
if (toDelete.isEmpty() == false) {
    threadPool.executor(ThreadPool.Names.SNAPSHOT).execute(() -> {
```
Since we compute the list of files to delete based on the old/new repository data, would it be possible for two delete operations to try and delete an intersection of files? I guess that will be fine, the only minor issue I can see is that we might end up having more queued tasks in the SNAPSHOT thread pool, but the risk should be minimal?
> would it be possible for two delete operations to try and delete an intersection of files?
No, this should not happen, since we compute the files to delete from the repo data diff. So if a file A goes out of scope when moving from generation 1 to 2, it will never show up in generation 2 and thus won't be in the diff from 2 to 3.
> I guess that will be fine, the only minor issue I can see is that we might end up having more queued tasks in the SNAPSHOT thread pool, but the risk should be minimal?
This is actually something that happens in the many-shards snapshot benchmark. Assuming you have back-to-back snapshot finalizations, it's really a function of how fast cluster state updates are vs. how fast those deletes are. But it's not really an issue IMO; we're still only talking about O(10s) of tasks here in the worst cases (GCP + 50k indices).
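The disjointness argument can be illustrated with a small, self-contained Java sketch (the generation contents are made-up placeholders, not real shard-generation blob names). Because each cleanup is computed as the set difference between consecutive repository generations, and a blob that drops out of one generation never reappears in a later one, two cleanups can never target the same blob:

```java
import java.util.HashSet;
import java.util.Set;

public class ShardGenDiff {

    // Blobs present in the old generation but absent from the new one
    // are the ones that went out of scope and can be deleted.
    static Set<String> toDelete(Set<String> oldGen, Set<String> newGen) {
        Set<String> stale = new HashSet<>(oldGen);
        stale.removeAll(newGen);
        return stale;
    }

    public static void main(String[] args) {
        // hypothetical shard-generation blob sets across three repo generations
        Set<String> gen1 = Set.of("A", "B");
        Set<String> gen2 = Set.of("B", "C"); // A went out of scope
        Set<String> gen3 = Set.of("C");      // B went out of scope

        Set<String> firstCleanup = toDelete(gen1, gen2);  // {A}
        Set<String> secondCleanup = toDelete(gen2, gen3); // {B}

        // A never reappears in gen2, so it cannot show up in the 2->3 diff:
        // consecutive cleanups are disjoint by construction.
        System.out.println(firstCleanup + " " + secondCleanup); // prints [A] [B]
    }
}
```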
fcofdez
left a comment
LGTM. Thanks for the clarifications!
Thanks Francisco!
```java
}

public void onDone(SnapshotInfo snapshotInfo) {
    onDone.accept(snapshotInfo);
```
Should we assert that onResponse() has been called at this stage? I also wonder if renaming the method to something like afterCleanUp() would be clearer?
> Should we assert that onResponse() has been called at this stage?
I wanted to do that, but it turns out this isn't actually 100% true for the S3 waiting implementation ... which is kind of stupid, I'll open a discussion about removing it :)
```java
} catch (Exception e) {
    logger.warn("Failed to clean up old shard generation blobs", e);
} finally {
    finalizeSnapshotContext.onDone(snapshotInfo);
```
Do we really need to pass the SnapshotInfo instance back to where it originated from? 🤔
```diff
- finalizeSnapshotContext.onDone(snapshotInfo);
+ finalizeSnapshotContext.afterCleanUp();
```
Hmm, that is a good point, I just realized we indeed don't need to pass it back here. We used to need that because we computed SnapshotInfo in the repository, but that's not the case anymore. I'll follow up with a simplification PR, thanks for spotting this!
This fixes a minor bug introduced in #89572: we need to remove the snapshot's listeners from the repository's listener map right away, before trying to run the next repository operation, as we did prior to that change. Otherwise we run the risk of a fatal exception failing all operations for the repository, like the no-longer-master exception in #89625 did, even though the snapshot completed successfully. Closes #89625.
This produces a measurable speedup in the many-shards snapshotting benchmark: a couple of minutes saved end-to-end when creating 100 snapshots in a 25k-indices cluster.
relates #77466