Speed up Snapshot Finalization by original-brownbear · Pull Request #47283 · elastic/elasticsearch

original-brownbear · 2019-09-30T09:36:44Z

As a result of #45689, snapshot finalization started to
take significantly longer than before. This is unfortunate
because it increases the likelihood of failing to finalize
after all the segment blobs have already been written out.
This change parallelizes all the metadata writes that
can safely run in parallel in the finalization step to
speed finalization up again. It will also generally speed
up the snapshot process overall when there is a large
number of indices.

This is also nice to have for #46250, since that change adds
yet another step to finalization (deleting old index- blobs
in the shards).
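
To illustrate the idea, here is a minimal sketch of parallel metadata writes followed by a single root-level write. It is not the code from this PR: it uses plain `java.util.concurrent` rather than Elasticsearch's own listener and thread pool classes, and the in-memory `blobs` map, the `writeBlob` helper, and the blob names are hypothetical stand-ins for a real blob store.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ParallelFinalizationSketch {

    // Hypothetical stand-in for a blob store: blob name -> content.
    static final Map<String, String> blobs = new ConcurrentHashMap<>();

    // Hypothetical write helper; a real repository would perform I/O here.
    static void writeBlob(String name, String content) {
        blobs.put(name, content);
    }

    static void finalizeSnapshot(String snapshotUuid, List<String> indices, ExecutorService executor) {
        List<CompletableFuture<Void>> writes = new ArrayList<>();
        // The snap- blob and the per-index metadata do not depend on each other,
        // so they can be submitted concurrently.
        writes.add(CompletableFuture.runAsync(
            () -> writeBlob("snap-" + snapshotUuid, "snapshot info"), executor));
        for (String index : indices) {
            writes.add(CompletableFuture.runAsync(
                () -> writeBlob("indices/" + index + "/meta-" + snapshotUuid, "index metadata"), executor));
        }
        // Wait for every metadata write before touching the root-level generation blob.
        CompletableFuture.allOf(writes.toArray(new CompletableFuture[0])).join();
        // Only now record the snapshot in the repository data (assumed to be the last step).
        writeBlob("index-N", "repository data including " + snapshotUuid);
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            finalizeSnapshot("uuid-1", List.of("logs", "metrics"), executor);
            System.out.println(blobs.keySet());
        } finally {
            executor.shutdown();
        }
    }
}
```

The point of the ordering is that the root-level blob is written only after all the parallel writes have completed, so a failed metadata write never leaves a repository generation that references a missing blob.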

elasticmachine · 2019-09-30T09:36:46Z

Pinging @elastic/es-distributed

original-brownbear · 2019-09-30T10:01:44Z

Jenkins run elasticsearch-ci/bwc

original-brownbear · 2019-09-30T10:28:24Z

Jenkins run elasticsearch-ci/bwc

original-brownbear · 2019-09-30T10:29:23Z

server/src/main/java/org/elasticsearch/repositories/blobstore/BlobStoreRepository.java

-            final RepositoryData updatedRepositoryData = getRepositoryData().addSnapshot(snapshotId, blobStoreSnapshot.state(), indices);
-            snapshotFormat.write(blobStoreSnapshot, blobContainer(), snapshotId.getUUID(), false);
-            writeIndexGen(updatedRepositoryData, repositoryStateId);
-        } catch (FileAlreadyExistsException ex) {


This catch is gone now. It was dead code because we no longer do the exists check for this blob in the line above, where we write the snap- blob.

original-brownbear · 2019-09-30T10:31:27Z

server/src/main/java/org/elasticsearch/repositories/blobstore/BlobStoreRepository.java

                indexMetaDataFormat.write(clusterMetaData.index(index.getName()), indexContainer(index), snapshotId.getUUID(), false);
-            }
-        } catch (IOException ex) {
-            throw new SnapshotException(metadata.name(), snapshotId, "failed to write metadata for snapshot", ex);


I removed this specific rethrow because, with this change, we write the index metadata in parallel with the root-level snap- blob anyway, so throwing with a separate message here seemed pointless.
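
As a rough illustration of why a per-step message adds little once the writes run concurrently, the sketch below funnels a failure from any parallel write through one common path. It is an assumption-laden example using `CompletableFuture`, not the repository's actual listener code; `writeBlob`, `writeAll`, and the blob names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

public class SharedFailurePathSketch {

    // Hypothetical write that fails for any blob whose name contains "broken".
    static void writeBlob(String name) {
        if (name.contains("broken")) {
            throw new UncheckedIOException(new IOException("failed to write " + name));
        }
    }

    // Runs all writes concurrently. Returns null on success, or the cause of a
    // failed write. If several writes fail, which cause is reported is not specified.
    static Throwable writeAll(List<String> blobNames) {
        CompletableFuture<?>[] writes = blobNames.stream()
            .map(name -> CompletableFuture.runAsync(() -> writeBlob(name)))
            .toArray(CompletableFuture[]::new);
        try {
            CompletableFuture.allOf(writes).join();
            return null;
        } catch (CompletionException e) {
            // One handler covers the snap- blob and the index metadata alike.
            return e.getCause();
        }
    }

    public static void main(String[] args) {
        System.out.println(writeAll(List.of("snap-uuid", "indices/logs/meta-uuid")));
        System.out.println(writeAll(List.of("snap-uuid", "indices/broken/meta-uuid")));
    }
}
```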

original-brownbear · 2019-09-30T10:32:23Z

...test/java/org/elasticsearch/snapshots/mockstore/MockEventuallyConsistentRepositoryTests.java


 public class MockEventuallyConsistentRepositoryTests extends ESTestCase {

-    private Environment environment;


This is just dead code. I saw it when making adjustments here and removed it because I figured it wasn't worth a separate PR.

tlrx

I left some comments, nothing to worry about as it looks great already

server/src/main/java/org/elasticsearch/repositories/Repository.java

server/src/main/java/org/elasticsearch/repositories/blobstore/BlobStoreRepository.java

server/src/main/java/org/elasticsearch/snapshots/SnapshotsService.java

x-pack/plugin/core/src/main/java/org/elasticsearch/snapshots/SourceOnlySnapshotRepository.java

original-brownbear · 2019-09-30T13:31:37Z

Thanks @tlrx, all points addressed I think :)

tlrx

LGTM, nice change

original-brownbear · 2019-09-30T14:49:53Z

Jenkins run elasticsearch-ci/packaging-sample

original-brownbear · 2019-09-30T15:54:36Z

Thanks Tanguy!

This pull request is a backport of elastic/elasticsearch#47283. Its purpose is to speed up snapshot finalization, which is achieved by parallelizing the metadata writes in the finalization step. It will also generally speed up the snapshot process overall when there is a large number of indices. The improvement makes sense because snapshot finalization has taken much longer since #9327 was integrated. (cherry picked from commit 3091e26)

original-brownbear added >non-issue :Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs v8.0.0 v7.5.0 labels Sep 30, 2019

original-brownbear requested review from andrershov, tlrx and ywelsch September 30, 2019 10:14

original-brownbear commented Sep 30, 2019

tlrx reviewed Sep 30, 2019

original-brownbear added 2 commits September 30, 2019 15:28

Merge remote-tracking branch 'elastic/master' into parallelize-sn-fin…

18ef116

…alization

CR: typo and reword

0f52e53

original-brownbear requested a review from tlrx September 30, 2019 13:31

siomplify callback chain

16d558a

tlrx approved these changes Sep 30, 2019

original-brownbear merged commit 5405f2e into elastic:master Sep 30, 2019

original-brownbear deleted the parallelize-sn-finalization branch September 30, 2019 15:54

original-brownbear mentioned this pull request Sep 30, 2019

Speed up Snapshot Finalization (#47283) #47309

Merged

original-brownbear mentioned this pull request Oct 2, 2019

SharedClusterSnapshotRestoreIT#testDeleteSnapshot fails #47425

Closed

mkleen mentioned this pull request Nov 27, 2019

Speed up Snapshot Finalization crate/crate#9384

Merged

5 tasks

original-brownbear restored the parallelize-sn-finalization branch January 6, 2021 14:08

jakelandis added v8.0.0-alpha1 and removed v8.0.0 labels Jul 26, 2021

