Ensures cleanup of temporary index-* generational blobs during snapshotting#21469
abeyad merged 5 commits into elastic:master from
Conversation
index-* blobs are generational files that maintain the snapshots in the repository. To write these atomically, we first write a `pending-index-*` blob, then move it to `index-*`, which also deletes `pending-index-*` in case it's not a file-system-level move (e.g. S3 repositories). For example, to write the 5th generation of the index blob for the repository, we would first write the bytes to `pending-index-5` and then move `pending-index-5` to `index-5`.

It is possible that we fail after writing `pending-index-5`, but before moving it to `index-5` or deleting `pending-index-5`. In this case, we will have a dangling `pending-index-5` blob lying around. Since snapshot #5 would have failed, the next snapshot assumes a generation number of 5, so it tries to write to `index-5`, which first tries to write to `pending-index-5` before moving the blob to `index-5`. Since `pending-index-5` is leftover from the previous failure, the snapshot fails as it cannot overwrite this blob.

This commit solves the problem by, first, adding a UUID to the `pending-index-*` blobs and, second, strengthening the logic around failure to write the `index-*` generational blob to ensure pending files are deleted on cleanup.

Closes #21462
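The write-then-move pattern described above can be sketched as follows. This is a minimal sketch against a hypothetical `BlobStore` interface (the real `BlobContainer` API differs); the random UUID in the temp name and the cleanup that suppresses secondary failures follow the fix this PR describes.

```java
import java.io.IOException;
import java.util.UUID;

// Hypothetical minimal blob store interface, for illustration only.
interface BlobStore {
    void writeBlob(String name, byte[] bytes) throws IOException;
    void move(String source, String target) throws IOException;
    void deleteBlob(String name) throws IOException;
}

class AtomicWriter {
    private final BlobStore store;

    AtomicWriter(BlobStore store) {
        this.store = store;
    }

    void writeAtomic(String blobName, byte[] bytes) throws IOException {
        // The random UUID prevents a leftover pending blob from a previously
        // failed snapshot from colliding with this write.
        final String tempBlobName = "pending-" + UUID.randomUUID() + "-" + blobName;
        try {
            store.writeBlob(tempBlobName, bytes);
            store.move(tempBlobName, blobName); // also removes the temp blob
        } catch (IOException ex) {
            // Temp blob creation or move failed - clean up the pending blob.
            try {
                store.deleteBlob(tempBlobName);
            } catch (IOException suppressed) {
                // Keep the original failure primary; record the cleanup failure.
                ex.addSuppressed(suppressed);
            }
            throw ex;
        }
    }
}
```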
ywelsch
left a comment
Is it possible to add tests that show that the issue is fixed?
    private void writeAtomic(final String blobName, final BytesReference bytesRef) throws IOException {
-       final String tempBlobName = "pending-" + blobName;
+       final String tempBlobName = "pending-" + UUIDs.randomBase64UUID() + "-" + blobName;
I wonder if it's nicer to append the random uuid.
    // exceptionToThrow will capture this and be thrown at the end
    } catch (IOException e) {
        if (exceptionToThrow == null) {
            exceptionToThrow = e;
exceptionToThrow can never be null?
Maybe it's easier to just do the ex.addSuppressed(...)?
    IOException exceptionToThrow = ex;
    try {
        snapshotsBlobContainer.deleteBlob(tempBlobName);
    } catch (NoSuchFileException e) {
I don't think we should special-case this. Depending on underlying blobcontainer, a different exception might be thrown.
It's part of the contract of the BlobContainer to throw a NoSuchFileException if the blob doesn't exist when deleteBlob is called, and we changed all implementations to adhere to this contract. That said, you are right; I went through two iterations of this, and now the only way we get to this block is if an exception was thrown, in which case we don't care about the nature of the exception here. Same with below.
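The deleteBlob contract mentioned here can be illustrated with a minimal filesystem-backed sketch (a hypothetical class, not the actual Elasticsearch FsBlobContainer): `java.nio.file.Files.delete` naturally throws `NoSuchFileException` when the target is absent, which matches the contract described above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

// Hypothetical filesystem-backed blob container, for illustration only.
class SimpleFsBlobContainer {
    private final Path dir;

    SimpleFsBlobContainer(Path dir) {
        this.dir = dir;
    }

    void writeBlob(String blobName, byte[] bytes) throws IOException {
        Files.write(dir.resolve(blobName), bytes);
    }

    void deleteBlob(String blobName) throws IOException {
        // Files.delete throws NoSuchFileException if the blob does not exist,
        // matching the BlobContainer contract discussed in the comment above.
        Files.delete(dir.resolve(blobName));
    }
}
```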
retest this please

@ywelsch I pushed some commits that add a test and address your feedback
ywelsch
left a comment
left minor comments (easy to fix). LGTM
    * target blob already exists, an exception is thrown.
    * Renames the source blob into the target blob. If the source blob does not exist or the
    * target blob already exists, an exception is thrown. Atomicity of the move operation
    * can only be guaranteed on an implementation-by-implementation basis. The only current
    } catch (IOException ex) {
        // Move failed - try cleaning up
        snapshotsBlobContainer.deleteBlob(tempBlobName);
        try {
maybe re-add the comment here (temp file creation or move failed - clean up)
    // will not have an atomic move, and we should be able to handle that
    maybeIOExceptionOrBlock(targetBlob);
    super.move(sourceBlob, targetBlob);
    super.writeBlob(targetBlob, readBlob(sourceBlob), 0L);
super.readBlob instead of readBlob to prevent a double maybeIOExceptionOrBlock.
| logger.info("--> creating repository"); | ||
| final Path repoPath = randomRepoPath(); | ||
| assertAcked(client().admin().cluster().preparePutRepository("test-repo").setType("mock").setVerify(false).setSettings( | ||
| Settings.builder().put("location", repoPath).put("random_control_io_exception_rate", 0.2))); |
maybe add some randomization to the random_control_io_exception_rate?
                assertThat(shardFailure.reason(), containsString("Random IOException"));
            }
        }
    } catch (Exception ex) {
can we catch a more specific exception type? IOException with the Random IOException string?
The issue here is that either SnapshotCreationException or RepositoryException can be thrown, but what we can do here is ensure the Random IOException appears in the nested stack trace.
can be specified as catch (SnapshotCreationException | RepositoryException ex) in Java ;-)
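For readers unfamiliar with the syntax, Java's multi-catch suggested above can be sketched with standard exception types standing in for the Elasticsearch-specific ones:

```java
import java.io.FileNotFoundException;

// Demonstrates a single catch clause handling two unrelated exception types,
// as suggested for SnapshotCreationException | RepositoryException above.
// The exception types here are standard-library stand-ins, not the real ones.
class MultiCatchDemo {
    static String classify(int mode) {
        try {
            if (mode == 0) {
                throw new FileNotFoundException("missing");
            }
            throw new IllegalStateException("bad state");
        } catch (FileNotFoundException | IllegalStateException ex) {
            // One handler covers both types; ex is effectively final here.
            return ex.getMessage();
        }
    }
}
```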
@ywelsch I pushed a commit to address your latest feedback. Thanks for reviewing!
…otting (#21469) Ensures pending index-* blobs are deleted when snapshotting. The index-* blobs are generational files that maintain the snapshots in the repository. To write these atomically, we first write a `pending-index-*` blob, then move it to `index-*`, which also deletes `pending-index-*` in case it's not a file-system-level move (e.g. S3 repositories). For example, to write the 5th generation of the index blob for the repository, we would first write the bytes to `pending-index-5` and then move `pending-index-5` to `index-5`. It is possible that we fail after writing `pending-index-5`, but before moving it to `index-5` or deleting `pending-index-5`. In this case, we will have a dangling `pending-index-5` blob lying around. Since snapshot #5 would have failed, the next snapshot assumes a generation number of 5, so it tries to write to `index-5`, which first tries to write to `pending-index-5` before moving the blob to `index-5`. Since `pending-index-5` is leftover from the previous failure, the snapshot fails as it cannot overwrite this blob. This commit solves the problem by, first, adding a UUID to the `pending-index-*` blobs and, second, strengthening the logic around failure to write the `index-*` generational blob to ensure pending files are deleted on cleanup. Closes #21462
    maybeIOExceptionOrBlock(targetBlob);
    super.move(sourceBlob, targetBlob);
    super.writeBlob(targetBlob, super.readBlob(sourceBlob), 0L);
    super.deleteBlob(sourceBlob);
sorry, forgot to add here that we could randomize between atomic and non-atomic move.
I pushed adb7aad to randomize between atomic and non-atomic
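The randomization between atomic and non-atomic moves can be sketched as follows. This is a hypothetical in-memory test container, not the actual MockRepository code; it shows the idea of flipping a coin so callers get exercised against both move behaviors.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

// Hypothetical test container that randomly performs either an atomic move
// or a non-atomic write-then-delete, mimicking blob stores (e.g. S3) that
// lack a rename primitive. Names are illustrative only.
class RandomizedMoveContainer {
    private final Map<String, byte[]> blobs = new HashMap<>();
    private final Random random;

    RandomizedMoveContainer(Random random) {
        this.random = random;
    }

    void writeBlob(String name, byte[] bytes) {
        blobs.put(name, bytes);
    }

    byte[] readBlob(String name) {
        return blobs.get(name);
    }

    void move(String sourceBlob, String targetBlob) {
        if (random.nextBoolean()) {
            // Atomic move: a single rename-style operation.
            blobs.put(targetBlob, blobs.remove(sourceBlob));
        } else {
            // Non-atomic move: copy the bytes to the target, then delete
            // the source; a failure in between leaves both blobs behind.
            blobs.put(targetBlob, blobs.get(sourceBlob));
            blobs.remove(sourceBlob);
        }
    }
}
```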
* master:
  * ShardActiveResponseHandler shouldn't hold to an entire cluster state
  * Ensures cleanup of temporary index-* generational blobs during snapshotting (#21469)
  * Remove (again) test uses of onModule (#21414)
  * [TEST] Add assertBusy when checking for pending operation counter after tests
  * Revert "Add trace logging when aquiring and releasing operation locks for replication requests"
  * Allows multiple patterns to be specified for index templates (#21009)
  * [TEST] fixes rebalance single shard check as it isn't guaranteed that a rebalance makes sense and the method only tests if rebalance is allowed
  * Document _reindex with random_score