Limit heap used tracking IndexMetadata deletions #140394
Conversation
In elastic#133558 we imposed a limit on the heap used to keep track of shard-level blobs to clean up after the commit of a snapshot deletion. This commit makes use of the same mechanism to track `IndexMetadata` blobs for future deletion. Closes elastic#140018
Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)
Hi @DaveCTurner, I've created a changelog YAML for you.
DaveCTurner
left a comment
This change is a little noisy because I've renamed things to avoid referring to shard-level things now that they also include index-level ones.
  private void writeUpdatedShardMetadataAndComputeDeletes(ActionListener<Void> listener) {
      // noinspection resource -- closed safely at the end of the iteration
-     final var listeners = new RefCountingListener(listener);
+     final var listeners = new RefCountingListener(ActionListener.runBefore(listener, this::recordUnreferencedIndicesMetadata));
NB this moves the computation here, before we commit the new RepositoryData blob, because this is the right place from which to enhance it in future to list the actual blobs to be deleted. Even before doing that, computing these blobs here means we can be sure there are no other snapshot deletions ongoing, so we don't have to worry about concurrent memory usage.
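A minimal sketch of the run-before-completion pattern used above. This is not the real Elasticsearch `ActionListener` API — the interface and names here are simplified stand-ins — but it shows why wrapping the listener with a run-before action guarantees the recording step happens before the caller observes completion:

```java
// Hypothetical simplified listener, standing in for org.elasticsearch.action.ActionListener.
class RunBeforeDemo {
    interface Listener<T> {
        void onResponse(T result);
    }

    // Wraps `delegate` so that `runBefore` executes before the delegate is notified,
    // mirroring ActionListener.runBefore(listener, this::recordUnreferencedIndicesMetadata):
    // the recording runs before the overall operation is seen as complete.
    static <T> Listener<T> runBefore(Listener<T> delegate, Runnable runBefore) {
        return result -> {
            runBefore.run();              // e.g. record unreferenced IndexMetadata blobs
            delegate.onResponse(result);  // then signal completion
        };
    }

    public static void main(String[] args) {
        StringBuilder order = new StringBuilder();
        Listener<Void> wrapped = runBefore(r -> order.append("complete"), () -> order.append("record,"));
        wrapped.onResponse(null);
        System.out.println(order); // record,complete
    }
}
```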
final var indexPath = indexPath(indexId).buildAsString();
assert indexPath.startsWith(basePath);
final var truncatedIndexPath = indexPath.substring(basePathLen);
Previously we were adding the full path including basePath to each blob and then removing it again later on, but we can cut that out here.
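A small sketch of the truncation idea, with hypothetical names and an assumed base path: every tracked blob path shares the repository base path as a prefix, so storing only the suffix and re-prefixing at delete time avoids keeping redundant copies of the prefix on the heap:

```java
// Hypothetical demo; BASE_PATH is an assumed repository base path, not a real constant.
class TruncatedPathDemo {
    static final String BASE_PATH = "base/path/";

    // Store only the part after the shared base path.
    static String truncate(String fullIndexPath) {
        assert fullIndexPath.startsWith(BASE_PATH);
        return fullIndexPath.substring(BASE_PATH.length());
    }

    // Re-prefix the base path when the delete is actually executed.
    static String restore(String truncated) {
        return BASE_PATH + truncated;
    }

    public static void main(String[] args) {
        String full = "base/path/indices/abc123/";
        String truncated = truncate(full);
        System.out.println(truncated);                 // indices/abc123/
        System.out.println(restore(truncated).equals(full)); // true
    }
}
```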
- final Map<IndexId, Collection<String>> toRemove = new HashMap<>();
- while (indicesForSnapshot.hasNext()) {
-     final var indexId = indicesForSnapshot.next();
+ return Iterators.flatMap(indicesToUpdateAfterRemovingSnapshot(snapshotIds), indexId -> {
This is the significant change: make this an Iterator so that we don't need to materialize more than one index's worth of blobs at once.
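A self-contained sketch of the lazy flat-mapping idea (a stand-in for the real `Iterators.flatMap` utility, not its actual implementation): the inner iterator for an index's blobs is only created when the consumer reaches that index, so at most one index's worth of blob names is materialized at a time instead of building a `Map` of all of them up front:

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Function;

class FlatMapDemo {
    // Lazily flattens: fn is only applied to an outer element when iteration reaches it.
    static <T, R> Iterator<R> flatMap(Iterator<T> outer, Function<T, Iterator<R>> fn) {
        return new Iterator<R>() {
            Iterator<R> inner = Collections.emptyIterator();

            @Override
            public boolean hasNext() {
                while (inner.hasNext() == false && outer.hasNext()) {
                    inner = fn.apply(outer.next()); // materialize one index's blobs at a time
                }
                return inner.hasNext();
            }

            @Override
            public R next() {
                if (hasNext() == false) throw new NoSuchElementException();
                return inner.next();
            }
        };
    }

    public static void main(String[] args) {
        // Hypothetical index IDs and blob names for illustration.
        Iterator<String> blobs = flatMap(
            List.of("index-1", "index-2").iterator(),
            indexId -> List.of(indexId + "/meta-a.dat", indexId + "/meta-b.dat").iterator()
        );
        while (blobs.hasNext()) {
            System.out.println(blobs.next());
        }
    }
}
```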
joshua-adams-1
left a comment
LGTM in general, just a few nits
        .get();

final var repo = setupRepo();
try (var shardBlobsToDelete = repo.new ShardBlobsToDelete()) {
[Nit] The test name and javadoc need updating since it is no longer ShardBlobsToDelete
  final var repo = setupRepo();
- try (var shardBlobsToDelete = repo.new ShardBlobsToDelete()) {
+ try (var shardBlobsToDelete = repo.new BlobsToDelete()) {
[Nit] Should this variable still be called shardBlobsToDelete?
final var shardId = between(1, 30);
final var shardGeneration = new ShardGeneration(randomUUID());

// Always write at least one blob, guaranteeing that the shardDeleteResults stream increases in size
[Nit] AFAICT from the git diff, the final var blobsToDelete variable this comment belongs to has moved to line 798, so can this comment go there too? (I also see a similar need for it above line 814.) Also, I think shardDeleteResults needs renaming.
++ fixed; it's actually not important that we always write at least one blob, because we always write something (e.g. the IndexId).
final List<String> blobsToDelete;
final CheckedRunnable<Exception> addResult;
final UnaryOperator<String> blobNameOperator;
if (randomBoolean()) {
Perhaps add a one-line comment here explaining what is happening, so that the next person who reads the test understands we're testing two branches. Or, probably my preferred option, update the Javadoc comment above the test.