Simplify and optimize deduplication of RepositoryData for a non-caching repository instance#91851
Simplify and optimize deduplication of RepositoryData for a non-caching repository instance#91851original-brownbear merged 6 commits intoelastic:mainfrom original-brownbear:89952
Conversation
…ng repository instance This makes use of the new deduplicator infrastructure to move to more efficient deduplication mechanics. The existing solution hardly ever deduplicated because it would only deduplicate after the repository entered a consistent state. The adjusted solution is much simpler, in that it simply deduplicates such that only a single loading of `RepositoryData` will ever happen at a time, fixing memory issues from massively concurrent loading of the repo data as described in #89952. closes #89952
|
Pinging @elastic/es-distributed (Team:Distributed) |
|
Hi @original-brownbear, I've created a changelog YAML for you. |
DaveCTurner
left a comment
There was a problem hiding this comment.
Nice simplification. I left one small comment.
| this.basePath = basePath; | ||
| this.maxSnapshotCount = MAX_SNAPSHOTS_SETTING.get(metadata.settings()); | ||
| this.repoDataDeduplicator = new ResultDeduplicator<>(threadPool.getThreadContext()); | ||
| this.repoDataLoadDeduplicator = new SingleResultDeduplicator<>(threadPool.getThreadContext(), this::doGetRepositoryData); |
There was a problem hiding this comment.
I think SingleResultDeduplicator risks a stack overflow exception if the action doesn't fork, which is the case here. Can we fork to SNAPSHOT_META here rather than when calling repoDataLoadDeduplicator::execute?
There was a problem hiding this comment.
++ I pushed below commit, thanks for catching this! Also simplifies the call-sites nicely :)
| repoData.getGenId() | ||
| ) | ||
| ) | ||
| repoDataLoadDeduplicator.execute( |
|
Thanks David! |
…-caching repository instance (#91851) (#91866) * Simplify and optimize deduplication of RepositoryData for a non-caching repository instance (#91851) This makes use of the new deduplicator infrastructure to move to more efficient deduplication mechanics. The existing solution hardly ever deduplicated because it would only deduplicate after the repository entered a consistent state. The adjusted solution is much simpler, in that it simply deduplicates such that only a single loading of `RepositoryData` will ever happen at a time, fixing memory issues from massively concurrent loading of the repo data as described in #89952. closes #89952 * fix compile
|
Would it be safe to backport this to 7.17 too? We definitely see #89952 affecting them, and I don't see any obvious reasons not to apply this there. |
|
Yea it should be fine :) let me try .. |
|
#92661 worked |
…-caching repository instance (#91851) (#91866) (#92661) * Simplify and optimize deduplication of RepositoryData for a non-caching repository instance (#91851) This makes use of the new deduplicator infrastructure to move to more efficient deduplication mechanics. The existing solution hardly ever deduplicated because it would only deduplicate after the repository entered a consistent state. The adjusted solution is much simpler, in that it simply deduplicates such that only a single loading of `RepositoryData` will ever happen at a time, fixing memory issues from massively concurrent loading of the repo data as described in #89952. closes #89952 * fix compile
This makes use of the new deduplicator infrastructure to move to more efficient deduplication mechanics.
The existing solution hardly ever deduplicated because it would only deduplicate after the repository entered a consistent state. The adjusted solution is much simpler, in that it simply deduplicates such that only a single loading of
RepositoryDatawill ever happen at a time, fixing memory issues from massively concurrent loading of the repo data as described in #89952.closes #89952 (this should be all that's needed to fix the memory issue in practice, judging by heap dumps as the problem really just is the initial set of snapshots going wild with concurrent loading of repo data on the huge snapshot-meta pool)