Deduplicate BucketOrder when deserializing #112707
Merged
iverase merged 5 commits into elastic:main on Sep 12, 2024
Conversation
Collaborator
Pinging @elastic/es-analytical-engine (Team:Analytics)
Collaborator
Hi @iverase, I've created a changelog YAML for you.
nik9000
reviewed
Sep 10, 2024
  compoundOrder.add(Streams.readOrder(in));
  }
- return new CompoundOrder(compoundOrder, false);
+ return bucketOrderDeduplicator.deduplicate(new CompoundOrder(compoundOrder, false));
Member
ESQL uses a wrapper around the StreamInput that keeps the cache in a regular old variable rather than a static. I'd prefer that if we can manage it.
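To illustrate the suggestion, here is a minimal sketch of a stream wrapper whose cache is an instance field instead of a static. It is not the ESQL or Elasticsearch code: `CachingInput`, `readCachedString`, and `encode` are hypothetical names, and the sketch uses `java.io.DataInputStream` in place of `StreamInput`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper for illustration only; not an Elasticsearch class.
final class CachingInput {
    private final DataInputStream in;
    // Instance field: the cache is scoped to this one stream and becomes
    // garbage together with it, unlike a static cache.
    private final Map<String, String> cache = new HashMap<>();

    CachingInput(byte[] bytes) {
        this.in = new DataInputStream(new ByteArrayInputStream(bytes));
    }

    // Reads one string and returns the canonical instance seen on this stream.
    String readCachedString() {
        try {
            String value = in.readUTF();
            String existing = cache.putIfAbsent(value, value);
            return existing != null ? existing : value;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Helper for the sketch: encodes strings in the format readUTF expects.
    static byte[] encode(String... values) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (String value : values) {
                out.writeUTF(value);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

Two reads of the same value from one `CachingInput` return the same object, while a second `CachingInput` over the same bytes builds its own cache, so nothing outlives the deserialization that needed it.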
Contributor
Author
I have moved it into a wrapper of StreamInput by (ab)using the fact that aggregations are deserialized using DelayableWritable. I had to introduce an interface so we can deduplicate when the wrapper is found.
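The shape described here can be sketched roughly as follows. Everything in the block is an assumption made for illustration: `OrderDeduplicatorSketch`, `TokenInput`, `DeduplicatingTokenInput`, and `KeyOrder` are hypothetical stand-ins, not the interface or classes added by this PR. The reading code checks whether the input it was handed implements the interface and deduplicates only in that case.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface: an input that can canonicalize objects read from it.
interface OrderDeduplicatorSketch {
    <T> T deduplicate(T value);
}

// Toy input over a queue of strings; a stand-in for a real stream input.
class TokenInput {
    private final Deque<String> tokens;

    TokenInput(String... tokens) {
        this.tokens = new ArrayDeque<>(Arrays.asList(tokens));
    }

    String readString() {
        return tokens.removeFirst();
    }
}

// Wrapper that delegates reads and carries a per-instance cache.
final class DeduplicatingTokenInput extends TokenInput implements OrderDeduplicatorSketch {
    private final TokenInput delegate;
    private final Map<Object, Object> cache = new HashMap<>();

    DeduplicatingTokenInput(TokenInput delegate) {
        this.delegate = delegate;
    }

    @Override
    String readString() {
        return delegate.readString();
    }

    @Override
    @SuppressWarnings("unchecked")
    public <T> T deduplicate(T value) {
        Object existing = cache.putIfAbsent(value, value);
        return existing != null ? (T) existing : value;
    }
}

// Hypothetical value type standing in for a bucket order.
final class KeyOrder {
    final String key;

    KeyOrder(String key) {
        this.key = key;
    }

    // Deduplicates only when the input was wrapped; plain inputs are unaffected.
    static KeyOrder readFrom(TokenInput in) {
        KeyOrder order = new KeyOrder(in.readString());
        if (in instanceof OrderDeduplicatorSketch) {
            return ((OrderDeduplicatorSketch) in).deduplicate(order);
        }
        return order;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof KeyOrder && ((KeyOrder) o).key.equals(key);
    }

    @Override
    public int hashCode() {
        return key.hashCode();
    }
}
```

With this arrangement the deserialization code does not need to know where the cache lives. Callers that want deduplication wrap the input once, and every nested read that receives the wrapped input benefits.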
nik9000
approved these changes
Sep 11, 2024
iverase
added a commit
to iverase/elasticsearch
that referenced
this pull request
Sep 12, 2024
Deduplicate BucketOrder object by wrapping the StreamInput generated by DelayableWritable objects.
Collaborator
💚 Backport successful
v1v
added a commit
to v1v/elasticsearch
that referenced
this pull request
Sep 12, 2024
…tion-ironbank-ubi * upstream/main: (302 commits) Deduplicate BucketOrder when deserializing (elastic#112707) Introduce test utils for ingest pipelines (elastic#112733) [Test] Account for auto-repairing for shard gen file (elastic#112778) Do not throw in task enqueued by CancellableRunner (elastic#112780) Mute org.elasticsearch.script.StatsSummaryTests testEqualsAndHashCode elastic#112439 Mute org.elasticsearch.repositories.blobstore.testkit.integrity.RepositoryVerifyIntegrityIT testTransportException elastic#112779 Use a dedicated test executor in MockTransportService (elastic#112748) Estimate segment field usages (elastic#112760) (Doc+) Inference Pipeline ignores Mapping Analyzers (elastic#112522) Fix verifyVersions task (elastic#112765) (Doc+) Terminating Exit Codes (elastic#112530) (Doc+) CAT Nodes default columns (elastic#112715) [DOCS] Augment installation warnings (elastic#112756) Mute org.elasticsearch.repositories.blobstore.testkit.integrity.RepositoryVerifyIntegrityIT testCorruption elastic#112769 Bump Elasticsearch to a minimum of JDK 21 (elastic#112252) ESQL: Compute support for filtering ungrouped aggs (elastic#112717) Bump Elasticsearch version to 9.0.0 (elastic#112570) add CDR related data streams to kibana_system priviliges (elastic#112655) Support widening of numeric types in union-types (elastic#112610) Introduce data stream options and failure store configuration classes (elastic#109515) ...
elasticsearchmachine
pushed a commit
that referenced
this pull request
Sep 12, 2024
davidkyle
pushed a commit
that referenced
this pull request
Sep 12, 2024
Deduplicate BucketOrder object by wrapping the StreamInput generated by DelayableWritable objects.
This was referenced Nov 6, 2024
I was looking into a heap dump that contained millions of identical BucketOrder instances. They came from a nested string terms aggregation with a huge number of buckets. I am wondering if we can deduplicate BucketOrder instances with something similar to what we already do for strings. That is what this PR does, so I am looking for feedback on the approach.
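As a rough sketch of the idea, a deduplicator can map every value to the first equal instance it has seen, so that millions of equal objects collapse to one. The names `Canonicalizer` and `SketchOrder` are hypothetical and this is not the code in the PR; it assumes the deduplicated type is immutable and implements `equals` and `hashCode`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical immutable value type standing in for a bucket order.
final class SketchOrder {
    final String key;
    final boolean ascending;

    SketchOrder(String key, boolean ascending) {
        this.key = key;
        this.ascending = ascending;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof SketchOrder)) {
            return false;
        }
        SketchOrder other = (SketchOrder) o;
        return ascending == other.ascending && key.equals(other.key);
    }

    @Override
    public int hashCode() {
        return 31 * key.hashCode() + Boolean.hashCode(ascending);
    }
}

// Returns the first instance seen for each distinct value.
final class Canonicalizer<T> {
    private final Map<T, T> canonical = new HashMap<>();

    T deduplicate(T value) {
        T existing = canonical.putIfAbsent(value, value);
        return existing != null ? existing : value;
    }

    // Number of distinct values retained.
    int size() {
        return canonical.size();
    }
}
```

Each bucket would then hold a reference to the shared instance, and the duplicates created during deserialization become eligible for garbage collection immediately.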