Sort the values of a legacy histogram during downsampling#140771

Merged
elasticsearchmachine merged 4 commits into elastic:main from gmarouli:fix-unordered-centroids
Jan 16, 2026

@gmarouli
Contributor

When indexing legacy histograms in Elasticsearch, we require that their values are sorted.

When downsampling uses the aggregate method, it merges the histograms into a single one and indexes it in the target index. We noticed that some merging algorithms do not guarantee that the centroids will be sorted.

This PR adds code that sorts the centroids before they are indexed into the target index, to avoid indexing failures.

Fixes #139382
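
As a rough illustration of the fix described above, here is a minimal, self-contained sketch. The `Centroid` record, `isSortedByMean`, and `sortedCopy` are hypothetical stand-ins invented for this example. They are not the Elasticsearch or t-digest classes, and the real validation in the histogram field mapper may differ in detail.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SortedHistogramSketch {
    // Hypothetical stand-in for a merged centroid: a mean value and the number of samples it represents.
    record Centroid(double mean, long count) {}

    // Hypothetical check modelling the "values must be sorted" requirement at index time.
    static boolean isSortedByMean(List<Centroid> centroids) {
        for (int i = 1; i < centroids.size(); i++) {
            if (centroids.get(i).mean() < centroids.get(i - 1).mean()) {
                return false;
            }
        }
        return true;
    }

    // Returns a copy ordered by mean; each count stays attached to its own mean.
    static List<Centroid> sortedCopy(List<Centroid> merged) {
        List<Centroid> sorted = new ArrayList<>(merged);
        sorted.sort(Comparator.comparingDouble(Centroid::mean));
        return sorted;
    }

    public static void main(String[] args) {
        // A merge result whose centroids came back out of order.
        List<Centroid> merged = List.of(new Centroid(3.0, 2), new Centroid(1.0, 5), new Centroid(2.0, 1));
        System.out.println("merged sorted? " + isSortedByMean(merged));
        System.out.println("after sort:    " + isSortedByMean(sortedCopy(merged)));
    }
}
```

Sorting the centroid objects, rather than sorting the values array on its own, keeps each count paired with the value it belongs to.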

@gmarouli gmarouli added >test Issues or PRs that are addressing/adding tests auto-backport Automatically create backport pull requests when merged :StorageEngine/Downsampling Downsampling (replacement for rollups) - Turn fine-grained time-based data into coarser-grained data branch:9.3 labels Jan 15, 2026
@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

}
sortedCentroids.sort(Centroid::compareTo);
double[] values = sortedCentroids.stream().mapToDouble(Centroid::mean).toArray();
long[] counts = sortedCentroids.stream().mapToLong(Centroid::count).toArray();
Member

Nit: consider replacing the streams with a loop. IIRC streams are inefficient, and downsampling is CPU-intensive.
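
For illustration, a loop-based version of the suggestion above might look like the sketch below. The `Centroid` and `HistogramArrays` records and the `toArrays` helper are hypothetical names made up for this example, and this is not necessarily the code that was merged. Whether streams are measurably slower on this path is not established in the thread.

```java
import java.util.List;

public class CentroidArraysSketch {
    // Hypothetical stand-in for the centroid type used in the snippet under review.
    record Centroid(double mean, long count) {}

    // Hypothetical holder for the two parallel arrays that end up in the histogram.
    record HistogramArrays(double[] values, long[] counts) {}

    // Fills both arrays in a single pass instead of streaming over the list twice.
    static HistogramArrays toArrays(List<Centroid> sortedCentroids) {
        int size = sortedCentroids.size();
        double[] values = new double[size];
        long[] counts = new long[size];
        int i = 0;
        for (Centroid centroid : sortedCentroids) {
            values[i] = centroid.mean();
            counts[i] = centroid.count();
            i++;
        }
        return new HistogramArrays(values, counts);
    }

    public static void main(String[] args) {
        List<Centroid> sorted = List.of(new Centroid(1.0, 5), new Centroid(2.0, 1), new Centroid(3.0, 2));
        HistogramArrays arrays = toArrays(sorted);
        System.out.println(java.util.Arrays.toString(arrays.values()));
        System.out.println(java.util.Arrays.toString(arrays.counts()));
    }
}
```

The single pass also preallocates both arrays at their final size, so nothing is resized or boxed along the way.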

@gmarouli gmarouli added >bug and removed >test Issues or PRs that are addressing/adding tests labels Jan 16, 2026
@elasticsearchmachine
Collaborator

Hi @gmarouli, I've created a changelog YAML for you.

@gmarouli gmarouli added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Jan 16, 2026
@elasticsearchmachine elasticsearchmachine merged commit 13fd56c into elastic:main Jan 16, 2026
35 checks passed
@gmarouli gmarouli deleted the fix-unordered-centroids branch January 16, 2026 12:15
@elasticsearchmachine
Collaborator

💚 Backport successful

Status Branch Result
9.3

gmarouli added a commit to gmarouli/elasticsearch that referenced this pull request Jan 16, 2026
…0771)

elasticsearchmachine pushed a commit that referenced this pull request Jan 16, 2026
…140829)

spinscale pushed a commit to spinscale/elasticsearch that referenced this pull request Jan 21, 2026
…0771)


Labels

auto-backport Automatically create backport pull requests when merged auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) >bug :StorageEngine/Downsampling Downsampling (replacement for rollups) - Turn fine-grained time-based data into coarser-grained data Team:StorageEngine v9.3.1 v9.4.0

Development

Successfully merging this pull request may close these issues.

[CI] DownsampleIT testAggregateMethod failing

3 participants