Improve SortedBinaryDocValues API to know about sparsity and per-document value mode. #139990

Merged
martijnvg merged 12 commits into elastic:main from martijnvg:teach_SingleValueMatchQuery_about_new_mv_binary_dv_format on Jan 6, 2026
@martijnvg
Member

And teach SingleValueMatchQuery to use this to optimize in case a field is single valued.

@martijnvg martijnvg added >non-issue :Analytics/Compute Engine Analytics in ES|QL :StorageEngine/Mapping The storage related side of mappings labels Dec 24, 2025
@martijnvg martijnvg marked this pull request as ready for review December 24, 2025 16:43
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine elasticsearchmachine added Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) Team:StorageEngine labels Dec 24, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

/**
* Indicates the sparsity of the values for this field.
*/
public Sparsity getSparsity() {
Member Author

This is the main change. Optionally, SortedBinaryDocValues can now indicate the value mode and sparsity. The only classes that implement this are SeparateCounts (via the doc value skipper on the count field) and SingletonSortedBinaryDocValues (which is by definition single-valued).

SingleValueMatchQuery then uses this new capability of SortedBinaryDocValues to be more efficient when the value mode is single-valued, and rewrites to match all docs if the sparsity is dense and the value mode is single-valued.
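To make the decision described above concrete, here is a minimal, self-contained sketch. The enum constants and the `Plan` names are hypothetical stand-ins chosen for illustration and are not taken from the PR. The handling of the single-valued but sparse case (match every doc that has a value, without a per-doc count check) is also an assumption about what "more optimal" means here.

```java
// Sketch only: all names below are hypothetical stand-ins, not the PR's actual code.
final class SingleValueRewriteSketch {
    enum ValueMode { SINGLE_VALUED, MULTI_VALUED, UNKNOWN }
    enum Sparsity { DENSE, SPARSE, UNKNOWN }

    // What a single-value match could do once it knows the field's shape.
    enum Plan { MATCH_ALL_DOCS, MATCH_DOCS_WITH_VALUE, CHECK_EACH_DOC }

    static Plan plan(ValueMode mode, Sparsity sparsity) {
        if (mode != ValueMode.SINGLE_VALUED) {
            // Multi-valued or unknown: the value count must be checked per document.
            return Plan.CHECK_EACH_DOC;
        }
        if (sparsity == Sparsity.DENSE) {
            // Every doc has exactly one value, so the query can rewrite to match all docs.
            return Plan.MATCH_ALL_DOCS;
        }
        // Assumption: single-valued but not known to be dense, so any doc with a value matches.
        return Plan.MATCH_DOCS_WITH_VALUE;
    }

    public static void main(String[] args) {
        for (ValueMode mode : ValueMode.values()) {
            for (Sparsity sparsity : Sparsity.values()) {
                System.out.println(mode + " + " + sparsity + " -> " + plan(mode, sparsity));
            }
        }
    }
}
```

Keeping an explicit unknown state in both enums lets implementations that cannot cheaply answer fall back to the per-document check.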

Contributor

@parkertimmins parkertimmins left a comment

Looks great!

public ValueMode getValueMode() {
long minValue = countsSkipper.minValue();
long maxValue = countsSkipper.maxValue();
if (minValue == 1 && maxValue == 1) {
Contributor

The minValue can never be 0, right? If the count is 0, there just won't be a value, so we could check only maxValue == 1. Though perhaps it's clearer as is 🤔
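The equivalence raised in this comment can be checked with a small sketch. It assumes, as the comment does, that stored counts are always at least 1 and that the minimum never exceeds the maximum; the class and method names are hypothetical.

```java
// Sketch only: hypothetical helper names, operating on plain min/max count statistics.
final class ValueModeFromCountsSketch {
    // The check as written in the PR snippet.
    static boolean singleValuedStrict(long minValue, long maxValue) {
        return minValue == 1 && maxValue == 1;
    }

    // The shorter check suggested in the review.
    static boolean singleValuedMaxOnly(long maxValue) {
        return maxValue == 1;
    }

    // Compares both checks over all pairs with 1 <= minValue <= maxValue <= limit.
    static boolean checksAgreeUpTo(long limit) {
        for (long min = 1; min <= limit; min++) {
            for (long max = min; max <= limit; max++) {
                if (singleValuedStrict(min, max) != singleValuedMaxOnly(max)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("checks agree: " + checksAgreeUpTo(8));
    }
}
```

If a count of 0 could ever be stored, the two checks would diverge for minValue 0 and maxValue 1, which is one argument for keeping the explicit form.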

}

public static MultiValuedSortedBinaryDocValues from(BinaryDocValues values, NumericDocValues counts) throws IOException {
public static MultiValuedSortedBinaryDocValues from(
Contributor

I think this would be a bit easier to follow if it were from(BinaryDocValues values, LeafReader leafReader, String countsField), and we then calculate the Sparsity and ValueMode in this method and pass them to the SeparateCounts constructor. Holding on to the Skipper when we're only using it to get the maxValue doesn't quite feel right?
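A rough sketch of the shape suggested here: compute the value mode and sparsity once in the factory and pass only the results to the constructor. To stay self-contained it replaces the Lucene types with a hypothetical `CountsStats` record holding the statistics read from the counts field's skipper. Treating a field as dense when the number of docs with a value equals `maxDoc` is an assumption of this sketch, not something stated in the PR.

```java
// Sketch only: stand-in types; the real factory takes Lucene readers and doc values.
final class SeparateCountsFactorySketch {
    enum ValueMode { SINGLE_VALUED, MULTI_VALUED, UNKNOWN }
    enum Sparsity { DENSE, SPARSE, UNKNOWN }

    // Hypothetical snapshot of the skipper statistics for the counts field.
    record CountsStats(long maxValue, int docCount) {}

    // Holds only the derived facts, not the skipper they were derived from.
    static final class SeparateCounts {
        final ValueMode valueMode;
        final Sparsity sparsity;

        SeparateCounts(ValueMode valueMode, Sparsity sparsity) {
            this.valueMode = valueMode;
            this.sparsity = sparsity;
        }
    }

    static SeparateCounts from(CountsStats stats, int maxDoc) {
        if (stats == null) {
            // No skipper available for the counts field: nothing can be concluded.
            return new SeparateCounts(ValueMode.UNKNOWN, Sparsity.UNKNOWN);
        }
        ValueMode mode = stats.maxValue() == 1 ? ValueMode.SINGLE_VALUED : ValueMode.MULTI_VALUED;
        // Assumption: dense means every doc in the segment has a value.
        Sparsity sparsity = stats.docCount() == maxDoc ? Sparsity.DENSE : Sparsity.SPARSE;
        return new SeparateCounts(mode, sparsity);
    }

    public static void main(String[] args) {
        SeparateCounts counts = from(new CountsStats(1, 100), 100);
        System.out.println(counts.valueMode + " / " + counts.sparsity);
    }
}
```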

Member Author

done via 3e883a0

@martijnvg martijnvg requested a review from romseygeek January 6, 2026 10:56
Contributor

@romseygeek romseygeek left a comment

LGTM

@martijnvg martijnvg merged commit e1ab473 into elastic:main Jan 6, 2026
35 checks passed
sidosera pushed a commit to sidosera/elasticsearch that referenced this pull request Jan 7, 2026
…ument value mode. (elastic#139990)

And teach SingleValueMatchQuery to use this to optimize in case a field is single valued.
elasticsearchmachine pushed a commit that referenced this pull request Jan 12, 2026
The PR at #139990
introduced a snapshot-only feature, but the tests attempt to run this in
release builds. This fix makes sure the new feature is only run in
snapshot builds.

Fixes #140498
spinscale pushed a commit to spinscale/elasticsearch that referenced this pull request Jan 21, 2026
The PR at elastic#139990
introduced a snapshot-only feature, but the tests attempt to run this in
release builds. This fix makes sure the new feature is only run in
snapshot builds.

Fixes elastic#140498
Labels

:Analytics/Compute Engine Analytics in ES|QL >non-issue :StorageEngine/Mapping The storage related side of mappings Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) Team:StorageEngine v9.4.0

4 participants