Improve SortedBinaryDocValues API to know about sparsity and per-document value mode. #139990
…ument value mode. And teach SingleValueMatchQuery to use this to optimize in case a field is single valued.
Pinging @elastic/es-analytical-engine (Team:Analytics)
Pinging @elastic/es-storage-engine (Team:StorageEngine)
/**
 * Indicates the sparsity of the values for this field.
 */
public Sparsity getSparsity() {
This is the main change. SortedBinaryDocValues can now optionally report its value mode and sparsity. The only implementations that do so are SeparateCounts (via the doc values skipper on the counts field) and SingletonSortedBinaryDocValues (single valued by definition).
SingleValueMatchQuery then uses this new capability to run more efficiently when the value mode is single valued, and rewrites to a match-all-docs query when the field is both dense and single valued.
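The decision described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the enum constants, the `Plan` type, and the `plan` helper are hypothetical names, and only `getSparsity()` and `getValueMode()` appear in the diff.

```java
// Hypothetical sketch: the enum constants and the Plan type are illustrative
// assumptions; only getSparsity()/getValueMode() are named in the PR.
final class SingleValueRewriteSketch {
    enum Sparsity { UNKNOWN, SPARSE, DENSE }
    enum ValueMode { UNKNOWN, SINGLE_VALUED, MULTI_VALUED }

    /** What a single-value match query could do for one segment. */
    enum Plan { MATCH_ALL_DOCS, MATCH_DOCS_WITH_VALUE, CHECK_COUNT_PER_DOC }

    static Plan plan(Sparsity sparsity, ValueMode valueMode) {
        if (valueMode != ValueMode.SINGLE_VALUED) {
            // Unknown or multi-valued: fall back to inspecting each document.
            return Plan.CHECK_COUNT_PER_DOC;
        }
        // Single valued: any document that has a value has exactly one,
        // so a dense field matches every document.
        return sparsity == Sparsity.DENSE ? Plan.MATCH_ALL_DOCS : Plan.MATCH_DOCS_WITH_VALUE;
    }

    public static void main(String[] args) {
        for (Sparsity s : Sparsity.values()) {
            for (ValueMode v : ValueMode.values()) {
                System.out.println(s + " + " + v + " -> " + plan(s, v));
            }
        }
    }
}
```

Treating UNKNOWN the same as MULTI_VALUED keeps the fallback safe: an implementation that does not report anything still gets the per-document check.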
public ValueMode getValueMode() {
    long minValue = countsSkipper.minValue();
    long maxValue = countsSkipper.maxValue();
    if (minValue == 1 && maxValue == 1) {
The minValue can never be 0, right? If a document's count were 0, there simply wouldn't be a value for it. So we could just check whether maxValue == 1. Though perhaps it's clearer as is 🤔
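To make the question concrete, here is a small sketch comparing the two checks. It assumes, as the comment suggests but the thread does not confirm, that a stored count is always at least 1; the class and method names are hypothetical.

```java
// Hypothetical sketch comparing the check in the diff with the shorter
// alternative. The "counts are never 0" invariant is an assumption here.
final class ValueModeCheckSketch {
    /** The check as written in the diff. */
    static boolean isSingleValued(long minValue, long maxValue) {
        return minValue == 1 && maxValue == 1;
    }

    /** The shorter alternative: only look at the maximum count. */
    static boolean isSingleValuedMaxOnly(long maxValue) {
        return maxValue == 1;
    }

    /** True if both checks agree for every range with lowestMin <= min <= max <= limit. */
    static boolean checksAgree(long lowestMin, long limit) {
        for (long min = lowestMin; min <= limit; min++) {
            for (long max = min; max <= limit; max++) {
                if (isSingleValued(min, max) != isSingleValuedMaxOnly(max)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(checksAgree(1, 5)); // counts >= 1: prints true
        System.out.println(checksAgree(0, 5)); // a count of 0 allowed: prints false
    }
}
```

The two checks only diverge for the range min = 0, max = 1, so the shorter form is equivalent exactly when the invariant holds.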
…ry_about_new_mv_binary_dv_format
}

public static MultiValuedSortedBinaryDocValues from(BinaryDocValues values, NumericDocValues counts) throws IOException {
public static MultiValuedSortedBinaryDocValues from(
I think this would be a bit easier to follow if the signature were from(BinaryDocValues values, LeafReader leafReader, String countsField), and we then calculated the Sparsity and ValueMode in this method and passed them to the SeparateCounts constructor. Holding on to the skipper when we're only using it to get the maxValue doesn't quite feel right?
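The shape of that suggestion might look like the sketch below: the factory derives both properties once and the instance keeps only the results, not the skipper. Everything here is a stand-in under stated assumptions: `CountsStats` is a hypothetical substitute for what a skipper on the counts field exposes, the enum constants are invented, and deriving sparsity from "documents with a value equals maxDoc" is a guess at one plausible rule rather than what the PR does.

```java
// Hypothetical sketch of an eager factory; not the PR's implementation.
final class SeparateCountsSketch {
    enum Sparsity { UNKNOWN, SPARSE, DENSE }
    enum ValueMode { UNKNOWN, SINGLE_VALUED, MULTI_VALUED }

    /** Hypothetical stand-in for statistics read from a skipper on the counts field. */
    static final class CountsStats {
        final long maxValue; // largest per-document value count
        final int docCount;  // number of documents that have a count

        CountsStats(long maxValue, int docCount) {
            this.maxValue = maxValue;
            this.docCount = docCount;
        }
    }

    private final Sparsity sparsity;
    private final ValueMode valueMode;

    private SeparateCountsSketch(Sparsity sparsity, ValueMode valueMode) {
        this.sparsity = sparsity;
        this.valueMode = valueMode;
    }

    /** Computes both properties up front; a null stats object means no skipper is available. */
    static SeparateCountsSketch from(CountsStats stats, int maxDoc) {
        if (stats == null) {
            return new SeparateCountsSketch(Sparsity.UNKNOWN, ValueMode.UNKNOWN);
        }
        // Assumption: the field is dense when every document in the segment has a count.
        Sparsity sparsity = stats.docCount == maxDoc ? Sparsity.DENSE : Sparsity.SPARSE;
        // Assumption: counts are never 0, so checking the maximum is enough.
        ValueMode valueMode = stats.maxValue == 1 ? ValueMode.SINGLE_VALUED : ValueMode.MULTI_VALUED;
        return new SeparateCountsSketch(sparsity, valueMode);
    }

    Sparsity getSparsity() {
        return sparsity;
    }

    ValueMode getValueMode() {
        return valueMode;
    }

    public static void main(String[] args) {
        SeparateCountsSketch dense = from(new CountsStats(1, 100), 100);
        System.out.println(dense.getSparsity() + " " + dense.getValueMode());
    }
}
```

With this shape, getSparsity() and getValueMode() become plain field reads, and the skipper's lifetime ends inside the factory.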
…ument value mode. (elastic#139990) And teach SingleValueMatchQuery to use this to optimize in case a field is single valued.
The PR at elastic#139990 introduced a snapshot-only feature, but the tests attempt to run it in release builds. This fix makes sure the new feature only runs in snapshot builds. Fixes elastic#140498.