Add reusable HistogramValue object #49799
Conversation
an object per document
|
Pinging @elastic/es-analytics-geo (:Analytics/Aggregations) |
|
|
||
| /** reset the value for the histogram */ | ||
| void reset(BytesRef bytesRef) { | ||
| streamInput = new ByteBufferStreamInput(ByteBuffer.wrap(bytesRef.bytes, bytesRef.offset, bytesRef.length)); |
There was a problem hiding this comment.
There is little value in reusing the histogram if you still create new inputs here. You might want to have a look at ByteArrayDataInput#reset.
There was a problem hiding this comment.
Yes, I was a bit annoyed with that. Still now I am encoding doubles as longs and using ByteArrayDataInput for deserialising. I found a bit weird I am using different family of Input/Output classes to read / write. Is that ok / safe?
There was a problem hiding this comment.
Good question. I have a slight preference for updating the write logic to be symmetric, but could be convinced either way.
There was a problem hiding this comment.
I think this is a bit a catch 22.
The ByteArrayDataOutput needs to be provided an array before hand so you need to know the size of the serialise object before hand. ByteBufferStreamOutput abstract out all that complexity.
The ByteBufferStreamInput is not reusable, ByteArrayDataInput is.
Maybe I am missing something but it seems something is missing. I am seeing this pattern where we are using more complex binary doc values and it seems logic to have a strategy to be reusing those wrappers, wdyt?
There was a problem hiding this comment.
I was thinking of using ByteBuffersDataOutput.
jpountz
left a comment
There was a problem hiding this comment.
I left a minor comment, LGTM otherwise. Feel free to push without further reviews.
| streamOutput.writeVInt(count); | ||
| streamOutput.writeDouble(values.get(i)); | ||
| dataOutput.writeVInt(count); | ||
| dataOutput.writeLong(NumericUtils.doubleToSortableLong(values.get(i))); |
There was a problem hiding this comment.
Since we don't need ordering, let's just do Double.doubleToRawLongBits, which is cheaper.
| public boolean next() { | ||
| if (streamInput.eof() == false) { | ||
| count = streamInput.readVInt(); | ||
| value = NumericUtils.sortableLongToDouble(streamInput.readLong()); |
There was a problem hiding this comment.
and use Double.longBitsToDouble here.
Adds a reusable implementation of HistogramValue so we do not create an object per document.
In #49683 a new field mapper was introduced which supports percentile aggregations via binary doc values. Those are complex values that are interface to the user via HistogramValue interface.
This field mapper generates the doc values and it currently creates an object per doc value of type HistogramValue. This PR adds a new class InternalHistogramValue that implements HistogramValue which can be reused so we create one object per segment instead of per document.