Downsample exponential histograms#138561
Conversation
|
Pinging @elastic/es-storage-engine (Team:StorageEngine) |
| * Creates a new instance with the OpenTelemetry SDK default bucket limit of {@link ExponentialHistogramMerger#MAX_HISTOGRAM_BUCKETS} | ||
| * @param circuitBreaker the circuit breaker to use to limit memory allocations | ||
| */ | ||
| public static ExponentialHistogramMerger create(ExponentialHistogramCircuitBreaker circuitBreaker) { |
There was a problem hiding this comment.
Great idea to encode this default here!
I'll take care in a follow up to also use this in the ES|QL code.
...rc/main/java/org/elasticsearch/xpack/downsample/ExponentialHistogramMetricFieldProducer.java
Outdated
Show resolved
Hide resolved
…/downsample/ExponentialHistogramMetricFieldProducer.java Co-authored-by: Jonas Kunz <j+github@kunzj.de>
| case GAUGE -> NumericMetricFieldProducer.createFieldProducerForGauge(name(), samplingMethod); | ||
| case COUNTER -> LastValueFieldProducer.createForMetric(name()); | ||
| case HISTOGRAM -> throw new IllegalArgumentException("Unsupported metric type [histogram] for downsampling, coming soon"); | ||
| case HISTOGRAM -> ExponentialHistogramMetricFieldProducer.createMetricProducerForExponentialHistogram( |
There was a problem hiding this comment.
How will we be differentiating between exp histogram, tdigest and legacy histogram?
There was a problem hiding this comment.
How will we be differentiating between exp histogram, tdigest and legacy histogram?
Before I answer this, I would like to double check if I understand correctly. We have two field types for histograms:
exponential_histogramwhich can only be an exponential histogram, andhistogramwhich supports tdigest and hdr,
Is that correct? If yes, we can continue:
We can distinguish an exponential histogram from a legacy histogram from the field type. So, a follow up PR would take the field type into account exponential_histogram vs histogram.
In the first phase of legacy histograms, we will only support tdigest as a metric. So, we have the following situation:
time_series_metric: histogram&type: exponential_histogrammeans merging to an exponential histogram.time_series_metric: histogram&type: histogrammeans merging to a tdigest legacy histogram.- if the histogram is not a metric then it's downsampled with last value which does not need to know the algorithm.
If in the future we decide to support the other merging algorithm of legacy histograms we will need a way to distinguish between the two legacy histograms. It could be an index setting or a mapping setting, we would need to think what is a better way.
Does this answer your question?
There was a problem hiding this comment.
Sounds good, so we'll combine this with type information. Can we add the check in the current version, to restrict this only when time_series_metric: histogram & type: exponential_histogram and use last value otherwise?
Fwiw your understanding is correct - a slight correction is that histogram is not tdigest or hdr but whatever the user provided. We can just process it with these algorithms.
There was a problem hiding this comment.
Sounds good, so we'll combine this with type information. Can we add the check in the current version, to restrict this only when time_series_metric: histogram & type: exponential_histogram and use last value otherwise?
A time_series_metric: histogram only supports exponential histograms. If a user tries to set up the following combination time_series_metric: histogram and type: histogram in their mapping they will get an error. I can add time_series_metric: histogram & type: exponential_histogram to showcase how it's going to be, but the second condition will always be true.
Fwiw your understanding is correct - a slight correction is that histogram is not tdigest or hdr but whatever the user provided. We can just process it with these algorithms.
Clear! Thanks.
|
CI passed but it looks like github did not pick it up. I will update with main just to be sure. |
#139033) In #138561 we added exponential histograms and currently we are working on adding legacy histograms too. Every time we add a new doc value reader we need to add new types of field producer and doc value reader pairs. This PR makes that easier by making `AbstractDownsampleFieldProducer` generic. This is extracted from the work to downsample legacy histograms to facilitate reviewing.
elastic#139033) In elastic#138561 we added exponential histograms and currently we are working on adding legacy histograms too. Every time we add a new doc value reader we need to add new types of field producer and doc value reader pairs. This PR makes that easier by making `AbstractDownsampleFieldProducer` generic. This is extracted from the work to downsample legacy histograms to facilitate reviewing.
This PR implements downsampling for exponential histograms both as labels and as metrics. Marked as a non-issue because exponential histograms are behind a feature flag.
Closes #136259