add percentile support to Transform
Related: support for percentile ranks, stats, and extended_stats (see separate issues).
Because a percentiles aggregation outputs multiple values, the transform writes the result as a nested object. The root name is configurable, but the inner names are derived from the configuration. For percentiles with decimal places (e.g. 99.9), the . in the field name is replaced with an _ so that it does not collide with nested objects. Fields can be renamed using an ingest pipeline, e.g. to rename my_percentile.50 to my_median.
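The rename mentioned above could look roughly like the sketch below. The `rename` ingest processor and its `field` / `target_field` options exist in Elasticsearch; the pipeline description and the `apply_rename` helper are hypothetical. The helper is only a local stand-in that mimics the processor on a plain dict and assumes a top-level target field.

```python
import copy

# Ingest pipeline body using the `rename` processor.
# The description text is made up for this illustration.
rename_pipeline = {
    "description": "Rename the 50th percentile to my_median",
    "processors": [
        {"rename": {"field": "my_percentile.50", "target_field": "my_median"}}
    ],
}


def apply_rename(doc: dict, field: str, target_field: str) -> dict:
    """Hypothetical local stand-in for the rename processor.

    Moves the value at the dotted path `field` to the top-level key
    `target_field` (assumption: the target contains no dots).
    """
    out = copy.deepcopy(doc)
    *parents, leaf = field.split(".")
    node = out
    for key in parents:
        node = node[key]          # walk down to the parent object
    out[target_field] = node.pop(leaf)
    return out


processor = rename_pipeline["processors"][0]["rename"]
transformed = apply_rename(
    {"my_percentile": {"10": 604.6, "50": 5673.5}},
    processor["field"],
    processor["target_field"],
)
```

With the sample document, the `50` entry leaves `my_percentile` and reappears as a top-level `my_median` field.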
example configuration:
"my_percentile": {
  "percentiles": {
    "field": "bytes",
    "percents": [
      10,
      50,
      99.9
    ]
  }
}
example output:
"my_percentile" : {
  "99_9" : 9875.0,
  "50" : 5673.5,
  "10" : 604.6000000000001
}
Values are mapped to double by default.
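The naming rule described above can be sketched as follows. This is a minimal illustration, not the transform's actual code: `percentile_field_name` and `nest_percentiles` are hypothetical helpers, and the handling of whole-number floats such as 50.0 is an assumption.

```python
def percentile_field_name(percent: float) -> str:
    """Derive the inner field name for one configured percentile."""
    value = float(percent)
    if value.is_integer():
        # Assumption: whole numbers carry no decimal part (50 or 50.0 -> "50").
        return str(int(value))
    # Replace the decimal point so the name is not read as a nested path.
    return repr(value).replace(".", "_")


def nest_percentiles(root: str, values: dict) -> dict:
    """Wrap percentile results in a nested object under a configurable root."""
    return {root: {percentile_field_name(p): v for p, v in values.items()}}


names = [percentile_field_name(p) for p in (10, 50, 99.9)]
result = nest_percentiles("my_percentile", {10: 604.6, 50: 5673.5, 99.9: 9875.0})
```

For the example configuration, `names` is `["10", "50", "99_9"]`, matching the inner field names in the example output.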
Alternative: Histogram
Elasticsearch 7.6 added a histogram data type. Storing histograms and calculating percentiles on top of them has various advantages.
Transform should support the histogram agg and write the result into a histogram data type.
This alternative should be implemented in addition to the percentiles output. Especially for large data sets, storing histograms allows updating results without full re-processing.
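To illustrate why stored histograms can be updated without full re-processing, here is a minimal sketch. It assumes histograms shaped like the histogram field type (parallel `values` and `counts` arrays). The `merge_histograms` and `percentile` helpers are hypothetical, and the nearest-rank calculation is a simplification, not the algorithm Elasticsearch uses.

```python
import math
from collections import Counter


def merge_histograms(a: dict, b: dict) -> dict:
    """Combine two histograms by summing the counts of equal values."""
    counts = Counter()
    for hist in (a, b):
        for value, count in zip(hist["values"], hist["counts"]):
            counts[value] += count
    values = sorted(counts)
    return {"values": values, "counts": [counts[v] for v in values]}


def percentile(hist: dict, percent: float) -> float:
    """Nearest-rank percentile: smallest value whose cumulative count reaches the rank."""
    total = sum(hist["counts"])
    rank = max(1, math.ceil(percent * total / 100))
    running = 0
    for value, count in zip(hist["values"], hist["counts"]):
        running += count
        if running >= rank:
            return value
    raise ValueError("empty histogram")


# Made-up sample data: a previously stored histogram and one built from new documents.
stored = {"values": [100.0, 500.0], "counts": [3, 1]}
incoming = {"values": [500.0, 900.0], "counts": [2, 4]}
merged = merge_histograms(stored, incoming)
median = percentile(merged, 50)
```

Only the new documents have to be aggregated into `incoming`; the older data is represented by `stored` and never re-read, and percentiles are computed from the merged result.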