Add synthetic_source support to aggregate_metric_double fields#88909
Add synthetic_source support to aggregate_metric_double fields#88909csoulios merged 14 commits intoelastic:mainfrom
synthetic_source support to aggregate_metric_double fields#88909Conversation
|
Hi @csoulios, I've created a changelog YAML for you. |
|
Pinging @elastic/es-search (Team:Search) |
| public Leaf leaf(LeafReader reader, int[] docIdsInLeaf) throws IOException { | ||
| Map<Metric, SortedNumericDocValues> metricDocValues = new EnumMap<>(Metric.class); | ||
| for (Metric m : metrics) { | ||
| SortedNumericDocValues dv = dv(reader, m); | ||
| if (dv != null) { | ||
| metricDocValues.put(m, dv); | ||
| } | ||
| } | ||
|
|
||
| if (metricDocValues.isEmpty()) { | ||
| return SourceLoader.SyntheticFieldLoader.NOTHING_LEAF; | ||
| } | ||
|
|
||
| return new AggregateMetricSyntheticFieldLoader.ImmediateLeaf(metricDocValues); | ||
| } |
There was a problem hiding this comment.
I could not find an easy way to leverage the NumericSyntheticFieldLoader class, so I had to rewrite the same implementation so that it uses one SortedNumericDocValues instance per metric.
I skipped the singleton optimization because having mutliple docs in leaf requires an array of values. This combined with multiple metrics per doc (a Map<Metric, SortedNumericDocValues> will make the code extremely complex. I can do it if we think this optimization is worth the complexity
| NumericDocValues single = reader.getNumericDocValues(fieldName); | ||
| if (single != null) { | ||
| return DocValues.singleton(single); | ||
| } | ||
| return null; |
There was a problem hiding this comment.
Since all metric subfields are stored using the SortedNumericDocValuesField, all doc values will be stored as SortedNumericDocValues. So, I am not sure reader.getNumericDocValues(fieldName) will ever return non-null values
There was a problem hiding this comment.
Try removing it! I believe if you store single values it'll flip to NumericDocValues anyway.
If you do have to keep loading from both of there it's probably worth making this method static and public in NumericFieldMapper or something like that.
| return; | ||
| } | ||
| b.startObject(simpleName); | ||
| for (Map.Entry<Metric, SortedNumericDocValues> entry : metricDocValues.entrySet()) { |
There was a problem hiding this comment.
metricDocValues is an EnumMap instance. This means that the order that sub fields will be returned in the source, will be the order in the Metric enum. Order ismin,max, sum, value_count. It is not alphabetical, but it will always be consistent. Does it make sense to change this so subfields are retuned in an alphabetical order?
There was a problem hiding this comment.
It's cool. Consistency is lovely for reading these docs. Alphabetical just gets the consistency in an easy way. No need to do it here.
nik9000
left a comment
There was a problem hiding this comment.
I left a few questions. I think this works fine though! I'm just poking around the edges, suggesting some maybe silly ideas. Either way, I think it's right.
|
|
||
| /** | ||
| * Return a delegate field type for the default metric sub-field | ||
| * |
There was a problem hiding this comment.
Does your formatter do this? I think it's more standard, but I don't tend to do it myself.
There was a problem hiding this comment.
If I undertand the question well, this has nothing to do with formatting the field in the output.
This has to do with the following:
- Querying the field: When running a
termor arangequery, the query is delegated to the subfield metric that has been marked asdefault_metricin the field mapping. - When retrieving the field using the fields api, the value of the
default_metricis returned (this will change when Support aggregate_double_metric fields in the Field API #88534 is merged)
There was a problem hiding this comment.
Sorry, no, my question was about the extra line inserted into the javadoc.
There was a problem hiding this comment.
Looks like it was me messing with IntelliJ formatting. Reverted it now.
| NumericDocValues single = reader.getNumericDocValues(fieldName); | ||
| if (single != null) { | ||
| return DocValues.singleton(single); | ||
| } | ||
| return null; |
There was a problem hiding this comment.
Try removing it! I believe if you store single values it'll flip to NumericDocValues anyway.
If you do have to keep loading from both of there it's probably worth making this method static and public in NumericFieldMapper or something like that.
| return; | ||
| } | ||
| b.startObject(simpleName); | ||
| for (Map.Entry<Metric, SortedNumericDocValues> entry : metricDocValues.entrySet()) { |
There was a problem hiding this comment.
It's cool. Consistency is lovely for reading these docs. Alphabetical just gets the consistency in an easy way. No need to do it here.
...in/java/org/elasticsearch/xpack/aggregatemetric/mapper/AggregateDoubleMetricFieldMapper.java
Outdated
Show resolved
Hide resolved
|
@elasticmachine run elasticsearch-ci/packaging-tests-unix-sample |
|
@elasticmachine update branch |
nik9000
left a comment
There was a problem hiding this comment.
I left a tiny thing about switching types on the tracked map. Otherwise LGTM.
|
|
||
| @Override | ||
| public boolean empty() { | ||
| return metricDocValues.isEmpty(); |
There was a problem hiding this comment.
This is never empty, right? You could maybe just return false here. And In the ctor you could assert metricDocValues.empty == false
| // it is enough to check for the first docValue. However, in the future | ||
| // we may relax the requirement of all metrics existing. In this case | ||
| // we should check the doc value for each metric separately | ||
| dvHasValue = new EnumMap<>(Metric.class); |
There was a problem hiding this comment.
Could you use an EnumSet instead? And maybe clear it instead of recreate it?
| dvHasValue.put(e.getKey(), leafHasValues); | ||
| } | ||
|
|
||
| return hasValues; |
There was a problem hiding this comment.
I think maybe dvHasValue.isEmpty() == false is enough here?
This PR implements
synthetic_sourcesupport to theaggregate_metric_doublefield typeRelates to #86603