LogsDB qa tests - add specific matcher for source#111568
Conversation
|
Pinging @elastic/es-storage-engine (Team:StorageEngine) |
| boolean match(List<Object> actual, List<Object> expected); | ||
|
|
||
| class HalfFloatMatcher implements FieldSpecificMatcher { | ||
| public boolean match(List<Object> actual, List<Object> expected) { |
| if (currentField instanceof Map<?, ?> map) { | ||
| descend(pathToCurrentField, (Map<String, Object>) map, flattened); | ||
| } else { | ||
| flattened.putIfAbsent(pathToCurrentField, new ArrayList<>()); |
There was a problem hiding this comment.
nit: Can we use computeIfAbsent().add()
| this.fieldSpecificMatchers = Map.of("half_float", new FieldSpecificMatcher.HalfFloatMatcher()); | ||
| } | ||
|
|
||
| public MatchResult match() { |
| import java.util.stream.Collectors; | ||
|
|
||
| public interface FieldSpecificMatcher { | ||
| boolean match(List<Object> actual, List<Object> expected); |
There was a problem hiding this comment.
Can this method return a MatchResult instead of a boolean?
There was a problem hiding this comment.
I wanted to do that but to produce full message in case there is a mismatch you'll need to pass mapping and settings here which ends up being a mouthful.
There was a problem hiding this comment.
Actually it's not a big deal.
| return MatchResult.match(); | ||
| } | ||
|
|
||
| private Optional<MatchResult> matchWithFieldSpecificMatcher(String fieldName, List<Object> actualValues, List<Object> expectedValues) { |
There was a problem hiding this comment.
Can this method return the specific matcher instead of MatchResult
There was a problem hiding this comment.
There is no common interface between a field specific matcher and generic matcher. Maybe there should be but i don't immediately see the benefit.
| continue; | ||
| } | ||
|
|
||
| flattened.putIfAbsent(pathFromRoot, new HashMap<>()); |
There was a problem hiding this comment.
Nit: combine the two lines:
flattened.computeIfAbsent(pathFromRoot, new HashMap<>()).put(entry.getKey(), entry.getValue());
There was a problem hiding this comment.
Neat, didn't expect this from java.
| * @return | ||
| */ | ||
| public static Map<String, Map<String, Object>> normalizeMapping(Map<String, Object> map) { | ||
| var flattened = new HashMap<String, Map<String, Object>>(); |
There was a problem hiding this comment.
I'm somewhat puzzled here, why not just use HashMap<String, Object> to track just the leaf fields?
There was a problem hiding this comment.
I see, you do that for source checking while you also want to compare the mappings. Need to think a bit about it..
There was a problem hiding this comment.
This is a map from normalized field name (a.b.c) to a map of mapping parameters (like type) since there can be multiple.
| private Optional<MatchResult> matchWithFieldSpecificMatcher(String fieldName, List<Object> actualValues, List<Object> expectedValues) { | ||
| var actualFieldMapping = actualNormalizedMapping.get(fieldName); | ||
| if (actualFieldMapping == null) { | ||
| // Dynamic mapping, nothing to do |
There was a problem hiding this comment.
Check that expectedNormalizedMapping.get(fieldName) returns null too?
|
|
||
| // Synthetic source modifications: | ||
| // * null values are not present | ||
| // * duplicates are removed |
There was a problem hiding this comment.
We probably need to add an extension to randomization to inject duplicates..
In general, it'd be nice to have some values repeated randomly, esp for hostname.,
There was a problem hiding this comment.
That's a very good idea!
* upstream/main: (132 commits) Fix compile after several merges Update docs with new behavior on skip conditions (elastic#111640) Skip on any instance of node or version features being present (elastic#111268) Skip on any node capability being present (elastic#111585) [DOCS] Publishes Anthropic inference service docs. (elastic#111619) Introduce `ChunkedZipResponse` (elastic#109820) [Gradle] fix esql compile cacheability (elastic#111651) Mute org.elasticsearch.datastreams.logsdb.qa.StandardVersusLogsIndexModeChallengeRestIT testTermsQuery elastic#111666 Mute org.elasticsearch.datastreams.logsdb.qa.StandardVersusLogsIndexModeChallengeRestIT testMatchAllQuery elastic#111664 Mute org.elasticsearch.xpack.esql.analysis.VerifierTests testMatchCommand elastic#111661 Mute org.elasticsearch.xpack.esql.optimizer.LocalPhysicalPlanOptimizerTests testMatchCommandWithMultipleMatches {default} elastic#111660 Mute org.elasticsearch.xpack.esql.optimizer.LocalPhysicalPlanOptimizerTests testMatchCommand {default} elastic#111659 Mute org.elasticsearch.xpack.esql.optimizer.LocalPhysicalPlanOptimizerTests testMatchCommandWithWhereClause {default} elastic#111658 LogsDB qa tests - add specific matcher for source (elastic#111568) ESQL: Move `randomLiteral` (elastic#111647) [ESQL] Clean up UNSUPPORTED type blocks (elastic#111648) ESQL: Remove the `NESTED` DataType (elastic#111495) ESQL: Move more out of esql-core (elastic#111604) Improve MvPSeriesWeightedSum edge case and add more tests (elastic#111552) Add link to flood-stage watermark exception message (elastic#111315) ... # Conflicts: # server/src/main/java/org/elasticsearch/TransportVersions.java
This PR adds a new matcher that specifically handles source of documents. We need a special implementation due to complex structure of source document (multiple layers of documents) and differences between synthetic and stored source.
Such differences in theory could be expressed in generic terms (e.g. add a parameter to ignore nulls to list matcher) but i think this is simpler. Moreover as i discovered previously we need special matchers for some fields due to field-specific differences in synthetic source. To do that we would need a special code path anyway.