Skip to content

LogsDB qa tests - add specific matcher for source#111568

Merged
lkts merged 2 commits intoelastic:mainfrom
lkts:logsdb_qa_tests_matcher_updates
Aug 6, 2024
Merged

LogsDB qa tests - add specific matcher for source#111568
lkts merged 2 commits intoelastic:mainfrom
lkts:logsdb_qa_tests_matcher_updates

Conversation

@lkts
Copy link
Copy Markdown
Contributor

@lkts lkts commented Aug 2, 2024

This PR adds a new matcher that specifically handles source of documents. We need a special implementation due to complex structure of source document (multiple layers of documents) and differences between synthetic and stored source.

Such differences in theory could be expressed in generic terms (e.g. add a parameter to ignore nulls to list matcher) but i think this is simpler. Moreover as i discovered previously we need special matchers for some fields due to field-specific differences in synthetic source. To do that we would need a special code path anyway.

@lkts lkts added >test Issues or PRs that are addressing/adding tests :StorageEngine/Logs You know, for Logs labels Aug 2, 2024
@lkts lkts requested review from dnhatn and kkrik-es August 2, 2024 21:41
@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

Copy link
Copy Markdown
Member

@dnhatn dnhatn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. Thanks @lkts.

boolean match(List<Object> actual, List<Object> expected);

class HalfFloatMatcher implements FieldSpecificMatcher {
public boolean match(List<Object> actual, List<Object> expected) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Override?

if (currentField instanceof Map<?, ?> map) {
descend(pathToCurrentField, (Map<String, Object>) map, flattened);
} else {
flattened.putIfAbsent(pathToCurrentField, new ArrayList<>());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Can we use computeIfAbsent().add()

this.fieldSpecificMatchers = Map.of("half_float", new FieldSpecificMatcher.HalfFloatMatcher());
}

public MatchResult match() {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: override

import java.util.stream.Collectors;

public interface FieldSpecificMatcher {
boolean match(List<Object> actual, List<Object> expected);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this method return a MatchResult instead of a boolean?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to do that but to produce full message in case there is a mismatch you'll need to pass mapping and settings here which ends up being a mouthful.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually it's not a big deal.

return MatchResult.match();
}

private Optional<MatchResult> matchWithFieldSpecificMatcher(String fieldName, List<Object> actualValues, List<Object> expectedValues) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this method return the specific matcher instead of MatchResult

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no common interface between a field specific matcher and generic matcher. Maybe there should be but i don't immediately see the benefit.

continue;
}

flattened.putIfAbsent(pathFromRoot, new HashMap<>());
Copy link
Copy Markdown
Member

@kkrik-es kkrik-es Aug 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: combine the two lines:

 flattened.computeIfAbsent(pathFromRoot, new HashMap<>()).put(entry.getKey(), entry.getValue());

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat, didn't expect this from java.

* @return
*/
public static Map<String, Map<String, Object>> normalizeMapping(Map<String, Object> map) {
var flattened = new HashMap<String, Map<String, Object>>();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm somewhat puzzled here, why not just use HashMap<String, Object> to track just the leaf fields?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, you do that for source checking while you also want to compare the mappings. Need to think a bit about it..

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a map from normalized field name (a.b.c) to a map of mapping parameters (like type) since there can be multiple.

private Optional<MatchResult> matchWithFieldSpecificMatcher(String fieldName, List<Object> actualValues, List<Object> expectedValues) {
var actualFieldMapping = actualNormalizedMapping.get(fieldName);
if (actualFieldMapping == null) {
// Dynamic mapping, nothing to do
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check that expectedNormalizedMapping.get(fieldName) returns null too?


// Synthetic source modifications:
// * null values are not present
// * duplicates are removed
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably need to add an extension to randomization to inject duplicates..

In general, it'd be nice to have some values repeated randomly, esp for hostname.,

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a very good idea!

@lkts lkts merged commit aa1d2bc into elastic:main Aug 6, 2024
@lkts lkts deleted the logsdb_qa_tests_matcher_updates branch August 6, 2024 20:49
@lkts lkts mentioned this pull request Aug 6, 2024
weizijun added a commit to weizijun/elasticsearch that referenced this pull request Aug 7, 2024
* upstream/main: (132 commits)
  Fix compile after several merges
  Update docs with new behavior on skip conditions (elastic#111640)
  Skip on any instance of node or version features being present (elastic#111268)
  Skip on any node capability being present (elastic#111585)
  [DOCS] Publishes Anthropic inference service docs. (elastic#111619)
  Introduce `ChunkedZipResponse` (elastic#109820)
  [Gradle] fix esql compile cacheability (elastic#111651)
  Mute org.elasticsearch.datastreams.logsdb.qa.StandardVersusLogsIndexModeChallengeRestIT testTermsQuery elastic#111666
  Mute org.elasticsearch.datastreams.logsdb.qa.StandardVersusLogsIndexModeChallengeRestIT testMatchAllQuery elastic#111664
  Mute org.elasticsearch.xpack.esql.analysis.VerifierTests testMatchCommand elastic#111661
  Mute org.elasticsearch.xpack.esql.optimizer.LocalPhysicalPlanOptimizerTests testMatchCommandWithMultipleMatches {default} elastic#111660
  Mute org.elasticsearch.xpack.esql.optimizer.LocalPhysicalPlanOptimizerTests testMatchCommand {default} elastic#111659
  Mute org.elasticsearch.xpack.esql.optimizer.LocalPhysicalPlanOptimizerTests testMatchCommandWithWhereClause {default} elastic#111658
  LogsDB qa tests - add specific matcher for source (elastic#111568)
  ESQL: Move `randomLiteral` (elastic#111647)
  [ESQL] Clean up UNSUPPORTED type blocks (elastic#111648)
  ESQL: Remove the `NESTED` DataType (elastic#111495)
  ESQL: Move more out of esql-core (elastic#111604)
  Improve MvPSeriesWeightedSum edge case and add more tests (elastic#111552)
  Add link to flood-stage watermark exception message (elastic#111315)
  ...

# Conflicts:
#	server/src/main/java/org/elasticsearch/TransportVersions.java
rjernst pushed a commit to rjernst/elasticsearch that referenced this pull request Aug 7, 2024
cbuescher pushed a commit to cbuescher/elasticsearch that referenced this pull request Sep 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:StorageEngine/Logs You know, for Logs Team:StorageEngine >test Issues or PRs that are addressing/adding tests v8.16.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants