[ESQL] Remove Named Expected Types map from testing infrastructure #111213

Merged
elasticsearchmachine merged 7 commits into elastic:main from not-napoleon:esql-remove-named-expected-types
Jul 24, 2024

Conversation

@not-napoleon
Member

This removes the NAMED_EXPECTED_TYPES map from the testing infrastructure. It had become difficult to maintain, and it pushed the error message text farther away from the code actually testing it. This PR introduces some small functional interfaces that give scalar function tests more fine-grained control over their expected error messages.

We could do more here, but I think this is a good step in the right direction.
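
For readers without the diff open, here is a minimal sketch of what such a functional interface could look like. The interface name and the `(v, p)` lambda shape appear in the review below. The method name `apply`, its parameter types, and the use of plain strings in place of the real data type class are assumptions made for illustration only.

```java
import java.util.Set;

// Sketch only: not the PR's actual code. Method name and parameter types are assumed.
class PositionalErrorSketch {

    /** Maps (valid types for a position, position index) to the expected type text of the error. */
    @FunctionalInterface
    interface PositionalErrorMessageSupplier {
        String apply(Set<String> validForPosition, int position);
    }

    // A per-function supplier that lives next to the test, mirroring the lambda quoted in the review.
    static final PositionalErrorMessageSupplier STRING_THEN_DATETIME = (valid, position) -> switch (position) {
        case 0 -> "string";
        case 1 -> "datetime";
        default -> "";
    };

    public static void main(String[] args) {
        System.out.println(STRING_THEN_DATETIME.apply(Set.of("keyword", "text"), 0)); // prints "string"
        System.out.println(STRING_THEN_DATETIME.apply(Set.of("datetime"), 1));        // prints "datetime"
    }
}
```

The point of the design is that each test states its own expected text, instead of looking it up in a shared map keyed by the set of valid types.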

not-napoleon added the >test (Issues or PRs that are addressing/adding tests), :Analytics/ES|QL (AKA ESQL), and v8.16.0 labels on Jul 23, 2024
not-napoleon requested a review from nik9000 on July 23, 2024, 21:24
elasticsearchmachine added the Team:Analytics (Meta label for analytical engine team (ESQL/Aggs/Geo)) label on Jul 23, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

}

@FunctionalInterface
protected interface PositionalErrorMessageSupplier {
Member

Probably wants javadoc given the number of places we're using it.

boolean entirelyNullPreservesType,
- List<TestCaseSupplier> suppliers
+ List<TestCaseSupplier> suppliers,
+ PositionalErrorMessageSupplier positionalErrorMessageSupplier
Member

👍 for giving this a name.

(v, p) -> switch (p) {
case 0 -> "string";
case 1 -> "datetime";
default -> "";
Member

Should this throw on an unknown position?

Member Author

We only reach the default if the code that figures out which position has the bad argument comes up with a number higher than the number of arguments, for example if it says "the fourth argument to + is bad". That should really never happen, and I didn't put a lot of thought into what to do if it did; I just made the switch expression happy.

Member

I tend to make them happy by throwing if I don't expect it. I suppose in this case it's not a big difference.
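
The alternative suggested in this thread could look roughly like the sketch below: the `default` branch throws instead of returning an empty string, so an impossible position fails loudly. The class name, method name, and exception message are hypothetical and not taken from the PR.

```java
// Sketch only: a throwing default for positions the test never expects to see.
class ThrowingDefaultSketch {

    // Hypothetical helper; the two known positions mirror the lambda under review.
    static String expectedTypeForPosition(int position) {
        return switch (position) {
            case 0 -> "string";
            case 1 -> "datetime";
            // Reaching this branch would mean the position-finding code is itself wrong.
            default -> throw new IllegalStateException("unexpected argument position [" + position + "]");
        };
    }

    public static void main(String[] args) {
        System.out.println(expectedTypeForPosition(1)); // prints "datetime"
        try {
            expectedTypeForPosition(3);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // prints "unexpected argument position [3]"
        }
    }
}
```

Either variant satisfies the compiler's exhaustiveness check for the switch expression. The difference is only in how a bug in the surrounding infrastructure would surface.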

geoShape(cases, "mv_first", "MvFirst", DataType.GEO_SHAPE, (size, values) -> equalTo(values.findFirst().get()));
cartesianShape(cases, "mv_first", "MvFirst", DataType.CARTESIAN_SHAPE, (size, values) -> equalTo(values.findFirst().get()));
- return parameterSuppliersFromTypedDataWithDefaultChecks(false, cases);
+ return parameterSuppliersFromTypedDataWithDefaultChecks(false, cases, (v, p) -> "");
Member

Should this throw? Maybe say something like all types are valid or something.

Member Author

I mean, this will fail if a type ends up not being valid. The function will generate an error message containing the string "representable" here, which will not match the test's expected empty string. In my opinion, throwing from here doesn't really make that any clearer, but I'm open to discussing it.
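
The argument in this reply can be illustrated with a small sketch: if the test's supplier returns an empty string but the function under test reports "representable", the two messages differ and the comparison fails on its own. The message format below is a simplified stand-in invented for this example, not the exact text produced by ES|QL.

```java
// Sketch only: shows why an empty expected-type string still produces a test failure.
class EmptyExpectationSketch {

    // Hypothetical message format, assumed for illustration.
    static String typeErrorMessage(String expectedTypes, String actualType) {
        return "argument must be [" + expectedTypes + "], found type [" + actualType + "]";
    }

    public static void main(String[] args) {
        // What the function's own type check would report for a newly invalid type.
        String fromFunction = typeErrorMessage("representable", "some_new_type");
        // What the test would expect when its supplier is (v, p) -> "".
        String fromTest = typeErrorMessage("", "some_new_type");
        System.out.println(fromFunction.equals(fromTest)); // prints "false"
    }
}
```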

unsignedLongs(cases, "mv_max", "MvMax", (size, values) -> equalTo(values.reduce(BigInteger::max).get()));
dateTimes(cases, "mv_max", "MvMax", (size, values) -> equalTo(values.max().getAsLong()));
- return parameterSuppliersFromTypedDataWithDefaultChecks(false, cases);
+ return parameterSuppliersFromTypedDataWithDefaultChecks(false, cases, (v, p) -> "representableNonSpatial");
Member

Oh boy.

o,
v,
t,
(l, p) -> "datetime, double, integer, ip, keyword, long, text, unsigned_long or version"
Member

This might be better off returning a closure. But not sure.

Member Author

I poked around at it a bit, but I didn't think it looked all that much better. Please feel free to submit a follow-up that cleans it up if you want.

Member

🤘

Contributor

ivancea left a comment

LGTM.

However, I feel like we're missing the possibility to standardize the error messages here. The current solution is far from ideal, but it somehow guides you into using a specific format. I wonder if we could move towards stricter generation, like using the @Param data to generate it.

Just thinking out loud. The only part I don't "like" here is that we're decentralizing the messages, which may make it more tedious to undo later, if we want to do that.

unsignedLongs(cases, "mv_min", "MvMin", (size, values) -> equalTo(values.reduce(BigInteger::min).get()));
dateTimes(cases, "mv_min", "MvMin", (size, values) -> equalTo(values.min().getAsLong()));
- return parameterSuppliersFromTypedDataWithDefaultChecks(false, cases);
+ return parameterSuppliersFromTypedDataWithDefaultChecks(false, cases, (v, p) -> "representableNonSpatial");
Contributor

I think this message is wrong (?), shouldn't it be human-readable?

Member

It is wrong, but let's grab it in a follow-up, I think.

Member Author

Yes, it should. But again, this PR just puts the string into the test; it doesn't change the string that the function currently sends on a type error. I do think it's worth spending some time to review our type errors and update them, but that's not the goal of this PR.

@nik9000
Member

nik9000 commented Jul 24, 2024

However, I feel like we're missing the possibility to standardize the error messages here.

I think we can get that by making some "canned" error message suppliers or something like that. That way, places that aren't standard can use this, and places that are can provide the closure with a named method call. But I think this is a step in the right direction.
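
One way to read the "canned suppliers" idea is sketched below: named factory methods return the closure, so standard cases share wording while unusual functions can still pass a hand-written lambda. Everything here is hypothetical, including the factory names `sameForAllPositions` and `perPosition` and the assumed `apply` signature; none of it is existing Elasticsearch test API.

```java
import java.util.Set;

// Sketch only: canned suppliers exposed through named methods that return closures.
class CannedSuppliersSketch {

    @FunctionalInterface
    interface PositionalErrorMessageSupplier {
        String apply(Set<String> validForPosition, int position); // assumed signature
    }

    // Canned supplier for functions whose arguments all accept the same types.
    static PositionalErrorMessageSupplier sameForAllPositions(String expected) {
        return (valid, position) -> expected;
    }

    // Canned supplier with one expected text per argument position.
    static PositionalErrorMessageSupplier perPosition(String... expectedByPosition) {
        return (valid, position) -> {
            if (position < 0 || position >= expectedByPosition.length) {
                throw new IllegalStateException("unexpected argument position [" + position + "]");
            }
            return expectedByPosition[position];
        };
    }

    public static void main(String[] args) {
        PositionalErrorMessageSupplier standard = perPosition("string", "datetime");
        PositionalErrorMessageSupplier custom = (valid, position) -> "datetime, double or integer";
        System.out.println(standard.apply(Set.of(), 1)); // prints "datetime"
        System.out.println(custom.apply(Set.of(), 0));   // prints "datetime, double or integer"
    }
}
```

A named call at the use site would also document intent better than a bare `(v, p) -> ""`, which is the concern raised earlier in the review.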

@not-napoleon
Member Author

decentralizing the messages, which may make it more tedious to undo later, if we want to do that

Just to be clear, the code that's being removed was never involved in generating the actual error messages. It was only used to guess the error message for the test to validate. The error messages were always generated within the type checking infrastructure, on each function. See, for example, MvMax#resolveFieldType.

not-napoleon added the auto-merge-without-approval (Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!)) label on Jul 24, 2024
elasticsearchmachine merged commit c5be248 into elastic:main on Jul 24, 2024
not-napoleon deleted the esql-remove-named-expected-types branch on July 24, 2024, 16:39
weizijun added a commit to weizijun/elasticsearch that referenced this pull request Jul 25, 2024
* main: (39 commits)
  Update README.asciidoc (elastic#111244)
  ESQL: INLINESTATS (elastic#109583)
  ESQL: Document a little of `DataType` (elastic#111250)
  Relax assertions in segment level field stats (elastic#111243)
  LogsDB data generator - support nested object field (elastic#111206)
  Validate `Authorization` header in Azure test fixture (elastic#111242)
  Fixing HistoryStoreTests.testPut() and testStoreWithHideSecrets() (elastic#111246)
  [ESQL] Remove Named Expcted Types map from testing infrastructure  (elastic#111213)
  Change visibility of createWriter to allow tests from a different package to override it (elastic#111234)
  [ES|QL] Remove EsqlDataTypes (elastic#111089)
  Mute org.elasticsearch.repositories.azure.AzureBlobContainerRetriesTests testReadNonexistentBlobThrowsNoSuchFileException elastic#111233
  Abstract codec lookup by name, to make CodecService extensible (elastic#111007)
  Add HTTPS support to `AzureHttpFixture` (elastic#111228)
  Unmuting tests related to free_context action being processed in ESSingleNodeTestCase (elastic#111224)
  Upgrade Azure SDK (elastic#111225)
  Collapse transport versions for 8.14.0 (elastic#111199)
  Make sure contender uses logs templates (elastic#111183)
  unmute HistogramPercentileAggregationTests.testBoxplotHistogram (elastic#111223)
  Refactor Quality Assurance test infrastructure (elastic#111195)
  Mute org.elasticsearch.xpack.restart.FullClusterRestartIT testDisableFieldNameField {cluster=UPGRADED} elastic#111222
  ...

# Conflicts:
#	server/src/main/java/org/elasticsearch/TransportVersions.java

Labels

:Analytics/ES|QL AKA ESQL auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) >test Issues or PRs that are addressing/adding tests v8.16.0

4 participants