Add "did you mean" to ObjectParser by nik9000 · Pull Request #50938 · elastic/elasticsearch

nik9000 · 2020-01-13T20:25:13Z

Check it out:

$ curl -u elastic:password -HContent-Type:application/json -XPOST localhost:9200/test/_update/foo?pretty -d'{
  "dac": {}
}'

{
  "error" : {
    "root_cause" : [
      {
        "type" : "x_content_parse_exception",
        "reason" : "[2:3] [UpdateRequest] unknown field [dac] did you mean [doc]?"
      }
    ],
    "type" : "x_content_parse_exception",
    "reason" : "[2:3] [UpdateRequest] unknown field [dac] did you mean [doc]?"
  },
  "status" : 400
}

The tricky thing about implementing this is that x-content doesn't
depend on Lucene. So this works by creating an extension point for the
error message using SPI. Elasticsearch's server module provides the
"spell checking" implementation.
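
A minimal sketch of the SPI idea described above, not the code from this PR. The interface name `ErrorOnUnknownSketch`, the class `SpiSketch`, and the fallback behaviour are assumptions made for illustration; only the `errorMessage` signature mirrors the one shown in the `SuggestingErrorOnUnknown` diff in this PR. The low-level library asks `java.util.ServiceLoader` for an implementation and falls back to a plain message when none is registered on the classpath.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.ServiceLoader;

public class SpiSketch {
    /**
     * Hypothetical extension point. A module that has access to a spell checker
     * would register an implementation through a META-INF/services entry.
     */
    public interface ErrorOnUnknownSketch {
        String errorMessage(String parserName, String unknownField, Iterable<String> candidates);
    }

    /** Fallback used when no implementation is found: plain message, no suggestion. */
    static final ErrorOnUnknownSketch FALLBACK = (parserName, unknownField, candidates) ->
        String.format(Locale.ROOT, "[%s] unknown field [%s]", parserName, unknownField);

    /** Returns the first registered implementation, or the fallback if there is none. */
    static ErrorOnUnknownSketch load() {
        Iterator<ErrorOnUnknownSketch> found = ServiceLoader.load(ErrorOnUnknownSketch.class).iterator();
        return found.hasNext() ? found.next() : FALLBACK;
    }

    public static void main(String[] args) {
        // With nothing registered, this prints the plain message without a suggestion.
        System.out.println(load().errorMessage("UpdateRequest", "dac", List.of("doc", "upsert")));
    }
}
```

With this shape the library that defines the interface needs no dependency on the module that supplies the suggestions, which is the constraint the description calls out.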

elasticmachine · 2020-01-13T20:25:15Z

Pinging @elastic/es-core-infra (:Core/Infra/Core)

nik9000

I thought about passing the "did you mean" implementation in at ObjectParser construction time. I think that'd mostly work too, but it'd require touching every place we build an ObjectParser which doesn't seem right.

This way does have a funny side effect - when the server is on the classpath you'll get "did you mean" whether or not the request comes from a client. This doesn't seem like a huge problem though.

nik9000 · 2020-01-13T20:26:12Z

libs/x-content/src/main/java/org/elasticsearch/common/xcontent/ObjectParser.java

-        void acceptUnknownField(String parserName, String field, XContentLocation location, XContentParser parser,
-                                Value value, Context context) throws IOException;
+        void acceptUnknownField(ObjectParser<Value, Context> objectParser, String field, XContentLocation location, XContentParser parser,
+                Value value, Context context) throws IOException;


I think passing ObjectParser here is ok because the interface is entirely private already. I could certainly be convinced otherwise though.

nik9000 · 2020-01-13T20:26:37Z

libs/x-content/src/test/java/org/elasticsearch/common/xcontent/ObjectParserTests.java

            XContentParser parser = createParser(JsonXContent.jsonXContent, "{\"not_supported_field\" : \"foo\"}");
            XContentParseException ex = expectThrows(XContentParseException.class, () -> objectParser.parse(parser, s, null));
-            assertEquals(ex.getMessage(), "[1:2] [the_parser] unknown field [not_supported_field], parser not found");
+            assertEquals(ex.getMessage(), "[1:2] [the_parser] unknown field [not_supported_field]");


I could preserve this bit of the message, but I don't think it was really helping anything.

nik9000 · 2020-01-13T20:27:07Z

rest-api-spec/src/main/resources/rest-api-spec/test/update/90_error.yml

+---
+'Misspelled fields get "did you mean"':
+  - do:
+      catch: /\[1:2\] \[UpdateRequest\] unknown field \[dac\] did you mean \[doc\]\?/


I wanted some end to end test and a surprising number of things don't use ObjectParser in the server.

This PR makes it even more compelling that we should migrate as much as possible away from hand-rolled parsing code - it might be worth a divide-and-rule effort like we did with the HLRC or Streamable->Writeable?

Yeah, I think so!

nik9000 · 2020-01-13T20:27:34Z

server/src/main/java/org/elasticsearch/common/xcontent/SuggestingErrorOnUnknown.java

+    @Override
+    public String errorMessage(String parserName, String unknownField, Iterable<String> candidates) {
+        String message = String.format(Locale.ROOT, "[%s] unknown field [%s]", parserName, unknownField);
+        // TODO it'd be nice to combine this with BaseRestHandler's implementation.


This seems like a problem for a follow up PR. I don't think it'd be hard, but a little fiddly.
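
For readers who want to see how a suggestion could be appended to the message built in the diff above, here is a rough sketch. It is not the PR's implementation: the description says the server module relies on Lucene for the "spell checking", whereas this version swaps in a hand-written Levenshtein distance so that it is self-contained. The class name `DidYouMeanSketch`, the `MAX_DISTANCE` cutoff of 2, and the choice to suggest only the single closest candidate are all assumptions.

```java
import java.util.List;
import java.util.Locale;

public class DidYouMeanSketch {
    /** Assumed cutoff: candidates further away than this are not suggested. */
    static final int MAX_DISTANCE = 2;

    /** Classic dynamic-programming Levenshtein edit distance (stand-in for a Lucene spell checker). */
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Builds the base message, then appends the closest candidate if one is within MAX_DISTANCE. */
    static String errorMessage(String parserName, String unknownField, Iterable<String> candidates) {
        String message = String.format(Locale.ROOT, "[%s] unknown field [%s]", parserName, unknownField);
        String best = null;
        int bestDistance = MAX_DISTANCE + 1;
        for (String candidate : candidates) {
            int d = distance(unknownField, candidate);
            if (d < bestDistance) {
                best = candidate;
                bestDistance = d;
            }
        }
        return best == null ? message : message + " did you mean [" + best + "]?";
    }

    public static void main(String[] args) {
        System.out.println(errorMessage("UpdateRequest", "dac", List.of("doc", "upsert", "script")));
        System.out.println(errorMessage("the_parser", "not_supported_field", List.of("doc")));
    }
}
```

The `[2:3]`-style location prefix seen in the example response is not produced here; in this sketch it is assumed to be added by whatever wraps the message in the parse exception.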

nik9000 · 2020-01-13T20:56:56Z

@elasticmachine run elasticsearch-ci/2

nik9000 · 2020-01-13T21:45:23Z

Oh boy some tests failed. I guess I shouldn't be surprised.

romseygeek

This is awesome, @nik9000!

romseygeek · 2020-01-14T13:58:14Z

libs/x-content/src/main/java/org/elasticsearch/common/xcontent/ErrorOnUnknown.java

+ */
+public interface ErrorOnUnknown {
+    /**
+     * The implementation of this interface that was loaded form SPI.


nit: s/form/from/

romseygeek · 2020-01-14T14:01:01Z

rest-api-spec/src/main/resources/rest-api-spec/test/update/90_error.yml

+---
+'Misspelled fields get "did you mean"':
+  - do:
+      catch: /\[1:2\] \[UpdateRequest\] unknown field \[dac\] did you mean \[doc\]\?/


This PR makes it even more compelling that we should migrate as much as possible away from hand-rolled parsing code - it might be worth a divide-and-rule effort like we did with the HLRC or Streamable->Writeable?

nik9000 · 2020-01-14T15:42:17Z

Thanks @romseygeek !

Now that we've backported elastic#50938 to 7.x it should be safe to run its test against BWC clusters that include that branch.

When you declare an ObjectParser with top level named objects like we do with `significant_terms` we didn't support "did you mean". This fixes that. Relates elastic#50938

nik9000 added >enhancement :Core/Infra/Core Core issues without another label v8.0.0 v7.6.0 labels Jan 13, 2020

nik9000 requested review from rjernst and romseygeek January 13, 2020 20:25

nik9000 commented Jan 13, 2020

Checkstyle, you annoying, lovely thing

bf8ef05

Oh boy

5be1e7a

nik9000 added 3 commits January 13, 2020 17:04

Merge branch 'master' into object_parser_did_you_mean

bb36d91

Fixup tests

e97b199

Fix tests?

0a63a71

romseygeek approved these changes Jan 14, 2020

nik9000 added 2 commits January 14, 2020 09:14

Speeling

cc67258

Merge branch 'master' into object_parser_did_you_mean

477e9ef

nik9000 merged commit 5da5f44 into elastic:master Jan 14, 2020

nik9000 added the backport pending label Jan 14, 2020

nik9000 removed the backport pending label Jan 14, 2020

nik9000 added a commit to nik9000/elasticsearch that referenced this pull request Jan 14, 2020

Update skip after backport

009413d

Now that we've backported elastic#50938 to 7.x it should be safe to run its test against BWC clusters that include that branch.

nik9000 mentioned this pull request Jan 14, 2020

Update skip after backport #51015

Merged

nik9000 added a commit that referenced this pull request Jan 15, 2020

Update skip after backport (#51015)

70cee71

Now that we've backported #50938 to 7.x it should be safe to run its test against BWC clusters that include that branch.

nik9000 mentioned this pull request Jan 15, 2020

"did you mean" for ObjectParser with top named #51018

Merged

nik9000 mentioned this pull request Jan 17, 2020

"did you mean" for ObjectParser with top named (backport of #51018) #51165

Merged

SivagurunathanV pushed a commit to SivagurunathanV/elasticsearch that referenced this pull request Jan 23, 2020

Update skip after backport (elastic#51015)

4c824a9

Now that we've backported elastic#50938 to 7.x it should be safe to run its test against BWC clusters that include that branch.

This was referenced Feb 3, 2020

[meta] 7.6 release elastic/elasticsearch-net#4340

Closed

[meta] 7.6 release elastic/elasticsearch-net#4341

Closed

jakelandis added v8.0.0-alpha1 and removed v8.0.0 labels Jul 26, 2021

4 participants