Move semicolon hack into lexer by nik9000 · Pull Request #18931 · elastic/elasticsearch

nik9000 · 2016-06-16T21:36:13Z

Previously we used token-level lookbehind in the parser. That worked,
but only if the parser didn't have any ambiguity at all. Since the
parser has ambiguity, it didn't work everywhere. In particular, it failed
when parsing blocks in lambdas like `a -> {int b = a + 2; b * b}`.

This moves the hack from the parser into the lexer. There we can use
token lookbehind (same trick) to insert semicolons into the token
stream. This works much better for ANTLR because ANTLR's prediction
code can work with real tokens.

Also, the lexer is simpler than the parser, so if a hack has to live
somewhere, the lexer is the better place for it.
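
For readers who want to see the idea in isolation, here is a minimal sketch of lookbehind-based semicolon insertion over a token stream. It is not the code from this PR and does not use ANTLR: the class `SemicolonInsertion`, its toy `tokenize` method, and the exact rule (insert a `;` before a `}` unless the previous token is `;`, `{`, or `}`) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a lexer wrapper; all names here are illustrative.
final class SemicolonInsertion {

    // Toy tokenizer: words/numbers, the "->" arrow, and single-character punctuation.
    static List<String> tokenize(String source) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < source.length()) {
            char c = source.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (Character.isLetterOrDigit(c) || c == '_') {
                int start = i;
                while (i < source.length()
                        && (Character.isLetterOrDigit(source.charAt(i)) || source.charAt(i) == '_')) {
                    i++;
                }
                tokens.add(source.substring(start, i));
            } else if (c == '-' && i + 1 < source.length() && source.charAt(i + 1) == '>') {
                tokens.add("->");
                i += 2;
            } else {
                tokens.add(String.valueOf(c));
                i++;
            }
        }
        return tokens;
    }

    // Lookbehind rule (assumed): a '}' needs a semicolon in front of it unless
    // the previous token already ends a statement or opens/closes a block.
    static boolean needsSemicolon(String previous, String next) {
        if (previous == null || !next.equals("}")) {
            return false;
        }
        switch (previous) {
            case ";": // already terminated
            case "{": // empty block
            case "}": // nested block just closed
                return false;
            default:
                return true;
        }
    }

    // Walks the stream once, remembering the previous token and emitting
    // a real ";" token wherever the rule fires.
    static List<String> insertSemicolons(List<String> tokens) {
        List<String> out = new ArrayList<>();
        String previous = null;
        for (String next : tokens) {
            if (needsSemicolon(previous, next)) {
                out.add(";");
            }
            out.add(next);
            previous = next;
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = tokenize("a -> {int b = a + 2; b * b}");
        System.out.println(String.join(" ", insertSemicolons(tokens)));
    }
}
```

Because the downstream consumer only ever sees ordinary `;` tokens, a grammar can require the semicolon unconditionally and needs no special case for the last statement in a block.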

nik9000 · 2016-06-16T21:37:17Z

@rmuir this fixes `;`s inside of lambdas.

@jdconrad, this is for you to review. ANTLR likes this way better because it doesn't have anything tricky in the parser grammar.

jdconrad · 2016-06-17T17:57:08Z

modules/lang-painless/src/main/antlr/PainlessParser.g4

Nit: please drop the semicolon down a line to conform to the style of the other multi-line rules.

jdconrad · 2016-06-17T18:02:18Z

Couple of minor comments, otherwise +1.

nik9000 · 2016-06-17T20:21:25Z

Thanks for reviewing, @jdconrad! I renamed the class, fixed the style, and merged. We can rename the class again if anyone comes up with a better name.

nik9000 added review v5.0.0-alpha4 labels Jun 16, 2016

nik9000 assigned jdconrad Jun 16, 2016

clintongormley added the >enhancement label Jun 17, 2016

clintongormley changed the title ~~Painless: move semicolon hack into lexer~~ Move semicolon hack into lexer Jun 17, 2016

jdconrad reviewed Jun 17, 2016

nik9000 force-pushed the painless_parser_ambiguity branch from fb225ee to 1e16c22 June 17, 2016 20:18

nik9000 merged commit 1e16c22 into elastic:master Jun 17, 2016

jdconrad mentioned this pull request Jun 17, 2016 in Ongoing Painless Improvements #17992 (closed, 18 tasks)

clintongormley added :Core/Infra/Scripting Scripting abstractions, Painless, and Mustache and removed :Plugin Lang Painless labels Feb 14, 2018
