Skip to content

Move semicolon hack into lexer#18931

Merged
nik9000 merged 1 commit intoelastic:masterfrom
nik9000:painless_parser_ambiguity
Jun 17, 2016
Merged

Move semicolon hack into lexer#18931
nik9000 merged 1 commit intoelastic:masterfrom
nik9000:painless_parser_ambiguity

Conversation

@nik9000
Copy link
Copy Markdown
Member

@nik9000 nik9000 commented Jun 16, 2016

Perviously we used token level lookbehind in the parser. That worked,
but only if the parser didn't have any ambiguity at all. Since the
parser has ambiguity it didn't work everywhere. In particular it failed
when parsing blocks in lambdas like a -> {int b = a + 2; b * b}.

This moves the hack from the parser into the lexer. There we can use
token lookbehind (same trick) to insert semicolons into the token
stream. This works much better for antlr because antlr's prediction
code can work with real tokens.

Also, the lexer is simpler than the parser, so if there is a place
to introduce a hack, that is a better place.

@nik9000
Copy link
Copy Markdown
Member Author

nik9000 commented Jun 16, 2016

@rmuir this fixes ;s inside of lambdas.

@jdconrad, this is for you to review. Antlr likes this way better because it doesn't have anything tricky in the parser grammar.

@clintongormley clintongormley changed the title Painless: move semicolon hack into lexer Move semicolon hack into lexer Jun 17, 2016
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: please drop the semicolon down a line to conform to the style of the other multi-line rules.

@jdconrad
Copy link
Copy Markdown
Contributor

Couple of minor comments, otherwise +1.

Perviously we used token level lookbehind in the parser. That worked,
but only if the parser didn't have any ambiguity *at all*. Since the
parser has ambiguity it didn't work everywhere. In particular it failed
when parsing blocks in lambdas like `a -> {int b = a + 2; b * b}`.

This moves the hack from the parser into the lexer. There we can use
token lookbehind (same trick) to *insert* semicolons into the token
stream. This works much better for antlr because antlr's prediction
code can work with real tokens.

Also, the lexer is simpler than the parser, so if there is a place
to introduce a hack, that is a better place.
@nik9000 nik9000 force-pushed the painless_parser_ambiguity branch from fb225ee to 1e16c22 Compare June 17, 2016 20:18
@nik9000 nik9000 merged commit 1e16c22 into elastic:master Jun 17, 2016
@nik9000
Copy link
Copy Markdown
Member Author

nik9000 commented Jun 17, 2016

Thanks for reviewing @jdconrad ! I renamed the class, fixed the style and merged. We can rename the class again if anyone comes up with a better name.

@jdconrad jdconrad mentioned this pull request Jun 17, 2016
18 tasks
@clintongormley clintongormley added :Core/Infra/Scripting Scripting abstractions, Painless, and Mustache and removed :Plugin Lang Painless labels Feb 14, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Core/Infra/Scripting Scripting abstractions, Painless, and Mustache >enhancement v5.0.0-alpha4

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants