Implement support for contextual keywords#598
Closed
Xanewok wants to merge 16 commits intoNomicFoundation:mainfrom
Closed
Implement support for contextual keywords#598Xanewok wants to merge 16 commits intoNomicFoundation:mainfrom
Xanewok wants to merge 16 commits intoNomicFoundation:mainfrom
Conversation
🦋 Changeset detectedLatest commit: d14285e The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
I knew I forgot about something important...
It was introduced alongside the `revert` contextual keyword, see argotorg/solidity#11037.
Contributor
AntonyBlakey
left a comment
There was a problem hiding this comment.
As a hack, this works, but I don't like the way it confuses the model re keywords in general, and my discomfort is telling me we should do it properly. Will discuss f2f.
Merged
github-merge-queue bot
pushed a commit
that referenced
this pull request
Sep 22, 2023
Non-controversial bits separated from #598. This updates versions for the `revert` keyword (and the associated statement, introduced in 0.8.4) and the `global` (introduced in 0.8.13 for the using directive) contextual keywords in both the new DSL and our old YAML spec. --------- Co-authored-by: Omar Tawfik <15987992+OmarTawfik@users.noreply.github.com>
Contributor
Author
|
We decided to hold this off until we migrate to a DSL v2, as that will change how keywords will be defined and identifier<>keyword will interact. |
1 task
github-merge-queue bot
pushed a commit
that referenced
this pull request
Jan 8, 2024
Closes #568 There is still one outstanding issue where we return a `Vec<TokenKind>` from `next_token`; it'd like to return a more specialized type and ideally pass it on stack (2x2 bytes), rather than on-heap (extra 3x8 bytes for the Vec handle + indirection). We should name it better and properly show that we can return at most 2 token kinds (single token kind or identifier + kw combo). To do: - [x] Return tokens from `next_token` via stack Apart from that, I think this is a more correct approach than #598, especially accounting for the new keyword definition format in DSL v2. The main change is that we only check the keyword trie and additionally the (newly introduced) compound keyword scanners only after the token has been lexed as an identifier. For each context, we collect Identifier scanners used by the keywords and attempt promotion there. The existing lexing performance is not impacted from what I've seen when running the sanctuary tests and I can verify (incl. CST tests) that we now properly parse source that uses contextual keywords (e.g. `from`) and that the compound keywords (e.g. `ufixedMxN`) are properly versioned. This adapts the existing `codegen_grammar` interface that's a leftover from DSLv1; I did that to work on finishing #638; once this is merged and we now properly parse contextual keywords, I'll move to clean it up and reduce the parser codegen indirection (right now we go from v2 -> v1 model -> code generator -> Tera templates; it'd like to at least cut out the v1 model and/or simplify visiting v2 from the existing `CodeGenerator`). Please excuse the WIP comments in the middle; the first and the last ones should make sense when reviewing. I can simplify this a bit for review, if needed.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #568.
First few commits are small cleanups to the PG code, the last one implements the support along with a new CST test.
This introduces a new
ScannerDefinitionNode::ContextualKeywordthat includes the literal and the underlying parser.The reason why the identifier parser is used is to be able to reference its token kind in the parser generator, as we don't intentionally specify any built-in Identifier parsers (well, we do hardcode
Identifierin one place but it's a hack I'd rather not spread. Do we need to handleYulIdentifieras well/separately?).For the version ranges that it's not a contextual keyword, it's still included in the trie as usual, so the old machinery works as expected and the lexer prioritises that match over a catch-all Identifier we use now.
However, for the version ranges that it is a contextual keyword, the version check is moved outside to a new
Lexer::as_contextual_keyword, which attempts to promote a given token to a contextual keyword and eats that if it's the expected token kind.This way, we can eat
TokenKind::SomeContextualKeywordif necessary, sinceparse_tokenattempts to promote a keyword, but for a given version where it's contextual, it's not returned early from the lexer, so we lex an Identifier by default.