fix: Define DecimalLiteral in DSL v2 using the DSL v1 rules #643
Xanewok merged 3 commits into NomicFoundation:main
Otherwise, the following rule:
```rust
// An integer (without a dot or a fraction) is enabled in all versions:
TokenDefinition(
    scanner = TrailingContext(
        scanner = Sequence([
            Fragment(DecimalDigits),
            Optional(Fragment(DecimalExponent))
        ]),
        not_followed_by = Fragment(IdentifierStart)
    )
),
```
was a catch-all rule that could successfully lex "1.2" as "1" in all versions.
Instead, this mimics the following definition from DSL v1:
```rust
scanner DecimalLiteral = (
    (
        (
            { removed in "0.5.0" (DecimalDigits (("." (DecimalDigits ?) ) ?)) } |
            { introduced in "0.5.0" (DecimalDigits (("." DecimalDigits ) ?)) } |
            ('.' DecimalDigits)
        )
        (DecimalExponent ?)
    ) not followed by IdentifierStart
) ;
```
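To make the version split concrete, here is a minimal hand-written sketch of the DSL v1 rule above. It is not the scanner that the DSL generates. It assumes that DecimalDigits is a plain run of ASCII digits, that DecimalExponent is `e`/`E` followed by an optional `-` and digits, and that IdentifierStart is an ASCII letter, `_` or `$`. The function names and the `before_0_5_0` flag are hypothetical, and the matching is greedy without backtracking.

```rust
// Assumption: DecimalDigits is one or more ASCII digits (separators ignored).
// Returns the index just past the digit run, or None if there is no digit at `i`.
fn digits(s: &[u8], mut i: usize) -> Option<usize> {
    let start = i;
    while i < s.len() && s[i].is_ascii_digit() {
        i += 1;
    }
    if i > start { Some(i) } else { None }
}

// Assumption: DecimalExponent is ('e' | 'E') '-'? DecimalDigits.
fn exponent(s: &[u8], i: usize) -> Option<usize> {
    if i < s.len() && (s[i] == b'e' || s[i] == b'E') {
        let mut j = i + 1;
        if j < s.len() && s[j] == b'-' {
            j += 1;
        }
        digits(s, j)
    } else {
        None
    }
}

// Assumption: IdentifierStart is an ASCII letter, '_' or '$'.
fn is_identifier_start(b: u8) -> bool {
    b.is_ascii_alphabetic() || b == b'_' || b == b'$'
}

// Hypothetical helper: returns the length of the DecimalLiteral at the start
// of `input`, or None if the rule does not match there.
fn scan_decimal_literal(input: &str, before_0_5_0: bool) -> Option<usize> {
    let s = input.as_bytes();
    let mut end = match digits(s, 0) {
        Some(after_int) => {
            if s.get(after_int) == Some(&b'.') {
                match digits(s, after_int + 1) {
                    // "1.2" is a single token in every version.
                    Some(after_frac) => after_frac,
                    // A trailing dot ("1.") is only part of the token before 0.5.0.
                    None if before_0_5_0 => after_int + 1,
                    None => after_int,
                }
            } else {
                after_int
            }
        }
        None => {
            // The ('.' DecimalDigits) alternative, e.g. ".5".
            if s.first() == Some(&b'.') {
                digits(s, 1)?
            } else {
                return None;
            }
        }
    };
    if let Some(after_exp) = exponent(s, end) {
        end = after_exp;
    }
    // The "not followed by IdentifierStart" trailing context.
    match s.get(end) {
        Some(&b) if is_identifier_start(b) => None,
        _ => Some(end),
    }
}
```

Under these assumptions, `scan_decimal_literal("1.2", false)` consumes all three bytes, while `"1."` consumes two bytes when `before_0_5_0` is true and only one otherwise.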
OmarTawfik left a comment:
The idea behind splitting each Token into multiple TokenDefinitions is that each can have its own versioning (for post-processing), while all of them should still be mutually exclusive and unambiguous. It also allows us to build a single FSM for all language terminals in the future.
Perhaps this bug was uncovered by #641. If I may suggest: how about we fix the not_followed_by parts of the existing, separate TokenDefinitions so that they stay mutually exclusive? For example:
```rust
not_followed_by = Choice([
    Fragment(IdentifierStart),
    Fragment(DecimalDigits),
    Fragment(Dot)
])
```
That way we minimize the parsing diff once we migrate to DSL v2.
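The effect of widening the negative lookahead can be sketched with a small hand-written stand-in for the integer-only TokenDefinition. This is an illustration only, not the DSL's generated code: the exponent part is omitted for brevity, the `Lookahead` type, the `ORIGINAL`/`SUGGESTED` constants and `scan_integer` are hypothetical names, and IdentifierStart is assumed to be an ASCII letter, `_` or `$`.

```rust
// Hypothetical predicate type standing in for a not_followed_by fragment.
type Lookahead = fn(u8) -> bool;

// Assumption: IdentifierStart is an ASCII letter, '_' or '$'.
fn is_identifier_start(b: u8) -> bool {
    b.is_ascii_alphabetic() || b == b'_' || b == b'$'
}

fn is_decimal_digit(b: u8) -> bool {
    b.is_ascii_digit()
}

fn is_dot(b: u8) -> bool {
    b == b'.'
}

// The trailing context of the original integer rule.
const ORIGINAL: &[Lookahead] = &[is_identifier_start];

// The widened choice from the suggestion. In this simplified sketch the digit
// check can never fire, because the digit run is consumed greedily; it is kept
// only to mirror the suggested Choice.
const SUGGESTED: &[Lookahead] = &[is_identifier_start, is_decimal_digit, is_dot];

// Integer-only rule: one or more digits, rejected if the next byte matches
// any of the forbidden lookahead predicates.
fn scan_integer(input: &str, not_followed_by: &[Lookahead]) -> Option<usize> {
    let s = input.as_bytes();
    let end = s.iter().take_while(|b| b.is_ascii_digit()).count();
    if end == 0 {
        return None;
    }
    match s.get(end) {
        Some(&next) if not_followed_by.iter().any(|p| p(next)) => None,
        _ => Some(end),
    }
}
```

With `ORIGINAL`, `scan_integer("1.2", ORIGINAL)` accepts the one-byte prefix `1`, which is the catch-all behavior described in the PR. With `SUGGESTED`, the same input is rejected by the integer rule, leaving it to a definition that handles the fraction.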
I've disambiguated the rules using negative lookahead, as you proposed, and also used the same rules as in DSL v1 so that we retain the same semantics when migrating to DSL v2. @OmarTawfik is this good now?
OmarTawfik left a comment:
Looks great! thank you!
This change also fixed a case in https://github.com/Xanewok/slang/tree/codegen-use-dslv2 that came up when using parts of DSL v2 for the codegen, but I didn't open the PR yet.
Ref #638