fix: Define DecimalLiteral in DSL v2 using the DSL v1 rules#643

Merged
Xanewok merged 3 commits intoNomicFoundation:mainfrom
Xanewok:fix-dslv2-decimal-literal
Nov 9, 2023

@Xanewok Xanewok commented Nov 7, 2023

Otherwise, the following rule:

// An integer (without a dot or a fraction) is enabled in all versions:
TokenDefinition(
    scanner = TrailingContext(
        scanner = Sequence([
            Fragment(DecimalDigits),
            Optional(Fragment(DecimalExponent))
        ]),
        not_followed_by = Fragment(IdentifierStart)
    )
),

was a catch-all rule that could successfully lex "1.2" as "1" in all versions.

Instead, this mimics the following version from the DSL v1:

scanner DecimalLiteral = (
        (
            (
                { removed in "0.5.0"    (DecimalDigits (("." (DecimalDigits ?) ) ?)) } |
                { introduced in "0.5.0" (DecimalDigits (("." DecimalDigits     ) ?)) } |
                ('.' DecimalDigits)
            )
            (DecimalExponent ?)
        ) not followed by IdentifierStart
    ) ;

and fixes a case in https://github.com/Xanewok/slang/tree/codegen-use-dslv2 that shows up when using parts of the DSL v2 for the codegen; I haven't opened that PR yet.

Ref #638
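
To make the difference concrete, here is a minimal hand-written sketch of both rules. It is not the scanner that slang generates: the function names are hypothetical, `DecimalDigits` is simplified to a run of ASCII digits, `DecimalExponent` is assumed to be `e`/`E`, an optional `-`, then digits, and `IdentifierStart` is assumed to be an ASCII letter, `_` or `$`. Each function returns the length of the matched token.

```rust
/// Simplified `DecimalDigits`: position just past one or more ASCII digits starting at `pos`.
fn digits(s: &[u8], pos: usize) -> Option<usize> {
    let mut end = pos;
    while end < s.len() && s[end].is_ascii_digit() {
        end += 1;
    }
    if end > pos { Some(end) } else { None }
}

/// Assumed `DecimalExponent`: ('e' | 'E') '-'? DecimalDigits.
fn exponent(s: &[u8], pos: usize) -> Option<usize> {
    match s.get(pos) {
        Some(b'e') | Some(b'E') => {}
        _ => return None,
    }
    let mut p = pos + 1;
    if s.get(p) == Some(&b'-') {
        p += 1;
    }
    digits(s, p)
}

/// Trailing context: the byte at `pos` must not be an (assumed) `IdentifierStart`.
fn not_followed_by_identifier_start(s: &[u8], pos: usize) -> bool {
    !matches!(s.get(pos), Some(b) if b.is_ascii_alphabetic() || *b == b'_' || *b == b'$')
}

/// The catch-all integer definition quoted above.
fn lex_integer_catch_all(input: &str) -> Option<usize> {
    let s = input.as_bytes();
    let mut end = digits(s, 0)?;
    if let Some(e) = exponent(s, end) {
        end = e;
    }
    if not_followed_by_identifier_start(s, end) { Some(end) } else { None }
}

/// The DSL v1 `DecimalLiteral` rule; `before_0_5_0` selects the versioned alternative.
fn lex_decimal_literal(input: &str, before_0_5_0: bool) -> Option<usize> {
    let s = input.as_bytes();
    let mut end = if let Some(int_end) = digits(s, 0) {
        if s.get(int_end) == Some(&b'.') {
            match digits(s, int_end + 1) {
                Some(frac_end) => frac_end,
                // Before 0.5.0 the fraction digits are optional, so "1." is one token.
                None if before_0_5_0 => int_end + 1,
                // From 0.5.0 on, a dot without digits is not part of the literal.
                None => int_end,
            }
        } else {
            int_end
        }
    } else if s.first() == Some(&b'.') {
        digits(s, 1)?
    } else {
        return None;
    };
    if let Some(e) = exponent(s, end) {
        end = e;
    }
    if not_followed_by_identifier_start(s, end) { Some(end) } else { None }
}
```

Under these assumptions, `lex_integer_catch_all("1.2")` stops after the first digit, while `lex_decimal_literal("1.2", false)` consumes the whole input, and `"1."` is consumed whole only when `before_0_5_0` is `true`.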

@Xanewok Xanewok requested a review from a team as a code owner November 7, 2023 09:32

changeset-bot bot commented Nov 7, 2023

⚠️ No Changeset found

Latest commit: cabc598

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

@OmarTawfik OmarTawfik left a comment

The idea behind splitting each Token into multiple TokenDefinitions is that each can have their own versioning (for post-processing), but they all should still be mutually exclusive/non-ambiguous. It also allows us in the future to build a single FSM for all language terminals.

Perhaps this bug was uncovered by #641. If I may suggest, how about we fix the not_followed_by parts of the existing, separate TokenDefinitions so that they stay mutually exclusive? For example:

not_followed_by = Choice([
    Fragment(IdentifierStart),
    Fragment(DecimalDigits),
    Fragment(Dot)
])
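
As a rough illustration of why the wider negative lookahead keeps the definitions mutually exclusive, the sketch below applies both lookahead sets to a simplified integer definition. All names are hypothetical, the optional exponent is left out for brevity, `IdentifierStart` is assumed to be an ASCII letter, `_` or `$`, and each fragment is checked against a single lookahead byte. The rules that were eventually merged may differ from this.

```rust
/// Assumed `IdentifierStart`: ASCII letter, `_` or `$`.
fn is_identifier_start(b: u8) -> bool {
    b.is_ascii_alphabetic() || b == b'_' || b == b'$'
}

/// Lookahead of the original integer definition: only `IdentifierStart` is rejected.
fn original_lookahead(b: u8) -> bool {
    is_identifier_start(b)
}

/// Lookahead from the suggestion: `IdentifierStart`, `DecimalDigits` and `Dot` are rejected.
fn suggested_lookahead(b: u8) -> bool {
    is_identifier_start(b) || b.is_ascii_digit() || b == b'.'
}

/// Simplified integer definition: one or more digits, then a negative lookahead
/// on the next byte. Returns the matched length, or `None` if the rule does not apply.
fn integer_with_lookahead(input: &str, rejected: fn(u8) -> bool) -> Option<usize> {
    let s = input.as_bytes();
    let end = s.iter().take_while(|b| b.is_ascii_digit()).count();
    if end == 0 {
        return None;
    }
    match s.get(end) {
        // The byte after the digits is in the rejected set: this definition must not match.
        Some(&next) if rejected(next) => None,
        _ => Some(end),
    }
}
```

With `original_lookahead`, the input `"1.2"` still matches as the one-character integer `1`. With `suggested_lookahead` it is rejected, so only a definition that handles the fractional form can claim that input.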

Xanewok commented Nov 8, 2023

I've disambiguated the rules by using negative lookahead as you proposed; also used the same rules in DSL v1 so that we retain the same semantics when migrating to DSL v2.

@OmarTawfik is this good now?

@OmarTawfik OmarTawfik left a comment

Looks great! thank you!

@Xanewok Xanewok added this pull request to the merge queue Nov 9, 2023
Merged via the queue into NomicFoundation:main with commit f0cb998 Nov 9, 2023
@Xanewok Xanewok deleted the fix-dslv2-decimal-literal branch November 9, 2023 08:32
@Xanewok Xanewok mentioned this pull request Nov 13, 2023
9 tasks