Skip to content

Conversation

@cfallin
Copy link
Member

@cfallin cfallin commented Jul 29, 2022

The ISLE language's lexer previously used a very primitive
i64::from_str_radix call to parse integer constants, allowing values
in the range -2^63..2^63 only. Also, underscores to separate digits (as
is allwoed in Rust) were not supported. Finally, 128-bit constants were
not supported at all.

This PR addresses all issues above:

  • Integer constants are internally stored as 128-bit values.
  • Parsing supports either signed (-2^127..2^127) or unsigned (0..2^128)
    range. Negation works independently of that, so one can write
    -0xffff..ffff (128 bits wide, i.e., -(2^128-1)) to get a 1.
  • Underscores are supported to separate groups of digits, so one can
    write 0xffff_ffff.
  • A minor oversight was fixed: hex constants can start with 0X
    (uppercase) as well as 0x, for consistency with Rust and C.

This PR also adds a new kind of ISLE test that actually runs a driver
linked to compiled ISLE code; we previously didn't have any such tests,
but it is now quite useful to assert correct interpretation of constant
values.

@cfallin cfallin requested review from akirilov-arm and fitzgen July 29, 2022 19:11
Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable to me, but I have a mild amount of hesitation with this. IIRC this is how the Rust AST used to look like (more or less) and eventually it changed to a much different representation where I think everything is stored as u128 and the negation bit is stored separately. The reason and rationale for that change though may be around providing warnings about "this literal is too big for this type" than anything else perhaps. I just know that handling literal integers in an AST is surprisingly tricky...

In any case this is better than before and seems like it should work just fine!

@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator isle Related to the ISLE domain-specific language labels Jul 29, 2022
@github-actions
Copy link

Subscribe to Label Action

cc @cfallin, @fitzgen

Details This issue or pull request has been labeled: "cranelift", "isle"

Thus the following users have been cc'd because of the following labels:

  • cfallin: isle
  • fitzgen: isle

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

The ISLE language's lexer previously used a very primitive
`i64::from_str_radix` call to parse integer constants, allowing values
in the range -2^63..2^63 only. Also, underscores to separate digits (as
is allwoed in Rust) were not supported. Finally, 128-bit constants were
not supported at all.

This PR addresses all issues above:
- Integer constants are internally stored as 128-bit values.
- Parsing supports either signed (-2^127..2^127) or unsigned (0..2^128)
  range. Negation works independently of that, so one can write
  `-0xffff..ffff` (128 bits wide, i.e., -(2^128-1)) to get a `1`.
- Underscores are supported to separate groups of digits, so one can
  write `0xffff_ffff`.
- A minor oversight was fixed: hex constants can start with `0X`
  (uppercase) as well as `0x`, for consistency with Rust and C.

This PR also adds a new kind of ISLE test that actually runs a driver
linked to compiled ISLE code; we previously didn't have any such tests,
but it is now quite useful to assert correct interpretation of constant
values.
@cfallin cfallin force-pushed the isle-int-constants branch from da6be17 to bf5ddef Compare July 29, 2022 21:15
@cfallin
Copy link
Member Author

cfallin commented Jul 29, 2022

Indeed, there's more we could/should do if we want to properly warn about overflows. ISLE's type system isn't quite sophisticated enough to do that at the moment though, as it doesn't actually know about the integral types at a built-in level (mostly); so for now the model is "we hold 128 bits and we pass them through". And the codegen does an explicit 1234i128 as <ty> where <ty> is the user-defined primitive type. So we at least won't lose any bits that the defined primitive types would have held. So for now, indeed, it's an improvement!

@cfallin
Copy link
Member Author

cfallin commented Jul 29, 2022

(Pushed a few formatting changes, no substantial diff.)

@cfallin cfallin enabled auto-merge (squash) July 29, 2022 21:17
@cfallin cfallin merged commit 8e9e9c5 into bytecodealliance:main Jul 29, 2022
@cfallin cfallin deleted the isle-int-constants branch July 29, 2022 22:52
cfallin added a commit to cfallin/wasmtime that referenced this pull request Aug 9, 2022
The ISLE language's lexer previously used a very primitive
`i64::from_str_radix` call to parse integer constants, allowing values
in the range -2^63..2^63 only. Also, underscores to separate digits (as
is allwoed in Rust) were not supported. Finally, 128-bit constants were
not supported at all.

This PR addresses all issues above:
- Integer constants are internally stored as 128-bit values.
- Parsing supports either signed (-2^127..2^127) or unsigned (0..2^128)
  range. Negation works independently of that, so one can write
  `-0xffff..ffff` (128 bits wide, i.e., -(2^128-1)) to get a `1`.
- Underscores are supported to separate groups of digits, so one can
  write `0xffff_ffff`.
- A minor oversight was fixed: hex constants can start with `0X`
  (uppercase) as well as `0x`, for consistency with Rust and C.

This PR also adds a new kind of ISLE test that actually runs a driver
linked to compiled ISLE code; we previously didn't have any such tests,
but it is now quite useful to assert correct interpretation of constant
values.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cranelift Issues related to the Cranelift code generator isle Related to the ISLE domain-specific language

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants