Skip to content

Fix optimization rules for narrow types: wrap i8 results to 8 bits.#5409

Merged
cfallin merged 3 commits intobytecodealliance:mainfrom
cfallin:egraph-issue-5405
Dec 9, 2022
Merged

Fix optimization rules for narrow types: wrap i8 results to 8 bits.#5409
cfallin merged 3 commits intobytecodealliance:mainfrom
cfallin:egraph-issue-5405

Conversation

@cfallin
Copy link
Copy Markdown
Member

@cfallin cfallin commented Dec 9, 2022

This fixes #5405.

In the egraph mid-end's optimization rules, we were rewriting e.g. imuls of two iconsts to an iconst of the result, but without masking off the high bits (beyond the result type's width). This was producing iconsts with set high bits beyond their types' width, which is not legal.

In addition, this PR adds some optimizations to the algebraic rules to recognize e.g. x == x (and all other integer comparison operators) and resolve to 1 or 0 as appropriate.

This fixes bytecodealliance#5405.

In the egraph mid-end's optimization rules, we were rewriting e.g. imuls
of two iconsts to an iconst of the result, but without masking off the
high bits (beyond the result type's width). This was producing iconsts
with set high bits beyond their types' width, which is not legal.

In addition, this PR adds some optimizations to the algebraic rules to
recognize e.g. `x == x` (and all other integer comparison operators) and
resolve to 1 or 0 as appropriate.
Copy link
Copy Markdown
Contributor

@jameysharp jameysharp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than the one comment, this looks right to me!

@bjorn3
Copy link
Copy Markdown
Contributor

bjorn3 commented Dec 9, 2022

This was producing iconsts with set high bits beyond their types' width, which is not legal.

I thought backends were supposed to mask it. I believe cg_clif sometimes sign extends immediates.

@cfallin
Copy link
Copy Markdown
Member Author

cfallin commented Dec 9, 2022

This was producing iconsts with set high bits beyond their types' width, which is not legal.

I thought backends were supposed to mask it. I believe cg_clif sometimes sign extends immediates.

Hmm, it seems we discussed this and filed #3059 a while ago, and haven't done any thinking about it since. I suspect that we aren't doing such masking everywhere we need to be, and requiring rules everywhere to do the masking (e.g., const-folding an icmp eq would require comparing only the relevant bits) seems cumbersome. I'd rather we have an invariant that values stay in range.

For now I guess we can keep the status quo, but we should think about this some more (on that issue probably)!

@github-actions github-actions bot added the cranelift Issues related to the Cranelift code generator label Dec 9, 2022
Copy link
Copy Markdown
Contributor

@jameysharp jameysharp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@cfallin cfallin enabled auto-merge (squash) December 9, 2022 21:53
@jameysharp
Copy link
Copy Markdown
Contributor

BTW I'm curious if the new optimization rules have any impact on any benchmarks. Do these patterns show up in any CLIF we're generating from wasm, or do our current benchmarks compile to the same sequence of instructions either way? Not terribly important but if this is a significant win for some reason, that'd be nice to know.

@cfallin
Copy link
Copy Markdown
Member Author

cfallin commented Dec 9, 2022

That's a good question, re: the icmp folding; I didn't do tests on this. Honestly I'd be somewhat surprised if they did though, given that the benchmarks are optimized .wasms and the sorts of redundancies that show up in the translated CLIF are not usually things that const-folding applies to.

@cfallin
Copy link
Copy Markdown
Member Author

cfallin commented Dec 9, 2022

And, to follow up, a quick smoketest with bz2 shows no runtime impact with the additional icmp folding roles.

@cfallin cfallin merged commit 244dce9 into bytecodealliance:main Dec 9, 2022
@cfallin cfallin deleted the egraph-issue-5405 branch December 10, 2022 00:15
cfallin added a commit that referenced this pull request Jan 19, 2023
This PR follows up on #5382 and #5391, which rebuilt the egraph-based optimization framework to be more performant, by enabling it by default.

Based on performance results in #5382 (my measurements on SpiderMonkey and bjorn3's independent confirmation with cg_clif), it seems that this is reasonable to enable. Now that we have been fuzzing compiler configurations with egraph opts (#5388) for 6 weeks, having fixed a few fuzzbugs that came up (#5409, #5420, #5438) and subsequently received no further reports from OSS-Fuzz, I believe it is stable enough to rely on.

This PR enables `use_egraphs`, and also normalizes its meaning: previously it forced optimization (it basically meant "turn on the egraph optimization machinery"), now it runs egraph opts if the opt level indicates (it means "use egraphs to optimize if we are going to optimize"). The conditionals in the top-level pass driver are a little subtle, but will get simpler once we can remove the non-egraph path (which we plan to do eventually!).

Fixes #5181.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cranelift Issues related to the Cranelift code generator

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Cranelift: Fuzz failure with egraphs on AArch64

3 participants