codegen: Unify printing of and add some more internal IR validity errors#58429
Merged
codegen: Unify printing of and add some more internal IR validity errors#58429
Conversation
What should codegen do when it detects invalid IR?
There are a few reasonable options. One is to just assert -
this is relatively straightforward but also not very friendly
because it immediately takes down your session, so you can't
inspect what values may have caused the issue. Additionally,
often allow invalid IR in dead code (because otherwise transformation
passes would have to check whether the transformation that they're
doing makes some code dead, which can be expensive or not necessarily
visible).
Thus we generally try to keep codegen going, trying to bail back to
valid code as quickly as possible. In a few places, we additionally
insert an unmodeled `emit_error("Internal Error")`. These are a bit
weird because the earlier stages of the compiler do not know that
they can exist, but in practice they mostly work fine (although
they can cause additional crashes if there are try/catches in the
way). If we don't, then we generally just stop codegening, so if
execution ever were to reach there, it'll just run into whatever
code comes after, likely crashing fairly soon.
Because of this consideration, the `emit_error` is generally preferred.
However, the tradeoff is that it increases the size of the code and LLVM
can no longer optimize under the assumption that a particular code
branch doesn't happen. We thus need to be judicious in where we use it.
The general guidance is that it's fine to use in situations where the
IR itself is obviously invalid, but should not be arbitrarily added
to all instances of `unreachable` (putting behind NDEBUG is fine).
This PR cleans this up a bit by changing all these error locations
to print `INTERNAL ERROR - IR Validity`, as well as adding a new
one to `emit_invoke` when there's a manifest mismatch in type
of the arguments and the signature. This new case is particularly
useful after the recent addition of ABIOverride, as that makes it
more likely that external AbstractInterpreters are doing ABI shenanigans.
Keno
added a commit
that referenced
this pull request
Jun 4, 2025
See #58429 for general discussion of these kinds of errors. We don't verify this one in the IR verifier, because it's currently not really part of the structural invariants of the IR. One could imagine using IRCode with a non-codegen backend that uses something else here. Nevertheless, codegen needs to complain of course if someone put something here it doesn't understand.
Keno
added a commit
that referenced
this pull request
Jun 4, 2025
See #58429 for general discussion of these kinds of errors. We don't verify this one in the IR verifier, because it's currently not really part of the structural invariants of the IR. One could imagine using IRCode with a non-codegen backend that uses something else here. Nevertheless, codegen needs to complain of course if someone put something here it doesn't understand.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What should codegen do when it detects invalid IR? There are a few reasonable options. One is to just assert - this is relatively straightforward but also not very friendly because it immediately takes down your session, so you can't inspect what values may have caused the issue. Additionally, often allow invalid IR in dead code (because otherwise transformation passes would have to check whether the transformation that they're doing makes some code dead, which can be expensive or not necessarily visible).
Thus we generally try to keep codegen going, trying to bail back to valid code as quickly as possible. In a few places, we additionally insert an unmodeled
emit_error("Internal Error"). These are a bit weird because the earlier stages of the compiler do not know that they can exist, but in practice they mostly work fine (although they can cause additional crashes if there are try/catches in the way). If we don't, then we generally just stop codegening, so if execution ever were to reach there, it'll just run into whatever code comes after, likely crashing fairly soon.Because of this consideration, the
emit_erroris generally preferred. However, the tradeoff is that it increases the size of the code and LLVM can no longer optimize under the assumption that a particular code branch doesn't happen. We thus need to be judicious in where we use it. The general guidance is that it's fine to use in situations where the IR itself is obviously invalid, but should not be arbitrarily added to all instances ofunreachable(putting behind NDEBUG is fine).This PR cleans this up a bit by changing all these error locations to print
INTERNAL ERROR - IR Validity, as well as adding a new one toemit_invokewhen there's a manifest mismatch in type of the arguments and the signature. This new case is particularly useful after the recent addition of ABIOverride, as that makes it more likely that external AbstractInterpreters are doing ABI shenanigans.