In #5201, @afonso360 observed that currently, every Cranelift backend can only lower select_spectre_guard when paired with icmp, and asked if there are any plans to implement it more generally.
This instruction is currently only used in cranelift/codegen/src/legalizer/heap.rs and cranelift/codegen/src/legalizer/table.rs as part of the implementation of the heap_addr and table_addr instructions, respectively, when Spectre mitigations are enabled in the TargetIsa flags. And indeed, its first input is always produced by icmp.
I think we should consider fusing the icmp semantics into the definition of select_spectre_guard's behavior. That would clarify this question and also should give us a small compile-time speedup, due to processing fewer instructions and doing less work in instruction selection pattern-matching.
I believe every use of select_spectre_guard is also preceded by an additional copy of the exact same icmp, to feed into trapnz. Maybe we can generate better code at the backends if we fuse that sequence into this instruction as well?
The "obvious" way to fuse these instructions would be to make select_spectre_guard take two pairs of values, along with the condition code and maybe trap code. However, I believe that would make InstructionData bigger than 16 bytes, which we don't want to do.
The heap_addr legalization always uses an iconst 0 for the mispredict output, while table_addr uses the table base. Can we use a constant 0 address in both cases? Then we could define the fused instruction as taking only three Values, which I think should fit alongside a condition code. If we also have a trap code the story is trickier since those are four bytes as well.
This might also be part of, or an alternative to, #5190, so cc: @cfallin @fitzgen @alexcrichton
In #5201, @afonso360 observed that currently, every Cranelift backend can only lower
select_spectre_guardwhen paired withicmp, and asked if there are any plans to implement it more generally.This instruction is currently only used in
cranelift/codegen/src/legalizer/heap.rsandcranelift/codegen/src/legalizer/table.rsas part of the implementation of theheap_addrandtable_addrinstructions, respectively, when Spectre mitigations are enabled in theTargetIsaflags. And indeed, its first input is always produced byicmp.I think we should consider fusing the
icmpsemantics into the definition ofselect_spectre_guard's behavior. That would clarify this question and also should give us a small compile-time speedup, due to processing fewer instructions and doing less work in instruction selection pattern-matching.I believe every use of
select_spectre_guardis also preceded by an additional copy of the exact sameicmp, to feed intotrapnz. Maybe we can generate better code at the backends if we fuse that sequence into this instruction as well?The "obvious" way to fuse these instructions would be to make
select_spectre_guardtake two pairs of values, along with the condition code and maybe trap code. However, I believe that would makeInstructionDatabigger than 16 bytes, which we don't want to do.The
heap_addrlegalization always uses aniconst 0for the mispredict output, whiletable_addruses the table base. Can we use a constant 0 address in both cases? Then we could define the fused instruction as taking only threeValues, which I think should fit alongside a condition code. If we also have a trap code the story is trickier since those are four bytes as well.This might also be part of, or an alternative to, #5190, so cc: @cfallin @fitzgen @alexcrichton