Skip to content

JIT: Preference SDSU intervals away from killed registers#125219

Open
jakobbotsch wants to merge 10 commits intodotnet:mainfrom
jakobbotsch:preference-sdsu-intervals
Open

JIT: Preference SDSU intervals away from killed registers#125219
jakobbotsch wants to merge 10 commits intodotnet:mainfrom
jakobbotsch:preference-sdsu-intervals

Conversation

@jakobbotsch
Copy link
Member

@jakobbotsch jakobbotsch commented Mar 5, 2026

This revives #81242 on top of #125214. I noticed some regressions with #125214 because we were missing this logic (specifically case 5 when resolving DU conflicts). Also, this may have some TP implications that I want to look at separately from the other change.

Example:

 G_M40772_IG01:
-       sub      rsp, 40
-       xor      eax, eax
-       mov      qword ptr [rsp+0x20], rax
-						;; size=11 bbWeight=1 PerfScore 1.50
+       push     rbx
+       sub      rsp, 32
+						;; size=5 bbWeight=1 PerfScore 1.25
 G_M40772_IG02:
        call     [CORINFO_HELP_READYTORUN_GCSTATIC_BASE]
-       mov      bword ptr [rsp+0x20], rax
+       mov      rbx, rax
        call     [System.Reflection.Internal.PooledStringBuilder:CreatePool():System.Reflection.Internal.ObjectPool`1[System.Reflection.Internal.PooledStringBuilder]]
-       mov      rcx, bword ptr [rsp+0x20]
+       mov      rcx, rbx
        mov      rdx, rax
        call     [CORINFO_HELP_ASSIGN_REF]
        nop      
-						;; size=32 bbWeight=1 PerfScore 11.50
+						;; size=28 bbWeight=1 PerfScore 10.00
 G_M40772_IG03:
-       add      rsp, 40
+       add      rsp, 32
+       pop      rbx
        ret      
-						;; size=5 bbWeight=1 PerfScore 1.25
+						;; size=6 bbWeight=1 PerfScore 1.75
 
-; Total bytes of code 48, prolog size 11, PerfScore 14.25, instruction count 12, allocated bytes for code 48 (MethodHash=bb6e60bb) for method System.Reflection.Internal.PooledStringBuilder:.cctor() (FullOpts)
+; Total bytes of code 39, prolog size 5, PerfScore 13.00, instruction count 12, allocated bytes for code 39 (MethodHash=bb6e60bb) for method System.Reflection.Internal.PooledStringBuilder:.cctor() (FullOpts)
 ; ============================================================

A def-use conflict happens for SDSU intervals when the def and use share
no common register. For these cases we run through a set of potential
ways to resolve the conflict that make use of the contract between LSRA
and codegen. For example, when the definition requires a single fixed
register, it is still valid for LSRA to assign a different register.
This means that codegen can use the original fixed register, but that it
must insert a copy into the register assigned by LSRA.

In the worst case we cannot use any of the approaches to solve the
conflict and then we will fall back to spilling/copying as necessary.

It appears that several of the cases were overly conservative.
Particularly case 1 and 2, as described by the function header, were
never reachable; we assign `defRegConflict` and `useRegConflict` based
on whether there is a conflict in register constraints, but that is
always true when we get into this function. It appears the original
intention of these variables were whether there is a conflict with a
_different_ use of the registers we could otherwise have resolved the
original conflict with. This PR restores the meaning to that.
Copilot AI review requested due to automatic review settings March 5, 2026 12:49
@github-actions github-actions bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Mar 5, 2026
@dotnet-policy-service
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Revives prior LSRA heuristics so SDSU (single-def/single-use) temp intervals are preferentially allocated away from registers killed by the consuming node, addressing regressions observed after updated def/use conflict handling.

Changes:

  • Reworks resolveConflictingDefAndUse conflict tracking so cases like #5 become reachable again and adjusts fixed-reg handling in case #3.
  • Refactors kill-driven preference updates into updateIntervalPreferencesForKill and applies it to both live locals and in-flight (defList) non-local intervals.
  • Adds the new helper declaration to lsra.h.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
src/coreclr/jit/lsrabuild.cpp Updates SDSU def/use conflict resolution logic; refactors and extends kill-based register preference updates; adds helper implementation.
src/coreclr/jit/lsra.h Declares the new updateIntervalPreferencesForKill helper on LinearScan.
Comments suppressed due to low confidence (1)

src/coreclr/jit/lsrabuild.cpp:347

  • In case #4 we update useRefPosition->registerAssignment but keep useRefPosition->isFixedRegRef true. RegisterSelection::select() asserts that isFixedRegRef implies a single-bit registerAssignment, and keeping the fixed flag also means we’re effectively changing the fixed register requirement to defRegAssignment. With the new meaning of defRegConflict, this branch can now execute when defRegAssignment is multi-reg (or just different from the original fixed reg), which can violate the fixed-reg invariant and/or cause incorrect codegen assumptions. Consider either requiring defRefPosition->isFixedRegRef/isSingleRegister(defRegAssignment) for case #4, or clearing useRefPosition->isFixedRegRef (similar to case #3/#5) when you change its candidates away from the fixed register.
    if ((useReg != REG_NA) && !defRegConflict && canChangeUseAssignment)
    {
        // This is case #4.
        INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_DEFUSE_CASE4, interval));
        useRefPosition->registerAssignment = defRegAssignment;
        return;

Copilot AI review requested due to automatic review settings March 6, 2026 20:27
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

// Handled via liveness above
continue;
}

Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new defList iteration applies updateIntervalPreferencesForKill to all non-local intervals, including float/double temps. That bypasses the existing heuristic in this method that intentionally avoids setting callee-save preferences for floating-point (see the comment just above and the fpCalleeSaveCandidateVars filtering for locals). As written, float/double LIR edges can now get preferCalleeSave set and have their preferences restricted away from call-killed regs, which may increase use of FP callee-saves and grow prologs/epilogs.

Consider applying the same filtering here as for locals (e.g., skip when the interval type is float/double unless you have a separate candidate heuristic for non-locals), or adjust updateIntervalPreferencesForKill to not set preferCalleeSave / not update prefs for float/double intervals when handling call kills.

Suggested change
// Do not update callee-save preferences for floating-point intervals.
// This mirrors the heuristic used for locals above, which avoids setting
// preferCalleeSave for most float/double values to prevent overuse of
// FP callee-save registers and larger prologs/epilogs.
if (varTypeIsFloating(interval->registerType))
{
continue;
}

Copilot uses AI. Check for mistakes.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The downside is that we end up saving/restoring a callee saved float reg in prolog/epilog, which costs more than only spilling it in one place. The upside is that in many cases, the prolog/epilogs are cheaper than the spill that would otherwise happen inside the function.
Given how ad-hoc the existing heuristic seems I think I will leave this, if we see benchmark regressions at least we'll have some cases to look at for designing a heuristic.

@jakobbotsch
Copy link
Member Author

cc @dotnet/jit-contrib PTAL @EgorBo

Diffs. When we ended up with a LIR edge that spanned a call we would not preference it into callee saves. The situation happens only rarely, but this enables the same logic that we have for locals on LIR intervals.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants