Skip to content

Conversation

@eduardo-vp
Copy link
Member

@eduardo-vp eduardo-vp commented Jan 6, 2026

Flag x2 (see async calling convention) during GC as it might contain an async continuation.

Contributes to #122492.

Copilot AI review requested due to automatic review settings January 6, 2026 21:57
@eduardo-vp eduardo-vp requested review from VSadov and removed request for MichalStrehovsky and Copilot January 6, 2026 21:57
@dotnet-policy-service
Copy link
Contributor

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes thread hijacking on ARM64 by preserving the x2 register, which may contain async continuation values. The key change is moving the thread pointer from x2 to x9 to avoid conflicts, and updating the probe frame to save and restore x2 along with x0 and x1.

Key Changes:

  • Changed thread register from x2 to x9 in hijacking code to preserve x2 for async continuations
  • Expanded probe frame to save/restore x2 register (frame size increased from 0xD0 to 0xD8 bytes)
  • Added GETTHREAD_ETLS_9 macro and PTFF_SAVE_X2 flag for proper thread pointer retrieval and register saving

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File Description
src/coreclr/nativeaot/Runtime/unix/unixasmmacrosarm64.inc Adds GETTHREAD_ETLS_9 macro to retrieve thread pointer in x9 while preserving x0-x3, and defines PTFF_SAVE_X2 flag
src/coreclr/nativeaot/Runtime/arm64/GcProbe.asm Updates probe frame layout to include x2, changes thread register from x2 to x9, adjusts all offsets and register references, and fixes orr->mov instruction
src/coreclr/nativeaot/Runtime/arm64/GcProbe.S Parallel changes to GcProbe.asm for Unix assembly syntax

@agocke
Copy link
Member

agocke commented Jan 7, 2026

@VSadov @jakobbotsch is there some doc on the async calling convention? I don't know how to review this change because I don't know what the appropriate callee/caller save/kill registers are.

@VSadov
Copy link
Member

VSadov commented Jan 7, 2026

@VSadov @jakobbotsch is there some doc on the async calling convention? I don't know how to review this change because I don't know what the appropriate callee/caller save/kill registers are.

The calling convention is documented in the BOTR. The relevant section about returns is:
https://github.com/dotnet/runtime/blob/main/docs/design/coreclr/botr/clr-abi.md#returning-continuation

For the purpose of this PR the relevant part is that the special handling of r0 and r1 now needs to be generalized to r2 as well.

What makes it a bit less trivial is that the hijacking itself assumes that r2 is dead/undefined at the point of return and we may be using r2 as a temp. That should switch to use some other scratch register.

@eduardo-vp eduardo-vp changed the title Fix up hijacking on arm64 Fix up hijacking on arm64 (preserve async continuation register) Jan 9, 2026
;;
NESTED_ENTRY RhpGcStressProbe
PUSH_PROBE_FRAME x2, x3, x12
PUSH_PROBE_FRAME x4, x3, x12
Copy link
Member

@VSadov VSadov Jan 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RhpGcStressProbe is dead code after we switched suspension machinery to be like on CoreCLR a few releases ago.

The preexisting GcStress code in NAOT assumes cooperative suspension with loop polling/hijacking and therefore incompatible with what we have now. It was not high in priorities to remove it, hoping maybe at least some portion could be useful (doubtful).

Perhaps should just remove it (in a separate change of course), so noone needs to worry about dead code in changes like this.

@eduardo-vp
Copy link
Member Author

Since there's no need to preserve x3, I removed GETTHREAD_ETLS_4 and used GETTHREAD_ETLS_3 \ INLINE_GETTHREAD x3 instead. However later in RhpGcProbeHijack, x3 is used to store RhpTrapThreads so there's a conflict there. I'm just moving the thread pointer from x3 to x4, such that we have the thread pointer in x4 and the value of RhpTrapThreads in the lower bits of x3 which are used in PUSH_PROBE_FRAME x4, x3, x12.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants