Skip to content

[Bug #21941] YJIT: Fix local variables read as nil when EP escapes before YJIT is enabled#16306

Closed
paracycle wants to merge 1 commit intoruby:masterfrom
Shopify:uk-fix-yjit-ep-escape
Closed

[Bug #21941] YJIT: Fix local variables read as nil when EP escapes before YJIT is enabled#16306
paracycle wants to merge 1 commit intoruby:masterfrom
Shopify:uk-fix-yjit-ep-escape

Conversation

@paracycle
Copy link
Copy Markdown
Contributor

@paracycle paracycle commented Mar 4, 2026

Summary

Fixes a bug where local variables are read as nil in JIT-compiled code when RubyVM::YJIT.enable is called mid-execution (e.g. from a Rack config.ru while Puma's server loop is already on the call stack).

Bug

The bug requires three ingredients:

  1. A method is already on the call stack when RubyVM::YJIT.enable is called.
  2. Before YJIT.enable, something causes the method's environment pointer (EP) to escape to the heap (e.g., creating a lambda/proc that captures the environment). This calls vm_make_env_each, which copies local variable values from the stack to a heap-allocated environment and updates cfp->ep to point to the heap copy. The original stack slots (accessed via vm_base_ptr(cfp)) become stale.
  3. The method uses next inside a begin/rescue block wrapped in a begin/ensure block. This compiles to a throw instruction that enters YJIT through the jit_exception code path.

When vm_make_env_each runs, it calls rb_yjit_invalidate_ep_is_bp to notify YJIT that EP has escaped for this ISEQ. However, when YJIT was started with --yjit-disable (or not yet initialized), the INVARIANTS global is None, so rb_yjit_invalidate_ep_is_bp returns early as a no-op and the EP escape is never recorded.

Later, when YJIT compiles the jit_exception entry point, it checks iseq_escapes_ep(iseq) which returns false (no record of EP escape), so it assumes EP == BP and generates SP-based local variable access. SP-based access reads from vm_base_ptr(cfp) which points to the original (stale) stack location containing nil values, instead of cfp->ep which points to the heap-allocated environment with the real values.

Fix

Adds a runtime check in gen_entry_point: before compiling an entry, examine cfp->ep for the VM_ENV_FLAG_ESCAPED flag. If the flag is set but YJIT has no record of the escape, call rb_yjit_invalidate_ep_is_bp to register it. This causes subsequent code generation to use EP-based local access which correctly reads from the heap-allocated environment.

Test

Adds a regression test that reproduces the exact scenario: a lambda is created (causing EP escape), YJIT.enable is called mid-method, and next inside begin/rescue/ensure triggers the jit_exception path. Without the fix, i becomes nil after ~30 iterations when YJIT compiles the exception handler.


NOTE: The bug was found and the fix was made by Claude Opus 4.6 under my guidance and using the minimal reproduction script in the bug ticket.

@matzbot matzbot requested a review from a team March 4, 2026 23:11
@paracycle paracycle changed the title YJIT: Fix local variables read as nil when EP escapes before YJIT is enabled [Bug #124920] YJIT: Fix local variables read as nil when EP escapes before YJIT is enabled Mar 4, 2026
@paracycle paracycle force-pushed the uk-fix-yjit-ep-escape branch from cc02218 to 86557d2 Compare March 4, 2026 23:22
@paracycle paracycle changed the title [Bug #124920] YJIT: Fix local variables read as nil when EP escapes before YJIT is enabled [Bug #21941] YJIT: Fix local variables read as nil when EP escapes before YJIT is enabled Mar 4, 2026
@paracycle paracycle force-pushed the uk-fix-yjit-ep-escape branch from 86557d2 to 0111e3e Compare March 4, 2026 23:26
yjit/src/core.rs Outdated
let ep = unsafe { get_cfp_ep(cfp) };
let ep_flags = unsafe { (*ep).0 };
if ep_flags & (VM_ENV_FLAG_ESCAPED as usize) != 0 {
rb_yjit_invalidate_ep_is_bp(iseq);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is wrong in two ways:

  • rb_yjit_invalidate_ep_is_bp() is only meant to be called by the interpreter and runtime code. The weird reentrance from here when we're in the compiler might create issues with &mut on global states. At minimum it's doing more work than it should -- we are not looking to invalidate anything here, just to change what iseq_escapes_ep() returns
  • This makes the false assumption that the running CFP is the one being compiled. That's not necessarily the case, since YJIT can look at multiple iseqs during one compile run. This might fix iseq_escapes_ep() for the running iseq but we might have the same issue elsewhere.

We should figure out a way to have iseq_escapes_ep() be reliable in face of delayed enablement. Or use a more correct signal. I get the feeling that the correct way to interpret the return value from iseq_escapes_ep() is "yes" and "don't know" and YJIT was misinterpreting "don't know" as "never".

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review! You're right on both counts.

I've reworked the fix based on your insight that iseq_escapes_ep() returns "yes" and "don't know", not "yes" and "no".

The new approach:

  • Added a YJIT_ENABLED_AT_BOOT flag that's set in rb_yjit_init(true) (the --yjit path) but not in rb_yjit_enable() (the lazy path).
  • Changed iseq_escapes_ep() to use !yjit_enabled_at_boot_p() as the default for untracked ISEQs. When YJIT was enabled at boot, "not in map" means "never escaped" (safe). When YJIT was enabled lazily, "not in map" means "don't know" → conservatively report as escaped.
  • Removed the gen_entry_point hack entirely — no more calling rb_yjit_invalidate_ep_is_bp() from the compiler, no more assuming the running CFP is the one being compiled.

This is entirely within YJIT's Rust code. The EP==BP optimization is only lost for ISEQs that existed before lazy enablement and haven't had any block compiled yet — which is the correct conservative behavior since we can't know if their EP escaped before we started watching.

…fore YJIT is enabled

When YJIT is enabled lazily via RubyVM::YJIT.enable (or --yjit-disable),
EP escape notifications from vm_make_env_each() are silently dropped
because the YJIT invariants system is not yet initialized. This causes
iseq_escapes_ep() to return false for ISEQs whose EP has already escaped,
leading YJIT to generate SP-based local variable access that reads stale
nil values instead of the correct heap-allocated environment.

Fix by tracking whether YJIT was enabled at boot. When it was not,
iseq_escapes_ep() conservatively returns true for ISEQs with no tracking
entry, since we cannot know whether their EP escaped before YJIT started
observing.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants