Skip to content

fix(layout): dont wrap coroutines fields in MaybeUninit#153620

Closed
weihanglo wants to merge 3 commits intorust-lang:mainfrom
weihanglo:layout-iterative
Closed

fix(layout): dont wrap coroutines fields in MaybeUninit#153620
weihanglo wants to merge 3 commits intorust-lang:mainfrom
weihanglo:layout-iterative

Conversation

@weihanglo
Copy link
Member

@weihanglo weihanglo commented Mar 9, 2026

When computing coroutine layout, each saved local is wrapped in MaybeUninit, which chains through ManuallyDrop and MaybeDangling before reaching T. This adds 3 extra layout_of queries per field.

For deeply nested async functions (e.g. matrix-sdk with many #[instrument]-ed async calls), this extra MaybeUninit accumulated across all nesting levels and easily overflowed the default recursion limit (128 as of 1.94)

Instead, compute layout_of(T) directly and apply the two effects that MaybeUninit provided:

  • hide niches:
    uninitialized fields have no valid_range guarantees,
    so reset scalar ranges to their full width
    like other is_special_no_niche types.
  • mark variants inhabited:
    MaybeUninit hardcoded uninhabited: false, so we follow.
    Otherwise the field may be seen as unreachable and get optimized away

According to WaffleLapkin's analysis, the extra layer added 32 queries in the repro. Along with this fix, as we completely remove MaybeUninit, I guess the queries will be down to 131 - (32 * 3) = 35 queries. We save around 96 queries! See #153620 (comment)

Fixes #152942

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Mar 9, 2026
@rustbot
Copy link
Collaborator

rustbot commented Mar 9, 2026

WaffleLapkin is not on the review rotation at the moment.
They may take a while to respond.

@weihanglo weihanglo marked this pull request as ready for review March 9, 2026 17:37
@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 9, 2026
@weihanglo
Copy link
Member Author

😭 it failed.

@rust-log-analyzer

This comment has been minimized.

When computing coroutine layout,
each saved local is wrapped in `MaybeUninit<T>`,
which chains through `ManuallyDrop<T>`
and `MaybeDangling<T>` before reaching `T`.
This adds 3 extra layout_of queries per field.

For deeply nested async functions
(e.g. matrix-sdk with many `#[instrument]`-ed async calls),
this extra `MaybeUninit` accumulated across all nesting levels
and easily overflowed the default recursion limit (128 as of 1.94)

Issue: rust-lang/rust 152942
When computing coroutine layout,
each saved local is wrapped in `MaybeUninit<T>`,
which chains through `ManuallyDrop<T>`
and `MaybeDangling<T>` before reaching `T`.
This adds 3 extra layout_of queries per field.

For deeply nested async functions
(e.g. matrix-sdk with many `#[instrument]`-ed async calls),
this extra `MaybeUninit` accumulated across all nesting levels
and easily overflowed the default recursion limit (128 as of 1.94)

Instead, compute `layout_of(T)` directly and apply the two effects
that `MaybeUninit` provided:

* hide niches:
  uninitialized fields have no valid_range guarantees,
  so reset scalar ranges to their full width
  like other `is_special_no_niche` types.
* mark variants inhabited:
  `MaybeUninit` hardcoded `uninhabited: false`, so we follow.
  Otherwise the field may be seen as unreachable and get optimized away

Issue: rust-lang/rust 152942
@weihanglo weihanglo changed the title fix(layout): pre-compute coroutine field layouts iteratively fix(layout): dont wrap coroutines fields in MaybeUninit Mar 10, 2026
@weihanglo
Copy link
Member Author

weihanglo commented Mar 10, 2026

My repro.

async fn l1()  { l2().await }
async fn l2()  { l3().await }
async fn l3()  { l4().await }
async fn l4()  { l5().await }
async fn l5()  { l6().await }
async fn l6()  { l7().await }
async fn l7()  { l8().await }
async fn l8()  { l9().await }
async fn l9()  { l10().await }
async fn l10() { l11().await }
async fn l11() { l12().await }
async fn l12() { l13().await }
async fn l13() { l14().await }
async fn l14() { l15().await }
async fn l15() { l16().await }
async fn l16() { l17().await }
async fn l17() { l18().await }
async fn l18() { l19().await }
async fn l19() { l20().await }
async fn l20() { l21().await }
async fn l21() { l22().await }
async fn l22() { l23().await }
async fn l23() { l24().await }
async fn l24() { l25().await }
async fn l25() { l26().await }
async fn l26() { l27().await }
async fn l27() { l28().await }
async fn l28() { l29().await }
async fn l29() { l30().await }
async fn l30() { l31().await }
async fn l31() { l32().await }
async fn l32() { l33().await }
async fn l33() { l34().await }
async fn l34() { l35().await }
async fn l35() {}

fn main() { let _ = l1(); }

And with the latest nightly b41f22d

$ rustc +nightly --edition 2024 repro.rs
error: queries overflow the depth limit!
  |
  = help: consider increasing the recursion limit by adding a `#![recursion_limit = "256"]` attribute to your crate (`repro`)
  = note: query depth increased by 130 when computing layout of `{async fn body of l1()}`
  
error: aborting due to 1 previous error

With this PR:

$ rustc +stage1 --edition 2024 repro.rs -Z crate-attr='recursion_limit="70"'
# succeeded

All the way down to recursion_limit = "36", we got an unrelated one:

$ rustc +stage1 --edition 2024 repro.rs -Z crate-attr='recursion_limit="36"'
error[E0275]: overflow normalizing the opaque type `l19::{opaque#0}`
  |
  = help: consider increasing the recursion limit by adding a `#![recursion_limit = "72"]` attribute to your crate (`repro`)

error: aborting due to 1 previous error

And then at 35, we see the familiar error. We made it down to 35 queries!

$ rustc +stage1 --edition 2024 repro.rs -Z crate-attr='recursion_limit="35"'
error: queries overflow the depth limit!
  |
  = help: consider increasing the recursion limit by adding a `#![recursion_limit = "70"]` attribute to your crate (`repro`)
  = note: query depth increased by 37 when computing layout of `{async fn body of l1()}`

error: aborting due to 1 previous error

@weihanglo
Copy link
Member Author

cc @RalfJung I believe you are the expert of niche and variant inhabiting. Let me know if this fixes the issue correctly without changing any layout semantic :)

@RalfJung
Copy link
Member

RalfJung commented Mar 10, 2026

I don't know much about enum layout handling and niches, no.

However, what you are describing sounds problematic from the opsem perspective. Coroutine fields can be uninitialized. If you make the field types lie about the values stored there, that'd be UB. I am not sure I understand the PR description.

@RalfJung
Copy link
Member

RalfJung commented Mar 10, 2026

MaybeUninit, which chains through ManuallyDrop and MaybeDangling before reaching T. This adds 3 extra layout_of queries per field.

This sounds like it'd affect all MaybeUninit usage, not just coroutines. Why is it a good idea to fix this in a coroutine-specific way?

so reset scalar ranges to their full width

unions like MaybeUninit don't just reset the range to the full width, they use Scalar::Union to indicate that the scalar may be uninitialized.

let mut data = rustc_abi::LayoutData::clone(&layout.layout.0);
data.hide_niches(cx.data_layout());
data.uninhabited = false;
Ok(TyAndLayout { ty: layout.ty, layout: tcx.mk_layout(data) })
Copy link
Member

@RalfJung RalfJung Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is a bad idea to lie about the type here. Every part of the compiler that traverses types through their fields and assumes that the types say something reasonable about the data will have to be special-cased. In its current form this is just plain unsound, and even if you fix all parts of the compiler that do such traversals today, you are setting up a massive footgun for future type traversals.

Copy link
Member

@RalfJung RalfJung Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are at least 3 soundness bugs in this PR as-is:

  • It fails to mark potentially-uninitialized fields as Scalar::Uninitialized.
  • The type validation logic in Miri will still recurse into this field using its reported type and then report UB due to things being uninitialized.
  • The MaybeDangling logic @WaffleLapkin recently added to the code that determines LLVM attributes is also bypassed and no longer correctly applies to these fields.

It is also IMO just a bad idea to duplicate extemely subtle correctness-critical code such as what exactly MaybeUninit does to type layout and validity. Code duplication is obviously always bad but it is even worse when there are non-local invariants involved like here (as demonstrated by the fact that your PR duplicates the existing logic incorrectly).

So, playing whack-a-mole with soundness bugs is IMO not the way to progress here. We need a fundamentally different approach.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Ralf, this special case looks very error prone.

I think at most we could special case MaybeDangling in layout_of_union but even that feels questionable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could make MaybeUninit a language primitive... though I am not sure how to best use that to avoid the nesting in the field types.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking a look, folks! And sorry that this is a bad fix.

I was looking at these pull requests:

and thought MaybeUninit was used to mark inhabited = true. I knew I must have missed a lot of things 😞. Anyway, the reviews here are useful for understanding how load-bearing this part of the code is.

@oli-obk
Copy link
Contributor

oli-obk commented Mar 10, 2026

@bors try @rust-timer queue

just out of curiosity

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rust-bors
Copy link
Contributor

rust-bors bot commented Mar 10, 2026

⌛ Trying commit d98c014 with merge c5e2500

To cancel the try build, run the command @bors try cancel.

Workflow: https://github.com/rust-lang/rust/actions/runs/22893188397

rust-bors bot pushed a commit that referenced this pull request Mar 10, 2026
fix(layout): dont wrap coroutines fields in MaybeUninit
@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 10, 2026
@rust-bors rust-bors bot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 10, 2026
@rust-bors
Copy link
Contributor

rust-bors bot commented Mar 10, 2026

💥 Test timed out after 21600s

@weihanglo
Copy link
Member Author

Closing as this is a pre-mature fix that didn't consider some other parts of effects that MaybeUninit has. See Ralf's comment for more #153620 (comment).

@weihanglo weihanglo closed this Mar 10, 2026
@rustbot rustbot removed the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Mar 10, 2026
@weihanglo
Copy link
Member Author

For further discussions of other possibilities, please look at the original issue #152942 so we can consolidate all discussions. Thank you.

@weihanglo weihanglo deleted the layout-iterative branch March 10, 2026 15:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

S-waiting-on-perf Status: Waiting on a perf run to be completed. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Query depth limit overflow when building matrix_sdk

7 participants