fix(layout): dont wrap coroutines fields in MaybeUninit#153620
fix(layout): dont wrap coroutines fields in MaybeUninit#153620weihanglo wants to merge 3 commits intorust-lang:mainfrom
Conversation
|
|
|
😭 it failed. |
This comment has been minimized.
This comment has been minimized.
When computing coroutine layout, each saved local is wrapped in `MaybeUninit<T>`, which chains through `ManuallyDrop<T>` and `MaybeDangling<T>` before reaching `T`. This adds 3 extra layout_of queries per field. For deeply nested async functions (e.g. matrix-sdk with many `#[instrument]`-ed async calls), this extra `MaybeUninit` accumulated across all nesting levels and easily overflowed the default recursion limit (128 as of 1.94) Issue: rust-lang/rust 152942
When computing coroutine layout, each saved local is wrapped in `MaybeUninit<T>`, which chains through `ManuallyDrop<T>` and `MaybeDangling<T>` before reaching `T`. This adds 3 extra layout_of queries per field. For deeply nested async functions (e.g. matrix-sdk with many `#[instrument]`-ed async calls), this extra `MaybeUninit` accumulated across all nesting levels and easily overflowed the default recursion limit (128 as of 1.94) Instead, compute `layout_of(T)` directly and apply the two effects that `MaybeUninit` provided: * hide niches: uninitialized fields have no valid_range guarantees, so reset scalar ranges to their full width like other `is_special_no_niche` types. * mark variants inhabited: `MaybeUninit` hardcoded `uninhabited: false`, so we follow. Otherwise the field may be seen as unreachable and get optimized away Issue: rust-lang/rust 152942
9a9d0fb to
d98c014
Compare
|
My repro. async fn l1() { l2().await }
async fn l2() { l3().await }
async fn l3() { l4().await }
async fn l4() { l5().await }
async fn l5() { l6().await }
async fn l6() { l7().await }
async fn l7() { l8().await }
async fn l8() { l9().await }
async fn l9() { l10().await }
async fn l10() { l11().await }
async fn l11() { l12().await }
async fn l12() { l13().await }
async fn l13() { l14().await }
async fn l14() { l15().await }
async fn l15() { l16().await }
async fn l16() { l17().await }
async fn l17() { l18().await }
async fn l18() { l19().await }
async fn l19() { l20().await }
async fn l20() { l21().await }
async fn l21() { l22().await }
async fn l22() { l23().await }
async fn l23() { l24().await }
async fn l24() { l25().await }
async fn l25() { l26().await }
async fn l26() { l27().await }
async fn l27() { l28().await }
async fn l28() { l29().await }
async fn l29() { l30().await }
async fn l30() { l31().await }
async fn l31() { l32().await }
async fn l32() { l33().await }
async fn l33() { l34().await }
async fn l34() { l35().await }
async fn l35() {}
fn main() { let _ = l1(); }And with the latest nightly b41f22d $ rustc +nightly --edition 2024 repro.rs
error: queries overflow the depth limit!
|
= help: consider increasing the recursion limit by adding a `#![recursion_limit = "256"]` attribute to your crate (`repro`)
= note: query depth increased by 130 when computing layout of `{async fn body of l1()}`
error: aborting due to 1 previous errorWith this PR: $ rustc +stage1 --edition 2024 repro.rs -Z crate-attr='recursion_limit="70"'
# succeededAll the way down to $ rustc +stage1 --edition 2024 repro.rs -Z crate-attr='recursion_limit="36"'
error[E0275]: overflow normalizing the opaque type `l19::{opaque#0}`
|
= help: consider increasing the recursion limit by adding a `#![recursion_limit = "72"]` attribute to your crate (`repro`)
error: aborting due to 1 previous errorAnd then at 35, we see the familiar error. We made it down to 35 queries! $ rustc +stage1 --edition 2024 repro.rs -Z crate-attr='recursion_limit="35"'
error: queries overflow the depth limit!
|
= help: consider increasing the recursion limit by adding a `#![recursion_limit = "70"]` attribute to your crate (`repro`)
= note: query depth increased by 37 when computing layout of `{async fn body of l1()}`
error: aborting due to 1 previous error |
|
cc @RalfJung I believe you are the expert of niche and variant inhabiting. Let me know if this fixes the issue correctly without changing any layout semantic :) |
|
I don't know much about enum layout handling and niches, no. However, what you are describing sounds problematic from the opsem perspective. Coroutine fields can be uninitialized. If you make the field types lie about the values stored there, that'd be UB. I am not sure I understand the PR description. |
This sounds like it'd affect all MaybeUninit usage, not just coroutines. Why is it a good idea to fix this in a coroutine-specific way?
unions like MaybeUninit don't just reset the range to the full width, they use |
| let mut data = rustc_abi::LayoutData::clone(&layout.layout.0); | ||
| data.hide_niches(cx.data_layout()); | ||
| data.uninhabited = false; | ||
| Ok(TyAndLayout { ty: layout.ty, layout: tcx.mk_layout(data) }) |
There was a problem hiding this comment.
I think it is a bad idea to lie about the type here. Every part of the compiler that traverses types through their fields and assumes that the types say something reasonable about the data will have to be special-cased. In its current form this is just plain unsound, and even if you fix all parts of the compiler that do such traversals today, you are setting up a massive footgun for future type traversals.
There was a problem hiding this comment.
I think there are at least 3 soundness bugs in this PR as-is:
- It fails to mark potentially-uninitialized fields as
Scalar::Uninitialized. - The type validation logic in Miri will still recurse into this field using its reported type and then report UB due to things being uninitialized.
- The MaybeDangling logic @WaffleLapkin recently added to the code that determines LLVM attributes is also bypassed and no longer correctly applies to these fields.
It is also IMO just a bad idea to duplicate extemely subtle correctness-critical code such as what exactly MaybeUninit does to type layout and validity. Code duplication is obviously always bad but it is even worse when there are non-local invariants involved like here (as demonstrated by the fact that your PR duplicates the existing logic incorrectly).
So, playing whack-a-mole with soundness bugs is IMO not the way to progress here. We need a fundamentally different approach.
There was a problem hiding this comment.
I agree with Ralf, this special case looks very error prone.
I think at most we could special case MaybeDangling in layout_of_union but even that feels questionable.
There was a problem hiding this comment.
We could make MaybeUninit a language primitive... though I am not sure how to best use that to avoid the nesting in the field types.
There was a problem hiding this comment.
Thanks for taking a look, folks! And sorry that this is a bad fix.
I was looking at these pull requests:
- Coroutine variant fields can be uninitialized #118871
- Supress niches in coroutines to avoid aliasing violations #129313
and thought MaybeUninit was used to mark inhabited = true. I knew I must have missed a lot of things 😞. Anyway, the reviews here are useful for understanding how load-bearing this part of the code is.
|
@bors try @rust-timer queue just out of curiosity |
|
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
|
⌛ Trying commit d98c014 with merge c5e2500… To cancel the try build, run the command Workflow: https://github.com/rust-lang/rust/actions/runs/22893188397 |
fix(layout): dont wrap coroutines fields in MaybeUninit
|
💥 Test timed out after |
|
Closing as this is a pre-mature fix that didn't consider some other parts of effects that MaybeUninit has. See Ralf's comment for more #153620 (comment). |
|
For further discussions of other possibilities, please look at the original issue #152942 so we can consolidate all discussions. Thank you. |
When computing coroutine layout, each saved local is wrapped in
MaybeUninit, which chains throughManuallyDropandMaybeDanglingbefore reachingT. This adds 3 extra layout_of queries per field.For deeply nested async functions (e.g. matrix-sdk with many
#[instrument]-ed async calls), this extraMaybeUninitaccumulated across all nesting levels and easily overflowed the default recursion limit (128 as of 1.94)Instead, compute
layout_of(T)directly and apply the two effects thatMaybeUninitprovided:uninitialized fields have no valid_range guarantees,
so reset scalar ranges to their full width
like other
is_special_no_nichetypes.MaybeUninithardcodeduninhabited: false, so we follow.Otherwise the field may be seen as unreachable and get optimized away
According to WaffleLapkin's analysis, the extra layer added 32 queries in the repro. Along with this fix, as we completely remove
MaybeUninit, I guess the queries will be down to131 - (32 * 3) = 35queries. We save around 96 queries! See #153620 (comment)Fixes #152942