Streamline active job collection. by nnethercote · Pull Request #153650 · rust-lang/rust

nnethercote · 2026-03-10T10:50:44Z

collect_active_jobs_from_all_queries takes a require_complete bool, and then some callers expect a full map result while others allow a partial map result. The end result is four possible combinations, but only three of them are used/make sense.

This commit introduces CollectActiveJobsKind, a three-value enum that describes the three sensible combinations, and rewrites collect_active_jobs_from_all_queries around it. This makes it and its call sites much clearer, and removes the weird Option<()> and Result<QueryJobMap, QueryJobMap> return types.

Other changes of note:

- active is removed. The comment about make_frame is out of date, and create_deferred_query_stack_frame is safe to call with the query state locked.
- When shard locking failure is allowed, collection no longer stops on the first failed shard.
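
To make the last point concrete, here is a minimal sketch of the difference between stopping at the first shard that cannot be locked and skipping it. It is not the rustc implementation: `Shard`, both function names, and the use of `std::sync::Mutex` in place of the real sharded query state are assumptions made for illustration.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Hypothetical stand-in: a "shard" is a mutex-protected map of job id -> query name.
type Shard = Mutex<HashMap<u32, &'static str>>;

/// Gives up at the first shard that cannot be locked.
/// Returns the jobs gathered so far and whether the map is complete.
fn collect_stop_at_first_failure(shards: &[Shard]) -> (HashMap<u32, &'static str>, bool) {
    let mut jobs = HashMap::new();
    for shard in shards {
        match shard.try_lock() {
            Ok(guard) => jobs.extend(guard.iter().map(|(k, v)| (*k, *v))),
            Err(_) => return (jobs, false),
        }
    }
    (jobs, true)
}

/// Skips shards that cannot be locked and keeps visiting the rest.
fn collect_skipping_failures(shards: &[Shard]) -> (HashMap<u32, &'static str>, bool) {
    let mut jobs = HashMap::new();
    let mut complete = true;
    for shard in shards {
        match shard.try_lock() {
            Ok(guard) => jobs.extend(guard.iter().map(|(k, v)| (*k, *v))),
            Err(_) => complete = false,
        }
    }
    (jobs, complete)
}

fn main() {
    let shards: Vec<Shard> = vec![
        Mutex::new(HashMap::from([(1, "type_of")])),
        Mutex::new(HashMap::from([(2, "typeck")])),
        Mutex::new(HashMap::from([(3, "layout_of")])),
    ];
    // Hold the middle shard's lock to simulate contention.
    let _held = shards[1].lock().unwrap();

    let (stopped, _) = collect_stop_at_first_failure(&shards);
    let (skipped, complete) = collect_skipping_failures(&shards);
    println!("stop at first failure: {} job(s)", stopped.len());
    println!("skip failures: {} job(s), complete = {}", skipped.len(), complete);
}
```

With one contended shard in the middle, the skipping variant still reports the jobs from the shards after it, which is presumably what a best-effort diagnostic wants.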

nnethercote · 2026-03-10T10:51:04Z

cc @Zalathar @Zoxc @zetanumbers

Zalathar · 2026-03-10T11:44:40Z

Is it correct to radically change the shard locking behaviour like this?

petrochenkov · 2026-03-10T13:45:59Z

Looks like this effectively reverts #148777, which was a bugfix specifically relying on calling lock_shards instead of try_lock_shards.
cc @ywxt to confirm

> Having the same thing in both the Ok and Err cases was strange.

I disagree, the first thing is a good map, the second thing is an erroneous map for diagnostics.
Result here works as usual and doesn't allow forgetting about the error checking, unlike the boolean.
(Replacing Option<()> with bools seems ok.)
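
The argument for keeping `Result` can be sketched as follows. This is an illustration only: `JobMap` is a hypothetical stand-in for `QueryJobMap`, and `collect` with its `all_shards_locked` flag is an invented helper, not the real function.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the real `QueryJobMap`.
type JobMap = HashMap<u32, &'static str>;

/// `Ok` carries a complete map; `Err` carries whatever partial map was gathered.
/// `all_shards_locked` is an invented flag that simulates lock success or failure.
fn collect(all_shards_locked: bool) -> Result<JobMap, JobMap> {
    let mut map = JobMap::new();
    map.insert(1, "type_of");
    if all_shards_locked {
        map.insert(2, "typeck");
        Ok(map)
    } else {
        Err(map)
    }
}

fn main() {
    // A caller that needs the full map has to say so explicitly.
    let full = collect(true).expect("job map should be complete");
    // A diagnostics-only caller has to opt in to the partial map explicitly.
    let partial = collect(false).unwrap_or_else(|m| m);
    println!("full: {}, partial: {}", full.len(), partial.len());
}
```

Because the map is only reachable through the `Result`, each caller has to decide what an incomplete map means for it. A `(JobMap, bool)` return would let a caller use the map and silently ignore the flag.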

nnethercote · 2026-03-10T19:47:59Z

Thanks for the info! Looks like making this a draft PR was the right decision :)

I still don't like the code structure. Even with the comments it was strange enough that I misunderstood what it was doing. I'll try to find another approach to improve that while preserving the try_lock/lock distinction.

I also wonder why a test wasn't added in #148777. My understanding of parallel front-end testing is poor, this might force me to look into it.

petrochenkov · 2026-03-10T20:01:32Z

The testing isn't even merged yet, the relevant PR is #143953.
Zulip thread: #t-compiler/parallel-rustc > Add the parallel front-end test suite

nnethercote · 2026-03-10T22:49:29Z

The require_complete boolean plus the use/non-use of expect/unwrap on the result means there are four possible scenarios for this function.

// - unwrap: need the full map
// - true => lock_shards: willing to wait for locks, i.e. there might be
//   contention(?)
// - used from:
//   - depth_limit_error
collect_active_jobs_from_all_queries(tcx, true).unwrap()

// - unwrap: need the full map
// - false => try_lock_shards: not willing to wait for locks, i.e. we assume
//   there is no contention(?)
// - used from:
//   - the deadlock handler (in `rustc_in_thread_pool_with_globals`)
//   - cycle_error
collect_active_jobs_from_all_queries(tcx, false).unwrap()

// - unwrap_or_else: partial map is acceptable
// - true => lock_shards: willing to wait for locks, i.e. there might be
//   contention
// - used from:
//   - (nowhere)
collect_active_jobs_from_all_queries(tcx, true).unwrap_or_else(|m| m)

// - unwrap_or_else: partial map is acceptable
// - false => try_lock_shards: not willing to wait for locks, i.e. we assume
//   there is no contention(?)
// - used from:
//   - print_query_stack
collect_active_jobs_from_all_queries(tcx, false).unwrap_or_else(|m| m)

Only three of them are used in practice. Maybe require_complete should be replaced by a three-value enum like this:

enum CollectKind {
    Full,
    FullNoContention,
    PartialAllowed,
}

and collect_active_jobs_from_all_queries could just return the (full or partial) QueryJobMap instead of a Result.
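
A rough sketch of that shape, under stated assumptions: `JobMap`, `Shard`, and `collect_active_jobs` are hypothetical stand-ins built on `std::sync::Mutex`, and the real function takes a `tcx` and walks rustc's sharded query state rather than a slice of mutexes. Only the three variant names come from the proposal above.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Hypothetical stand-ins for `QueryJobMap` and a query-state shard.
type JobMap = HashMap<u32, &'static str>;
type Shard = Mutex<JobMap>;

#[derive(Clone, Copy, Debug, PartialEq)]
enum CollectKind {
    /// Wait for every shard lock; the result is always complete.
    Full,
    /// Do not wait; a shard that cannot be locked is treated as a bug.
    FullNoContention,
    /// Do not wait; shards that cannot be locked are skipped.
    PartialAllowed,
}

/// Returns the map directly; completeness is implied by `kind` rather than by a `Result`.
fn collect_active_jobs(shards: &[Shard], kind: CollectKind) -> JobMap {
    let mut map = JobMap::new();
    for shard in shards {
        match kind {
            CollectKind::Full => {
                let guard = shard.lock().unwrap();
                map.extend(guard.iter().map(|(k, v)| (*k, *v)));
            }
            CollectKind::FullNoContention => {
                let guard = shard.try_lock().expect("shard unexpectedly locked");
                map.extend(guard.iter().map(|(k, v)| (*k, *v)));
            }
            CollectKind::PartialAllowed => {
                if let Ok(guard) = shard.try_lock() {
                    map.extend(guard.iter().map(|(k, v)| (*k, *v)));
                }
            }
        }
    }
    map
}

fn main() {
    let shards: Vec<Shard> = vec![
        Mutex::new(HashMap::from([(1, "type_of")])),
        Mutex::new(HashMap::from([(2, "typeck")])),
        Mutex::new(HashMap::from([(3, "layout_of")])),
    ];
    let full = collect_active_jobs(&shards, CollectKind::Full);
    println!("full: {} job(s)", full.len());

    // Hold one shard's lock to simulate contention for the best-effort case.
    let _held = shards[0].lock().unwrap();
    let partial = collect_active_jobs(&shards, CollectKind::PartialAllowed);
    println!("partial: {} job(s)", partial.len());
}
```

In this sketch the caller states its intent once, at the call site, and the unused "wait for locks but accept a partial map" combination simply has no variant.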

Zalathar · 2026-03-11T00:21:39Z

I think the bottom two cases are confused. The print_query_stack function passes false.

Also IIRC, when true is passed, the function will always return Ok, so the true/unwrap-or-else combination is unused because it's inherently useless.

nnethercote · 2026-03-11T01:22:06Z

> I think the bottom two cases are confused. The print_query_stack function passes false.

Fixed, thanks.

> Also IIRC, when true is passed, the function will always return Ok, so the true/unwrap-or-else combination is unused because it's inherently useless.

Good point. All the more reason for the enum.

ywxt · 2026-03-11T01:24:45Z

require_complete=true is only used in depth_limit_error, where it ensures that we can get the depth and root of the queries from a complete job map.

> Remove the require_complete parameter. We now always use try_lock_shards and let the caller abort on incompleteness

I don't think it's acceptable for the compiler to panic while reporting the depth limit error.

ywxt · 2026-03-11T01:28:07Z

> enum CollectKind {
>     Full,
>     FullNoContention,
>     PartialAllowed,
> }

That's better than a bool + Result 👍

nnethercote · 2026-03-11T06:34:51Z

I have updated the code to use the three-value enum. I'm not 100% sure about the names, happy to hear alternative suggestions there.

zetanumbers · 2026-03-11T08:31:47Z

> I also wonder why a test wasn't added in #148777. My understanding of parallel front-end testing is poor, this might force me to look into it.

Tests for A-parallel-compiler bugs are nearly impossible to make consistent. They shouldn't be relied upon unless you can reproduce the test failure fairly consistently on your branch's base commit and can no longer reproduce it on the branch itself. I have even fixed bugs without being able to reproduce the A-parallel-compiler issue to begin with, just by diagnosing the underlying problem.


Zalathar mentioned this pull request Mar 11, 2026

Introduce for_each_query_vtable! to move more code out of query macros #153685


nnethercote force-pushed the collect_active-FIDDLING branch from cb7eef1 to 48bfc79 on March 11, 2026 06:32


nnethercote force-pushed the collect_active-FIDDLING branch from 48bfc79 to 454e301 on March 11, 2026 11:11
