@saethlin
Member

@saethlin saethlin commented Oct 5, 2025

The query mir_callgraph_cyclic is supposed to find all callees that may lead to a recursive call back to the given LocalDefId. But that query was built using a function which recurses through the call graph and tries to handle hitting the recursion limit locally during the walk. That is wrong. If the recursion limit is encountered, the set may be incomplete and thus useless. If we hit the recursion limit, the only correct thing to do is bail.
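
To illustrate the idea of a fallible walk, here is a minimal sketch over a toy call graph. It is not the rustc implementation: `FnId`, `CallGraph`, `LimitReached`, `can_reach`, `cyclic_callees`, and `may_inline` are all hypothetical names invented for this example, and the choice to refuse inlining when the walk bails is an assumption. The point is only that hitting the depth limit surfaces as an error for the whole query, rather than as a silently truncated set.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical stand-ins for the compiler's instance ids and call graph.
type FnId = u32;
type CallGraph = HashMap<FnId, Vec<FnId>>;

/// Hypothetical error: the walk hit the depth limit, so no partial
/// result can be trusted.
#[derive(Debug, PartialEq, Eq)]
struct LimitReached;

/// Depth-limited search: can `node` reach `target` through calls?
/// Returns `Err` as soon as the limit is hit instead of guessing.
fn can_reach(
    graph: &CallGraph,
    target: FnId,
    node: FnId,
    depth: usize,
    limit: usize,
    seen: &mut HashSet<FnId>,
) -> Result<bool, LimitReached> {
    if node == target {
        return Ok(true);
    }
    // Already explored (or currently on the stack): nothing new to learn.
    if !seen.insert(node) {
        return Ok(false);
    }
    if depth >= limit {
        return Err(LimitReached);
    }
    if let Some(callees) = graph.get(&node) {
        for &callee in callees {
            // `?` propagates the bail-out all the way to the caller.
            if can_reach(graph, target, callee, depth + 1, limit, seen)? {
                return Ok(true);
            }
        }
    }
    Ok(false)
}

/// Direct callees of `root` that may lead back to `root`,
/// or `Err` if the walk could not be completed.
fn cyclic_callees(
    graph: &CallGraph,
    root: FnId,
    limit: usize,
) -> Result<HashSet<FnId>, LimitReached> {
    let mut cyclic = HashSet::new();
    if let Some(callees) = graph.get(&root) {
        for &callee in callees {
            let mut seen = HashSet::new();
            if can_reach(graph, root, callee, 1, limit, &mut seen)? {
                cyclic.insert(callee);
            }
        }
    }
    Ok(cyclic)
}

/// Assumed consumer policy: if the walk bailed, inline nothing.
fn may_inline(graph: &CallGraph, root: FnId, callee: FnId, limit: usize) -> bool {
    match cyclic_callees(graph, root, limit) {
        Ok(cyclic) => !cyclic.contains(&callee),
        Err(LimitReached) => false,
    }
}
```

With a small limit, a long chain that eventually calls back into the root yields `Err(LimitReached)` instead of an empty set, so a caller cannot mistake "did not finish" for "no cycles".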

Some benchmarks improve because for some functions we will bail out of the call graph walk faster. Some benchmarks regress because we do less inlining, but that is quite rare with the default recursion depth.

Originally I thought this might be a fix for #131960, but it turns out that it is actually a fix for #146998.

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Oct 5, 2025
@saethlin
Member Author

saethlin commented Oct 5, 2025

@bors try @rust-timer queue

rust-bors bot added a commit that referenced this pull request Oct 5, 2025
Make inliner cycle detection a fallible process
@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 5, 2025
@rust-bors
Contributor

rust-bors bot commented Oct 5, 2025

☀️ Try build successful (CI)
Build commit: 415a8b5 (415a8b5798d316514c13d80ab58b90dd5ff7785f, parent: 227ac7c3cd486872d5c2352b3df02b571500e53a)

@rust-timer
Collaborator

Finished benchmarking commit (415a8b5): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.3% | [0.3%, 0.3%] | 1 |
| Regressions ❌ (secondary) | 0.5% | [0.1%, 1.4%] | 17 |
| Improvements ✅ (primary) | -1.2% | [-2.3%, -0.7%] | 4 |
| Improvements ✅ (secondary) | -0.6% | [-5.0%, -0.1%] | 30 |
| All ❌✅ (primary) | -0.9% | [-2.3%, 0.3%] | 5 |

Max RSS (memory usage)

Results (primary -4.2%, secondary -3.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 2.7% | [0.9%, 4.8%] | 3 |
| Improvements ✅ (primary) | -4.2% | [-4.2%, -4.2%] | 1 |
| Improvements ✅ (secondary) | -4.3% | [-6.6%, -1.7%] | 13 |
| All ❌✅ (primary) | -4.2% | [-4.2%, -4.2%] | 1 |

Cycles

Results (secondary 0.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 3.3% | [2.1%, 4.3%] | 4 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -3.1% | [-4.8%, -1.1%] | 4 |
| All ❌✅ (primary) | - | - | 0 |

Binary size

Results (primary -0.0%, secondary 0.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.3% | [0.0%, 1.1%] | 4 |
| Regressions ❌ (secondary) | 0.5% | [0.0%, 1.1%] | 2 |
| Improvements ✅ (primary) | -0.1% | [-0.3%, -0.0%] | 9 |
| Improvements ✅ (secondary) | -0.1% | [-0.6%, -0.0%] | 14 |
| All ❌✅ (primary) | -0.0% | [-0.3%, 1.1%] | 13 |

Bootstrap: 470.078s -> 463.689s (-1.36%)
Artifact size: 388.29 MiB -> 387.90 MiB (-0.10%)

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Oct 5, 2025
@saethlin saethlin force-pushed the fallible-cycle-detection branch from 09c8d23 to fe71247 Compare October 5, 2025 18:43
@cjgillot
Contributor

cjgillot commented Oct 7, 2025

Would there be a way to reverse the logic? Make the query return the set of instances that are definitely fine to inline?

@saethlin
Member Author

saethlin commented Oct 7, 2025

Yeah I think the query could do that. We should probably return the set of recursive instances if we've finished the walk, because that is almost certainly smaller.

@cjgillot
Contributor

cjgillot commented Oct 8, 2025

Actually, I'm not sure it would be better. One benefit would be to keep inlining some callees even if we reach the recursion limit, but that may be a little too brittle to implement.

@saethlin
Member Author

@bors try jobs=x86_64-rust-for-linux,test-various

@rust-bors
Contributor

rust-bors bot commented Oct 13, 2025

🔒 Merge conflict

This pull request and the base branch diverged in a way that cannot
be automatically merged. Please rebase on top of the latest base
branch, and let the reviewer approve again.

How do I rebase?

Assuming self is your fork and upstream is this repository,
you can resolve the conflict following these steps:

  1. git checkout fallible-cycle-detection (switch to your branch)
  2. git fetch upstream HEAD (retrieve the latest base branch)
  3. git rebase upstream/HEAD -p (rebase on top of it)
  4. Follow the on-screen instructions to resolve conflicts (check git status if you get lost).
  5. git push self fallible-cycle-detection --force-with-lease (update this PR)

You may also read
Git Rebasing to Resolve Conflicts by Drew Blessing
for a short tutorial.

Please avoid the "Resolve conflicts" button on GitHub.
It uses git merge instead of git rebase which makes the PR commit history more difficult to read.

Sometimes step 4 will complete without asking for resolution. This is usually due to differences in how Cargo.lock conflicts are
handled during merge and rebase. This is normal, and you should still perform step 5 to update this PR.

@saethlin saethlin force-pushed the fallible-cycle-detection branch from fe71247 to 0c90884 Compare October 13, 2025 23:12
@saethlin
Member Author

@bors try jobs=x86_64-rust-for-linux,test-various

rust-bors bot added a commit that referenced this pull request Oct 14, 2025
Make inliner cycle detection a fallible process

try-job: x86_64-rust-for-linux
try-job: test-various
@rust-bors
Contributor

rust-bors bot commented Oct 14, 2025

☀️ Try build successful (CI)
Build commit: 48a4553 (48a4553529fd9a75e8655da7cd24adf2932bfc07, parent: 4b94758d2ba7d0ef71ccf5fde29ce4bc5d6fe2a4)

@saethlin saethlin force-pushed the fallible-cycle-detection branch from 0c90884 to 74e0a9a Compare December 31, 2025 03:05
@saethlin saethlin added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Dec 31, 2025
@saethlin saethlin marked this pull request as ready for review December 31, 2025 03:08
@rustbot
Collaborator

rustbot commented Dec 31, 2025

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

@rustbot
Collaborator

rustbot commented Dec 31, 2025

r? @SparrowLii

rustbot has assigned @SparrowLii.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@saethlin
Member Author

r? compiler

@rustbot rustbot assigned lcnr and unassigned SparrowLii Dec 31, 2025
@saethlin
Member Author

Whoops, I raced with rustbot.

@oli-obk
Contributor

oli-obk commented Dec 31, 2025

r? @oli-obk

@bors r+

@bors
Collaborator

bors commented Dec 31, 2025

📌 Commit cee7f5e has been approved by oli-obk

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 31, 2025
@bors
Collaborator

bors commented Dec 31, 2025

⌛ Testing commit cee7f5e with merge 2848c2e...

@bors
Collaborator

bors commented Dec 31, 2025

☀️ Test successful - checks-actions
Approved by: oli-obk
Pushing 2848c2e to main...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Dec 31, 2025
@bors bors merged commit 2848c2e into rust-lang:main Dec 31, 2025
12 checks passed
@rustbot rustbot added this to the 1.94.0 milestone Dec 31, 2025
@github-actions
Contributor

What is this? This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing 629b092 (parent) -> 2848c2e (this PR)

Test differences

Show 88 test diffs

Stage 1

  • [mir-opt] tests/mir-opt/inline/recursion_limit_prevents_cycle_discovery.rs: [missing] -> pass (J0)

Stage 2

  • [mir-opt] tests/mir-opt/inline/recursion_limit_prevents_cycle_discovery.rs: [missing] -> pass (J1)

Additionally, 86 doctest diffs were found. These are ignored, as they are noisy.

Job group index

Test dashboard

Run

cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard 2848c2ebe9a8a604cd63455263299d7258bc8252 --output-dir test-dashboard

And then open test-dashboard/index.html in your browser to see an overview of all executed tests.

Job duration changes

  1. dist-aarch64-apple: 8224.1s -> 6210.5s (-24.5%)
  2. pr-check-1: 1677.0s -> 1921.7s (+14.6%)
  3. x86_64-gnu-gcc: 3060.6s -> 3438.7s (+12.4%)
  4. aarch64-gnu-debug: 3851.6s -> 4307.9s (+11.8%)
  5. i686-gnu-nopt-1: 7387.9s -> 8254.3s (+11.7%)
  6. x86_64-gnu-tools: 3241.3s -> 3608.5s (+11.3%)
  7. test-various: 6654.9s -> 7387.1s (+11.0%)
  8. i686-gnu-2: 5319.3s -> 5889.9s (+10.7%)
  9. x86_64-gnu-aux: 7082.3s -> 7814.7s (+10.3%)
  10. x86_64-gnu-miri: 4454.2s -> 4903.7s (+10.1%)

How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.

@rust-timer
Collaborator

Finished benchmarking commit (2848c2e): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Our benchmarks found a performance regression caused by this PR.
This might be an actual regression, but it can also be just noise.

Next Steps:

  • If the regression was expected or you think it can be justified,
    please write a comment with sufficient written justification, and add
    @rustbot label: +perf-regression-triaged to it, to mark the regression as triaged.
  • If you think that you know of a way to resolve the regression, try to create
    a new PR with a fix for the regression.
  • If you do not understand the regression or you think that it is just noise,
    you can ask the @rust-lang/wg-compiler-performance working group for help (members of this group
    were already notified of this PR).

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.6% | [0.3%, 1.0%] | 4 |
| Regressions ❌ (secondary) | 0.8% | [0.1%, 1.3%] | 6 |
| Improvements ✅ (primary) | -1.0% | [-4.9%, -0.1%] | 8 |
| Improvements ✅ (secondary) | -0.6% | [-6.5%, -0.1%] | 18 |
| All ❌✅ (primary) | -0.5% | [-4.9%, 1.0%] | 12 |

Max RSS (memory usage)

Results (secondary -1.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -1.1% | [-1.1%, -1.1%] | 1 |
| All ❌✅ (primary) | - | - | 0 |

Cycles

Results (primary 5.2%, secondary 13.4%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 6.1% | [2.0%, 13.2%] | 16 |
| Regressions ❌ (secondary) | 15.9% | [2.0%, 30.1%] | 14 |
| Improvements ✅ (primary) | -2.1% | [-2.2%, -2.1%] | 2 |
| Improvements ✅ (secondary) | -3.8% | [-4.6%, -3.1%] | 2 |
| All ❌✅ (primary) | 5.2% | [-2.2%, 13.2%] | 18 |

Binary size

Results (primary 0.2%, secondary 0.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.6% | [0.0%, 1.3%] | 3 |
| Regressions ❌ (secondary) | 0.7% | [0.1%, 1.3%] | 2 |
| Improvements ✅ (primary) | -0.3% | [-0.5%, -0.1%] | 3 |
| Improvements ✅ (secondary) | -0.5% | [-0.8%, -0.2%] | 2 |
| All ❌✅ (primary) | 0.2% | [-0.5%, 1.3%] | 6 |

Bootstrap: 479.804s -> 473.231s (-1.37%)
Artifact size: 390.89 MiB -> 390.81 MiB (-0.02%)

@Kobzol
Member

Kobzol commented Jan 6, 2026

The nalgebra win looks real. This changes inlining behavior, so that always perturbs compile times.

@rustbot label: +perf-regression-triaged

@rustbot rustbot added the perf-regression-triaged The performance regression has been triaged. label Jan 6, 2026

Labels

merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. perf-regression-triaged The performance regression has been triaged. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
