[async-compile] add progressive compile mode by bobrenjc93 · Pull Request #157305 · pytorch/pytorch

bobrenjc93 · 2025-06-30T20:36:39Z

Stack from ghstack (oldest at bottom):

-> [async-compile] add progressive compile mode #157305

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov

[ghstack-poisoned]

pytorch-bot · 2025-06-30T20:36:43Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/157305

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki or our office hours

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (2 Unrelated Failures)

As of commit 5392190 with merge base c808af5:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

trunk / macos-py3-arm64 / test (default, 3, 3, macos-m1-stable) (gh) (disabled by #148644 but the issue was closed recently and a rebase is needed to make it pass)
dynamo/test_decorators.py::DecoratorTests::test_set_stance_aot_eager_then_compile

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

pull / cuda12.8-py3.10-gcc9-sm75 / test (pr_time_benchmarks, 1, 1, linux.g4dn.metal.nvidia.gpu, unstable) (gh) (#153987)
MISSING REGRESSION TEST

This comment was automatically generated by Dr. CI and updates every 15 minutes.

[ghstack-poisoned]

ghstack-source-id: 03ce1e8 Pull-Request-resolved: #157305

[ghstack-poisoned]

ghstack-source-id: b4e227e Pull-Request-resolved: #157305

test/inductor/test_compile_subprocess.py

torch/_inductor/compile_fx.py

torch/_inductor/compile_fx_async.py

[ghstack-poisoned]

aorenste

Future test needed: we need to ensure that an optimization that is still running doesn't block the app from exiting. I think I saw issues with this with subprocess and also with RE.
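One possible shape for such a test, as a rough sketch only: the child program below stands in for an app that kicks off a long-running background optimization and then returns from main. The daemon thread is an assumption used for illustration; the subprocess pool and RE paths mentioned above have their own shutdown behavior, which this sketch does not exercise. `CHILD` and `run_child` are hypothetical names.

```python
import subprocess
import sys
import time

# Hypothetical child program: starts a 60 s background "optimization" on a
# daemon thread, then lets the main thread finish immediately.
CHILD = """
import threading, time
threading.Thread(target=time.sleep, args=(60,), daemon=True).start()
print("main done")
"""


def run_child(timeout_s: float = 30.0):
    """Run the child and return (returncode, stdout, elapsed_seconds).

    subprocess.run raises TimeoutExpired if the child is still alive after
    timeout_s, which is how a blocked exit would show up.
    """
    start = time.monotonic()
    proc = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    return proc.returncode, proc.stdout.strip(), time.monotonic() - start
```

A real regression test would replace the daemon thread with an actual progressive compile and keep the same "process exits well before the background work finishes" assertion.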

aorenste · 2025-07-03T18:50:36Z

torch/_inductor/compile_fx_async.py

+        ):
+            future = self._progression_futures[i]
+            if future and future.done():
+                self._switch_to_progression_stage(i)


future nit: if we used a deque for _progression_futures, we could pop the finished ones off the front and wouldn't even have to check for None.

That would mean we are forced to linearize the progressions though, right? Especially in a world with RE and action caching, it seems ideal to "skip" some futures if they are slow.

E.g., imagine:

compile 1 (expected 10 min, actual 10 min)
compile 2 (expected 20 min, actual 60 min due to queuing)
compile 3 (expected 20 min, actual 20 s due to action caching)

We would want compile 3 to kick in before compile 2.

I would think there should be no difference between using a deque and setting earlier ones to None (I'm assuming you can scan the deque like a list and not just pop_front).

Will factor this out in a separate PR.
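The point about scanning a deque can be illustrated with a small sketch. It is not the code from this PR: `pop_latest_finished` and the `(stage, future)` pairing are hypothetical, and plain `concurrent.futures.Future` objects stand in for the real compile futures. It picks the most advanced finished stage and drops everything at or before it, so a slow middle stage is skipped instead of blocking later ones.

```python
from collections import deque
from concurrent.futures import Future
from typing import Deque, Optional, Tuple


def pop_latest_finished(pending: Deque[Tuple[int, Future]]) -> Optional[int]:
    """Return the highest finished stage index, or None if nothing is done.

    Entries up to and including the chosen stage are dropped: earlier stages
    are superseded even if they are still running.
    """
    best = None
    for pos, (_stage, fut) in enumerate(pending):  # a deque scans like a list
        if fut.done():
            best = pos
    if best is None:
        return None
    stage = pending[best][0]
    for _ in range(best + 1):
        pending.popleft()
    return stage


# Compile 1 and compile 3 have finished; compile 2 is still queued.
futs = [Future(), Future(), Future()]
pending = deque(enumerate(futs))
futs[0].set_result("compile 1")
futs[2].set_result("compile 3")
chosen = pop_latest_finished(pending)  # stage index 2, i.e. compile 3
```

With this shape there is no None placeholder to check: superseded entries simply leave the deque.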

aorenste · 2025-07-03T18:53:45Z

torch/_inductor/compile_fx_async.py

+        if pcd := self._post_compile_data:
+            # Only clear post_compile_data if this is the final progression stage
+            if stage_index == len(self._progression_futures) - 1:
+                self._post_compile_data = None


nit: could clear _callback too, I think

I'll factor this out into a dataclass in a separate PR, as discussed offline.
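A minimal sketch of what grouping this state into one dataclass could look like. The class names, fields, and `switch_to_stage` method here are hypothetical and chosen for illustration; they are not taken from the follow-up PR. The idea is that a single optional object replaces several separately nullable fields, and clearing it releases the callback together with the post-compile data.

```python
from collections import deque
from concurrent.futures import Future
from dataclasses import dataclass
from typing import Any, Callable, Deque, Optional


@dataclass
class ProgressionState:  # hypothetical name and fields
    progression_futures: Deque[Future]
    callback: Callable[[Any], None]
    post_compile_data: Any


class ProgressiveOutput:  # hypothetical holder class
    def __init__(self, state: ProgressionState) -> None:
        self._state: Optional[ProgressionState] = state

    def switch_to_stage(self, optimized: Any, is_final: bool) -> None:
        state = self._state
        if state is None:  # one nullability check instead of one per field
            return
        state.callback(optimized)
        if is_final:
            # Drops the futures, the callback, and the post-compile data together.
            self._state = None
```

For example, constructing `ProgressiveOutput(ProgressionState(deque(), print, {}))` and calling `switch_to_stage(code, is_final=True)` invokes the callback once and then leaves nothing behind to clear.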

torch/_inductor/compile_fx_async.py

aorenste · 2025-07-03T19:02:15Z

torch/_inductor/compile_fx_async.py

+        graph_kwargs: _CompileFxKwargs,
+    ) -> None:
+        if self._optimized_output_code is not None:
+            self._optimized_output_code.post_compile(


Hm. What if there's still a future pending? This should probably be based on pending futures and not fast vs optimized.

Actually, I fixed this to be easier to reason about:

The first post compile will always run on the fast output code, and that will populate _post_compile_data.

Only after _post_compile_data is populated will we begin the progressions.
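The ordering described here can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: `ProgressiveSketch`, `maybe_progress`, and the use of strings as "output code" are all hypothetical.

```python
from concurrent.futures import Future
from typing import Any, List, Optional


class ProgressiveSketch:  # hypothetical stand-in for the progressive output code
    def __init__(self, fast_code: str, futures: List[Future]) -> None:
        self.active = fast_code  # start on the fast output code
        self._futures = futures
        self._post_compile_data: Optional[Any] = None

    def post_compile(self, data: Any) -> None:
        # The first post compile targets the fast output code; recording its
        # inputs lets later stages be post-compiled with the same data.
        self._post_compile_data = data

    def maybe_progress(self) -> bool:
        # Progressions are gated on post-compile data being populated.
        if self._post_compile_data is None:
            return False
        for fut in reversed(self._futures):  # prefer the most optimized stage
            if fut.done():
                self.active = fut.result()
                return True
        return False
```

Even if an optimized future finishes early, `maybe_progress` keeps the fast code active until `post_compile` has run once.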

torch/_inductor/compile_fx_async.py

[ghstack-poisoned]

ghstack-source-id: 3bf6895 Pull-Request-resolved: #157305

[ghstack-poisoned]

ghstack-source-id: abbc65b Pull-Request-resolved: #157305

aorenste · 2025-07-03T20:11:44Z

torch/_inductor/compile_fx_async.py

+        constants: CompiledFxGraphConstants,
+        graph_kwargs: _CompileFxKwargs,
+    ) -> None:
+        assert self._fast_output_code is not None


This worries me - but with an assert at least it will be obvious if it fails...

bobrenjc93 · 2025-07-04T04:11:18Z

@pytorchbot merge

pytorchmergebot · 2025-07-04T04:13:08Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


Pull Request resolved: #157614 Approved by: https://github.com/aorenste ghstack dependencies: #157305

@aorenste

Follow-up from #157305, where @aorenste correctly suggested clearing the callback. This refactor introduces a new dataclass so we don't need to check nullability for each field.

Pull Request resolved: #157619 Approved by: https://github.com/aorenste ghstack dependencies: #157305, #157614

Pull Request resolved: #157650 Approved by: https://github.com/aorenste ghstack dependencies: #157305, #157614, #157619

Update

2f3688e

[ghstack-poisoned]

pytorch-bot bot added ciflow/inductor module: inductor labels Jun 30, 2025

Update

1364df6

[ghstack-poisoned]

bobrenjc93 added a commit that referenced this pull request Jul 1, 2025

[br][pc] attempt 1

f0150dd

ghstack-source-id: 03ce1e8 Pull-Request-resolved: #157305

Update

31ab656

[ghstack-poisoned]

bobrenjc93 added a commit that referenced this pull request Jul 1, 2025

[br][pc] attempt 1

9600283

ghstack-source-id: b4e227e Pull-Request-resolved: #157305

bobrenjc93 changed the title ~~[br][pc] attempt 1~~ [async-compile] add progressive compile mode Jul 1, 2025

bobrenjc93 added the topic: not user facing label Jul 1, 2025

bobrenjc93 commented Jul 1, 2025

test/inductor/test_compile_subprocess.py (outdated, resolved)

bobrenjc93 marked this pull request as ready for review July 1, 2025 20:36

bobrenjc93 requested a review from aorenste July 1, 2025 20:48

aorenste reviewed Jul 1, 2025

This was referenced Jul 2, 2025

comments #157424

Closed

[wip] inspect output code #157508

Closed

[wip] merge async and progressive #157510

Closed

Update

8fbda6f

[ghstack-poisoned]

aorenste reviewed Jul 3, 2025

Update

73a7ef0

[ghstack-poisoned]

bobrenjc93 added a commit that referenced this pull request Jul 3, 2025

[br][pc] attempt 1

a5b24ef

ghstack-source-id: 3bf6895 Pull-Request-resolved: #157305

Update

5392190

[ghstack-poisoned]

bobrenjc93 added a commit that referenced this pull request Jul 3, 2025

[br][pc] attempt 1

75803eb

ghstack-source-id: abbc65b Pull-Request-resolved: #157305

aorenste approved these changes Jul 3, 2025

bobrenjc93 added the ciflow/trunk label Jul 3, 2025

pytorchmergebot added the merging label Jul 4, 2025

pytorchmergebot closed this in d58ed04 Jul 4, 2025

pytorchmergebot added Merged and removed merging labels Jul 4, 2025

sawaraken bot mentioned this pull request Jul 4, 2025

PyTorch Introduces Progressive Compilation Mode for Asynchronous Compilation / PyTorch、非同期コンパイルにプログレッシブコンパイルモードを追加 xhiroga/news#854

Open

bobrenjc93 mentioned this pull request Jul 4, 2025

[pc] introduce ProgressiveCompilationState and clear callback #157619

Closed

pytorchmergebot pushed a commit that referenced this pull request Jul 5, 2025

[pc] migrate progression futures from list to deque (#157614)

5ea832e

Pull Request resolved: #157614 Approved by: https://github.com/aorenste ghstack dependencies: #157305

pytorchmergebot pushed a commit that referenced this pull request Jul 5, 2025

[pc] verify max autotune is in generated source code (#157650)

2471cc3

Pull Request resolved: #157650 Approved by: https://github.com/aorenste ghstack dependencies: #157305, #157614, #157619

This was referenced Jul 5, 2025

[pruning] feat : Taylor expansion unstructured pruning #157620

Closed

[pruning] add more test cases for pruning #157613

Closed

github-actions bot deleted the gh/bobrenjc93/504/head branch August 4, 2025 02:21
