Add pre_grad_pass_timing config for early vs late pre-grad passes by frgossen · Pull Request #177429 · pytorch/pytorch

frgossen · 2026-03-13T22:03:44Z

Stack from ghstack (oldest at bottom):

Add a pre_grad_pass_timing config ("early", "late", or "default") that
controls when pre-grad passes run relative to the AOT autograd cache lookup.

- "early": passes run before cache lookup, so they execute on every compile
  (including cache hits) and the cache key reflects the already-transformed
  graph.
- "late": passes run after cache lookup (only on cache miss); requires
  custom passes to provide a UUID for the cache key.
- "default": automatically resolves to "late" when possible (no custom pass,
  or a custom pass with a UUID), and falls back to "early" when the custom
  pass has no UUID.
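
To make the difference between "early" and "late" concrete, here is a toy simulation of the two flows. It is only a sketch: `compile_graph`, `strip_noops`, `count_pass_calls`, and the dict-based cache are hypothetical stand-ins, not the actual AOT autograd code, and the real cache key is built from much more than the graph.

```python
from typing import Callable, Dict, Tuple

Graph = Tuple[str, ...]  # toy "graph": just a tuple of op names

pass_calls = {"n": 0}  # counts how often the custom pass runs


def strip_noops(graph: Graph) -> Graph:
    """Hypothetical pre-grad pass: drops "noop" nodes."""
    pass_calls["n"] += 1
    return tuple(op for op in graph if op != "noop")


def compile_graph(
    graph: Graph,
    custom_pass: Callable[[Graph], Graph],
    timing: str,
    cache: Dict[Graph, str],
) -> str:
    """Toy compile flow; `cache` stands in for the AOT autograd cache."""
    if timing == "early":
        # Pass runs before the lookup, so the key is the transformed graph.
        graph = custom_pass(graph)
    key = graph
    if key in cache:
        return cache[key]  # cache hit: a "late" pass never runs here
    if timing == "late":
        # Pass runs only on a cache miss.
        graph = custom_pass(graph)
    artifact = "compiled:" + "+".join(graph)
    cache[key] = artifact
    return artifact


def count_pass_calls(timing: str, compiles: int = 3) -> int:
    """Compile the same graph several times and count pass invocations."""
    pass_calls["n"] = 0
    cache: Dict[Graph, str] = {}
    for _ in range(compiles):
        compile_graph(("add", "noop", "mul"), strip_noops, timing, cache)
    return pass_calls["n"]
```

With three compiles of the same graph, `count_pass_calls("early")` counts one pass invocation per compile, while `count_pass_calls("late")` counts a single invocation on the initial cache miss.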

Explicitly setting "late" with a UUID-less custom pass now raises a
RuntimeError instead of silently bypassing the cache. The existing
test_pre_grad_passes_called_on_cache_miss_only test is renamed and
pinned to "late" timing, and new tests cover early timing, both default
timing branches, and the error case.
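
The resolution rules described above can be sketched as a small pure-Python function. This is an illustration under assumptions, not the code in this PR: `ToyPass` and `resolve_timing` are hypothetical names, a pass is modeled as an object with an optional `uuid` attribute, and the `ValueError` for unknown values is my own addition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyPass:
    """Hypothetical custom pre-grad pass; uuid=None means "no UUID"."""
    uuid: Optional[str] = None


def resolve_timing(timing: str, custom_pass: Optional[ToyPass]) -> str:
    """Resolve "default" to "early" or "late" following the PR description."""
    if timing not in ("early", "late", "default"):
        # Assumption: the real config validation may behave differently.
        raise ValueError(f"unknown pre_grad_pass_timing: {timing!r}")

    # "late" needs something to put in the cache key: either there is no
    # custom pass at all, or the pass provides a UUID.
    late_is_possible = custom_pass is None or custom_pass.uuid is not None

    if timing == "default":
        return "late" if late_is_possible else "early"
    if timing == "late" and not late_is_possible:
        # Explicit "late" with a UUID-less pass is an error rather than a
        # silent cache bypass.
        raise RuntimeError('"late" timing requires the custom pass to provide a UUID')
    return timing
```

For example, `resolve_timing("default", ToyPass())` falls back to "early", while `resolve_timing("late", ToyPass())` raises a `RuntimeError`.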

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @jataylo @Lucaskabela

Allow pre-grad passes to run before the AOT autograd cache lookup ("early") instead of only after it on cache miss ("late", the default). With "early" timing the cache key reflects the already-transformed graph and passes always execute, even on cache hits. [ghstack-poisoned]

pytorch-bot · 2026-03-13T22:03:49Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/177429

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 1 Pending

As of commit 67ad41f with merge base 6a461fe:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorch-bot · 2026-03-13T22:03:53Z

This PR needs a `release notes:` label

If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.


aorenste

Is it missing a test for caching "early" + UUID?

Maybe minor/not worth changing: After this change, won't an "early" pass's UUID be part of the cache key? That means that if a pass ends up being a no-op, it can't share a cache entry with a run without that pass. But not including the UUID is actually safe because the cache key is based on the pre-cache graph structure.

torch/_inductor/config.py


frgossen · 2026-03-17T16:12:31Z

> Is it missing a test for caching "early" + UUID?
>
> Maybe minor/not worth changing: After this change, won't an "early" pass's UUID be part of the cache key? That means that if a pass ends up being a no-op, it can't share a cache entry with a run without that pass. But not including the UUID is actually safe because the cache key is based on the pre-cache graph structure.

Good point. I put the pre-grad cache key contribution behind the timing config.
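
One way to read "behind the timing config" is sketched below: the pass UUID feeds the cache key only for "late" timing, because with "early" timing the key is already computed from the transformed graph. This is a hedged illustration, not the change made in torch/_inductor/codecache.py; `toy_cache_key` and its string-based graph representation are hypothetical.

```python
import hashlib
from typing import Optional


def toy_cache_key(graph_repr: str, timing: str, pass_uuid: Optional[str]) -> str:
    """Hypothetical cache key: hash of the graph, plus the pass UUID if "late"."""
    h = hashlib.sha256()
    h.update(graph_repr.encode())
    if timing == "late" and pass_uuid is not None:
        # The pass has not run yet at lookup time, so its identity must be
        # part of the key to avoid reusing an entry built with another pass.
        h.update(b"\0pre_grad_pass:")
        h.update(pass_uuid.encode())
    # For "early", graph_repr is assumed to be the already-transformed graph,
    # so a no-op pass can share an entry with a run that has no pass at all.
    return h.hexdigest()
```

Under this sketch, `toy_cache_key(g, "early", "u1")` equals `toy_cache_key(g, "early", None)` for the same `g`, whereas the two "late" keys differ.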


frgossen · 2026-03-18T00:57:20Z

@pytorchbot merge

pytorchmergebot · 2026-03-18T00:59:40Z

Merge failed

Reason: This PR needs a release notes: label
If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.




torch/_functorch/aot_autograd.py

torch/_inductor/codecache.py

zou3519

lgtm but please read comments


frgossen · 2026-03-18T23:57:16Z

@pytorchbot merge

pytorchmergebot · 2026-03-18T23:59:18Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

…torch#177429)

Add a pre_grad_pass_timing config ("early", "late", or "default") that controls when pre-grad passes run relative to the AOT autograd cache lookup.

- "early": passes run before cache lookup, so they execute on every compile (including cache hits) and the cache key reflects the already-transformed graph.
- "late": passes run after cache lookup (only on cache miss); requires custom passes to provide a UUID for the cache key.
- "default": automatically resolves to "late" when possible (no custom pass, or a custom pass with a UUID), and falls back to "early" when the custom pass has no UUID.

Explicitly setting "late" with a UUID-less custom pass now raises a RuntimeError instead of silently bypassing the cache. The existing test_pre_grad_passes_called_on_cache_miss_only test is renamed and pinned to "late" timing, and new tests cover early timing, both default timing branches, and the error case.

Pull Request resolved: pytorch#177429
Approved by: https://github.com/aorenste, https://github.com/zou3519
ghstack dependencies: pytorch#177397, pytorch#177403, pytorch#177428

frgossen requested review from aorenste and bdhirsh as code owners March 13, 2026 22:03

This was referenced Mar 13, 2026

Add cross-process AOT autograd cache hit test #177397

Closed

Remove unused static_inputs_log from aot_autograd.py #177428

Closed

Add UUID-based cache key support for pre-grad custom passes #177403

Closed

pytorch-bot bot added the ciflow/inductor, ciflow/torchtitan, module: dynamo, and module: inductor labels Mar 13, 2026

frgossen requested review from zou3519 and removed request for bdhirsh March 13, 2026 22:04

aorenste approved these changes Mar 16, 2026

torch/_inductor/config.py (outdated, resolved)

frgossen mentioned this pull request Mar 17, 2026

Add DEFAULT pre_grad_pass_timing that auto-selects early vs late #177664

Closed

frgossen added the module: aotdispatch label Mar 17, 2026

frgossen mentioned this pull request Mar 17, 2026

Default pre_grad_pass_timing to "late" in fbcode #177690

Open

pytorch-bot bot added the ciflow/trunk label Mar 18, 2026

pytorchmergebot added the merging label Mar 18, 2026

pytorchmergebot removed the merging label Mar 18, 2026

frgossen mentioned this pull request Mar 18, 2026

Consolidate pre-grad pass timing tests into parametrized test #177731

Open

This was referenced Mar 18, 2026

Parametrize cross-process pre-grad custom pass cache test #177733

Open

Add assertion that custom pre-grad pass modifies the graph #177736

Open

frgossen added the release notes: aotdispatcher label Mar 18, 2026

zou3519 reviewed Mar 18, 2026

torch/_functorch/aot_autograd.py (outdated, resolved)

zou3519 reviewed Mar 18, 2026

torch/_inductor/codecache.py (outdated, resolved)

zou3519 approved these changes Mar 18, 2026


frgossen added 2 commits March 18, 2026 12:31

pytorchmergebot added the merging label Mar 18, 2026

pytorchmergebot added the Merged label Mar 18, 2026

pytorchmergebot closed this in 2a90511 Mar 18, 2026

pytorchmergebot removed the merging label Mar 19, 2026

frgossen wants to merge 12 commits into gh/frgossen/12/base from gh/frgossen/12/head
