Add cross-process AOT autograd cache hit test #177397

Closed
frgossen wants to merge 5 commits into gh/frgossen/9/base from gh/frgossen/9/head

Conversation

Test that the AOT autograd cache persists across separate processes by
running two subprocesses with a shared cache directory and verifying the
second one gets a cache hit.

[ghstack-poisoned]
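The pattern this test exercises can be sketched without PyTorch by using a stand-in disk cache in place of the real AOT autograd cache (the actual test would instead point both subprocesses at a shared cache directory, e.g. via `TORCHINDUCTOR_CACHE_DIR`, and compile with `torch.compile`; everything below is an illustrative stand-in, not the PR's test code):

```python
# Stand-in sketch: two subprocesses share a cache directory; the first
# populates it, the second observes a hit. The "compile" step is faked
# with a file keyed on the function source.
import os
import subprocess
import sys
import tempfile

# Child script: checks the shared cache directory for an entry keyed on
# the (stand-in) compiled source, writing it on a miss.
CHILD = """
import hashlib, os
cache_dir = os.environ["STANDIN_CACHE_DIR"]
key = hashlib.sha256(b"def f(x): return x + 1").hexdigest()
path = os.path.join(cache_dir, key)
if os.path.exists(path):
    print("cache hit")
else:
    with open(path, "w") as fh:
        fh.write("compiled artifact")
    print("cache miss")
"""

def run_child(cache_dir: str) -> str:
    env = dict(os.environ, STANDIN_CACHE_DIR=cache_dir)
    out = subprocess.run([sys.executable, "-c", CHILD],
                         env=env, capture_output=True, text=True, check=True)
    return out.stdout.strip()

with tempfile.TemporaryDirectory() as cache_dir:
    first = run_child(cache_dir)   # fresh directory: populates the cache
    second = run_child(cache_dir)  # same directory, new process: hits
    print(first)   # cache miss
    print(second)  # cache hit
```

Running the check in separate processes is what distinguishes this from an in-memory cache test: only state that actually persisted to the shared directory can produce the second hit.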

pytorch-bot bot commented Mar 13, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/177397

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 7a1e4f8 with merge base 6a461fe:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

frgossen added a commit that referenced this pull request Mar 13, 2026

ghstack-source-id: d69ed07
Pull Request resolved: #177397
@frgossen frgossen requested review from aorenste and zou3519 March 13, 2026 17:03
cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx kadeng chauhang amjames Lucaskabela jataylo
frgossen added a commit that referenced this pull request Mar 13, 2026
ghstack-source-id: dda4c48
Pull Request resolved: #177397
@Lucaskabela Lucaskabela requested a review from zhxchen17 March 13, 2026 22:24
Lucaskabela (Contributor) commented:

Adding @zhxchen17 as well since he does quite a bit of work on AOT Autograd caching

@frgossen frgossen added the module: aotdispatch label and removed the topic: not user facing label Mar 17, 2026

pytorch-bot bot commented Mar 18, 2026

This PR needs a release notes: label

If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@frgossen frgossen added the topic: not user facing label Mar 18, 2026
pytorchmergebot (Collaborator) commented:

Starting merge as part of PR stack under #177428

frgossen (Author) commented:

@pytorchbot merge -i

@pytorch-bot pytorch-bot bot added the ciflow/trunk label Mar 18, 2026
pytorchmergebot (Collaborator) commented:

Merge started

Your change will be merged while ignoring the following 1 checks: pull / linux-jammy-cpu-py3.10-gcc11-bazel-test / build-and-test (default, 1, 1, lf.linux.4xlarge)

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status.

pytorchmergebot pushed a commit that referenced this pull request Mar 18, 2026
pre_grad_custom_pass was the only custom pass config without UUID-based
cache key integration. It was excluded from config serialization but not
handled specially via UUID extraction, so its effect was only captured
indirectly through the resulting FX graph. This meant two different
passes producing the same graph could incorrectly share a cache entry.

Align pre_grad_custom_pass with post-grad and joint passes: change its
type to CustomGraphPassType, add it to _cache_config_ignore_prefix (so
the UUID is extracted explicitly), include it in FxGraphHashDetails, and
validate it in _check_can_cache.

Pull Request resolved: #177403
Approved by: https://github.com/aorenste, https://github.com/zou3519
ghstack dependencies: #177397
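The UUID mechanism this commit aligns pre_grad_custom_pass with can be illustrated by a torch-free sketch. The real interface is `torch._inductor.custom_graph_pass.CustomGraphPass`; the class and helper below are hypothetical stand-ins, not PyTorch's actual code:

```python
# Stand-in sketch of the UUID-keyed custom pass protocol: instead of trying
# to serialize the callable, the cache key records a stable uuid() supplied
# by the pass itself. All names here are illustrative.
import hashlib

class CustomGraphPassLike:
    """Minimal protocol: __call__ transforms a graph, uuid() identifies the pass."""
    def __call__(self, graph):
        raise NotImplementedError

    def uuid(self):
        raise NotImplementedError

class FoldAddOne(CustomGraphPassLike):
    SOURCE = "fold x + 1 into constants"  # stand-in for the implementation

    def __call__(self, graph):
        pass  # would mutate the FX graph in place

    def uuid(self):
        # Derived from the implementation: two passes with different behavior
        # get different uuids, so they can no longer share a cache entry even
        # if they happen to produce the same transformed graph.
        return hashlib.sha256(self.SOURCE.encode()).hexdigest()

def cache_key_component(pass_obj):
    # What "extract the UUID explicitly" means for the cache key: include the
    # uuid, not the (unserializable) pass object itself.
    return ("pre_grad_custom_pass", None if pass_obj is None else pass_obj.uuid())
```

Keying on the uuid rather than the resulting FX graph is what closes the collision described above: the pass identity is part of the hash even when two passes happen to emit identical graphs.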
pytorchmergebot pushed a commit that referenced this pull request Mar 18, 2026
pytorchmergebot pushed a commit that referenced this pull request Mar 18, 2026
…77429)

Add a pre_grad_pass_timing config ("early", "late", or "default") that
controls when pre-grad passes run relative to the AOT autograd cache lookup.

- "early": passes run before cache lookup, so they execute on every compile
(including cache hits) and the cache key reflects the already-transformed
graph.
- "late": passes run after cache lookup (only on cache miss); requires
custom passes to provide a UUID for the cache key.
- "default": automatically resolves to "late" when possible (no custom pass,
or a custom pass with a UUID), and falls back to "early" when the custom
pass has no UUID.

Explicitly setting "late" with a UUID-less custom pass now raises a
RuntimeError instead of silently bypassing the cache. The existing
test_pre_grad_passes_called_on_cache_miss_only test is renamed and
pinned to "late" timing, and new tests cover early timing, both default
timing branches, and the error case.

Pull Request resolved: #177429
Approved by: https://github.com/aorenste, https://github.com/zou3519
ghstack dependencies: #177397, #177403, #177428
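The three timing modes and the error case amount to a small decision procedure; the following is an illustrative sketch of that documented logic, not the actual PyTorch implementation:

```python
# Hedged sketch of pre_grad_pass_timing resolution as described in the
# commit message above (hypothetical function name and signature).
def resolve_pre_grad_pass_timing(timing, custom_pass):
    has_uuid = custom_pass is not None and custom_pass.uuid() is not None
    if timing == "late":
        # Explicit "late" with a UUID-less custom pass would leave the pass
        # unaccounted for in the cache key, so it is an error instead of a
        # silent cache bypass.
        if custom_pass is not None and not has_uuid:
            raise RuntimeError(
                "pre_grad_pass_timing='late' requires the custom pass "
                "to provide a UUID for the cache key"
            )
        return "late"
    if timing == "early":
        # Passes run before cache lookup on every compile, including hits.
        return "early"
    # "default": prefer "late" (passes skipped on cache hits) whenever the
    # cache key can account for the pass; otherwise fall back to "early".
    return "late" if (custom_pass is None or has_uuid) else "early"
```

Note how "default" never raises: the fallback to "early" keeps UUID-less passes correct (they always run) at the cost of repeating their work on cache hits.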
ryanzhang22 pushed a commit to ryanzhang22/pytorch that referenced this pull request Mar 19, 2026
ryanzhang22 pushed a commit to ryanzhang22/pytorch that referenced this pull request Mar 19, 2026
ryanzhang22 pushed a commit to ryanzhang22/pytorch that referenced this pull request Mar 19, 2026
ryanzhang22 pushed a commit to ryanzhang22/pytorch that referenced this pull request Mar 19, 2026

Labels

ciflow/inductor, ciflow/trunk, Merged, module: aotdispatch, module: dynamo, topic: not user facing


5 participants