[export] avoid RecursionError in guards-fn codegen for deeply nested guards (#186993) by kqfu · Pull Request #186993 · pytorch/pytorch

kqfu · 2026-06-10T23:02:11Z

Summary:

ExportedProgram.module() builds a _guards_fn submodule that re-asserts the exported shape guards. To produce each assert's human-readable error message, _convert_guards_code_to_fn (in torch/export/_unlift.py) pretty-prints the guard via ast.unparse(ast.parse(shadow)). Both ast.parse and ast.unparse recurse once per AST node, so processing a very deeply nested guard expression can exceed Python's recursion limit and raise RecursionError. One such guard is a sum over many symbolic sizes, as produced when exporting a recommendation model with auto_dynamic_shapes over hundreds of jagged/KJT features. The error aborts the entire export (including standalone publish, which reaches this code via run_decompositions() -> module()).
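To illustrate why such guards are deep, here is a minimal sketch using a hypothetical guard shape: a flat sum of symbols `s0 ... sN` compared against a constant. The guards that export actually emits may look different. Python parses a left-associative sum into a left-nested chain of `BinOp` nodes, so AST depth grows by one per added term. The helper names `sum_guard` and `ast_depth` are made up for this example, and the depth is measured iteratively so the measurement itself does not recurse.

```python
import ast


def sum_guard(n: int) -> str:
    """Build a hypothetical guard string: s0 + s1 + ... + s(n-1) <= 1000."""
    return " + ".join(f"s{i}" for i in range(n)) + " <= 1000"


def ast_depth(node: ast.AST) -> int:
    """Return the maximum nesting depth of an AST, using an explicit stack."""
    deepest = 0
    stack = [(node, 1)]
    while stack:
        current, depth = stack.pop()
        deepest = max(deepest, depth)
        for child in ast.iter_child_nodes(current):
            stack.append((child, depth + 1))
    return deepest


# Small term counts only: this shows the growth trend without risking a crash.
depths = {n: ast_depth(ast.parse(sum_guard(n), mode="eval")) for n in (2, 10, 50)}
for n, depth in depths.items():
    print(f"{n} terms -> AST depth {depth}")
```

Under these assumptions, going from 10 to 50 terms adds exactly 40 levels of nesting, so a sum over enough features will eventually pass the interpreter's recursion limit (see `sys.getrecursionlimit()`).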

Root cause: the ast.unparse(ast.parse(...)) round-trip is purely cosmetic; as the existing comment states, it "is not necessary for correctness, just deemed desirable" -- it only normalizes redundant parentheses in the assert error string. The executed runtime check uses the separate actual expression and does not depend on the pretty-printed shadow, so a deep guard should never be fatal.
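A small sketch of what "purely cosmetic" means here, assuming a made-up guard string with redundant parentheses (the symbol names and the values in `env` are illustrative only). The round-trip changes how the string reads but not what it evaluates to.

```python
import ast

# Hypothetical guard text with redundant parentheses.
raw = "((s0) + (s1)) <= (1000)"

# The parse/unparse round-trip drops parentheses that precedence makes unnecessary.
pretty = ast.unparse(ast.parse(raw))
print(pretty)

# Both spellings evaluate identically for any binding of the symbols.
env = {"s0": 400, "s1": 700}
raw_result = eval(raw, {}, env)
pretty_result = eval(pretty, {}, env)
print(raw_result, pretty_result)
```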

Fix: wrap the normalization in try/except RecursionError and fall back to the un-normalized guard string. The emitted runtime assert is unchanged; only the readability of the guard-failure message degrades slightly in the rare deep-guard case.
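The shape of the fix can be sketched as follows. This is not the code from torch/export/_unlift.py: `normalize_guard` and `emit_assert` are hypothetical names, and the generated assert format is an assumption made for illustration. What it shows is the pattern described above, where the message falls back to the raw string while the checked expression is left alone.

```python
import ast


def normalize_guard(shadow: str) -> str:
    """Hypothetical helper: tidy a guard string for use in an error message."""
    try:
        # Cosmetic round-trip that drops redundant parentheses.
        return ast.unparse(ast.parse(shadow))
    except RecursionError:
        # Too deeply nested to pretty-print: keep the raw string instead.
        return shadow


def emit_assert(actual: str, shadow: str) -> str:
    """Hypothetical codegen step: the check uses `actual`, the message uses `shadow`."""
    return f"assert {actual}, {normalize_guard(shadow)!r}"


# Illustrative inputs; real guard code and symbol names will differ.
line = emit_assert("x_dim0 + y_dim0 <= 1000", "((s0) + (s1)) <= (1000)")
print(line)
```

Because only the message text passes through `normalize_guard`, a fallback changes the wording of a guard failure and nothing about when the guard fires.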

Test Plan:
Built a custom aps package and ran publish: f1096406197

Added test_guards_fn_recovers_from_unparse_recursion_error, which mocks ast.unparse to raise RecursionError and asserts _convert_guards_code_to_fn still returns a guards fn instead of propagating the error. A mock is used rather than a genuinely deep expression because the test target is ASAN-instrumented, where deep ast.parse/compile recursion can abort the process before the pure-Python RecursionError is reached.
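The mocking approach can be sketched in isolation. The snippet below does not call `_convert_guards_code_to_fn`. Instead it exercises a stand-in helper (`normalize_guard`, a hypothetical name) with the same try/except shape, and patches `ast.unparse` via `unittest.mock` so that no genuinely deep expression is ever parsed.

```python
import ast
from unittest import mock


def normalize_guard(shadow: str) -> str:
    """Stand-in for the normalization step (hypothetical, not the PyTorch code)."""
    try:
        return ast.unparse(ast.parse(shadow))
    except RecursionError:
        return shadow


def check_fallback():
    """Force ast.unparse to fail and report what the helper returns."""
    raw = "((s0) + (s1)) <= (1000)"
    error = RecursionError("maximum recursion depth exceeded")
    # Patch the attribute on the ast module; the helper looks it up at call time.
    with mock.patch.object(ast, "unparse", side_effect=error) as fake_unparse:
        result = normalize_guard(raw)
    return raw, result, fake_unparse.call_count


raw, result, calls = check_fallback()
print(result == raw, calls)
```

Simulating the failure keeps the test deterministic across interpreters and build modes, since it does not depend on where the real recursion limit happens to be hit.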

buck2 test fbcode//caffe2/test:test_export -- --regex 'test_guards_fn_recovers_from_unparse_recursion_error'

After the fix: "Pass 11. Fail 0. Fatal 0." The test is fanned out across export modes (strict, nonstrict, serdes, retraceability, cpp_serdes, training_ir, nativert, ...). Before the fix, the same test fails with "RecursionError: maximum recursion depth exceeded" at _unlift.py (Pass 0, Fail 11).

Authored with the assistance of an AI coding assistant.

Reviewed By: jijunyan, sophielin508

Differential Revision: D108111211

pytorch-bot · 2026-06-10T23:02:16Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/186993

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 6 Pending

As of commit 029b577 with merge base 1979118:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-codesync · 2026-06-10T23:02:20Z

@kqfu has exported this pull request. If you are a Meta employee, you can view the originating Diff in D108111211.


jijunyan

LGTM

meta-codesync · 2026-06-12T01:16:00Z

@pytorchbot merge

(Initiating merge automatically since Phabricator Diff has merged)

pytorchmergebot · 2026-06-12T01:18:24Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

kqfu requested review from angelayi, avikchaudhuri, tugsbayasgalan, ydwu4 and zhxchen17 as code owners June 10, 2026 23:02

pytorch-bot Bot added the release notes: export label Jun 10, 2026

meta-codesync Bot added the meta-exported label Jun 10, 2026

meta-codesync Bot changed the title ~~[export] avoid RecursionError in guards-fn codegen for deeply nested guards~~ [export] avoid RecursionError in guards-fn codegen for deeply nested guards (#186993) Jun 11, 2026

kqfu force-pushed the export-D108111211 branch from 65088e4 to 33c24f4 Compare June 11, 2026 16:58

kqfu force-pushed the export-D108111211 branch from 33c24f4 to 029b577 Compare June 11, 2026 16:59

jijunyan approved these changes Jun 11, 2026


pytorch-bot Bot added the ciflow/trunk Trigger trunk jobs on your pull request label Jun 11, 2026

pytorchmergebot added the merging label Jun 12, 2026

pytorchmergebot closed this in 083e261 Jun 12, 2026

pytorchmergebot added Merged and removed merging labels Jun 12, 2026

kqfu wants to merge 1 commit into pytorch:main from kqfu:export-D108111211
