[dynamo] disable dynamo recursively on compiled code if fullgraph=True using eval frame overrides#173080

Closed
williamwen42 wants to merge 5 commits into gh/williamwen42/381/base from gh/williamwen42/381/head

@williamwen42
Member

@williamwen42 williamwen42 commented Jan 22, 2026

Stack from ghstack (oldest at bottom):

This is attempt #2 at #172295.

Instead of using frame/CacheEntry recursive actions, we override the recursive eval_frame callback in eval_frame_cpp.cpp while custom (compiled) code is running. The override is set in eval_frame.py only if fullgraph=True. Right now, the possible overrides are:

  • skip tracing frames
  • error if tracing is successful. We can't just error on entry, since convert_frame might end up skipping the frame, which should be allowed. It might be simpler, though, to patch the tracing entrypoint (e.g. InstructionTranslatorBase) so that it raises an error.

Currently, the default behavior is the latter, since we should fail loudly if torch.compile is unintentionally re-invoked when fullgraph=True.
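
To make the two override modes concrete, here is a minimal pure-Python sketch of the idea. It is a toy model, not Dynamo's implementation: `ToyEvalFrame`, `Override`, `running_compiled_code`, and `toy_tracer` are hypothetical names invented for illustration, and the real logic lives in C++ in eval_frame_cpp.cpp. The sketch only shows why ERROR is raised after a successful trace rather than on entry: a frame that the tracer skips must still be allowed.

```python
from contextlib import contextmanager
from enum import Enum, auto
from typing import Callable, Iterator, Optional


class Override(Enum):  # hypothetical stand-in for the PR's override modes
    SKIP = auto()
    ERROR = auto()


class ToyEvalFrame:
    """Toy model of a frame-evaluation hook; not Dynamo's real machinery."""

    def __init__(self, tracer: Callable[[str], Optional[str]]) -> None:
        # The tracer returns "compiled code", or None if it skips the frame.
        self.tracer = tracer
        self.override: Optional[Override] = None

    @contextmanager
    def running_compiled_code(self, override: Override) -> Iterator[None]:
        # Install the override only while the compiled region is running,
        # and restore the previous state afterwards.
        prev, self.override = self.override, override
        try:
            yield
        finally:
            self.override = prev

    def on_frame(self, frame_name: str) -> Optional[str]:
        if self.override is Override.SKIP:
            return None  # never trace nested frames
        result = self.tracer(frame_name)
        if self.override is Override.ERROR and result is not None:
            # Error only after a trace succeeds; a skipped frame is allowed.
            raise RuntimeError(
                f"unexpected trace of {frame_name!r} inside fullgraph code"
            )
        return result


def toy_tracer(frame_name: str) -> Optional[str]:
    # Assumption for the demo: frames starting with "_" are skipped.
    return None if frame_name.startswith("_") else f"compiled:{frame_name}"
```

Outside the context manager the hook traces normally. Inside it, SKIP suppresses tracing entirely, while ERROR lets skipped frames through and raises only when a nested trace actually produces a result.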

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @kadeng @chauhang @amjames @Lucaskabela @jataylo @mlazos

…e using eval frame overrides

[ghstack-poisoned]
@pytorch-bot

pytorch-bot bot commented Jan 22, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/173080

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

✅ You can merge normally! (2 Unrelated Failures)

As of commit 745f7ae with merge base e5541c2 (image):

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

williamwen42 added a commit that referenced this pull request Jan 22, 2026
…e using eval frame overrides

ghstack-source-id: 551e7ce
Pull Request resolved: #173080
…llgraph=True using eval frame overrides"


[ghstack-poisoned]
williamwen42 added a commit that referenced this pull request Jan 22, 2026
…e using eval frame overrides

ghstack-source-id: 72881fe
Pull Request resolved: #173080
…llgraph=True using eval frame overrides"


[ghstack-poisoned]
williamwen42 added a commit that referenced this pull request Jan 22, 2026
…e using eval frame overrides

ghstack-source-id: 23e9629
Pull Request resolved: #173080
…llgraph=True using eval frame overrides"


[ghstack-poisoned]
williamwen42 added a commit that referenced this pull request Jan 23, 2026
…e using eval frame overrides

ghstack-source-id: 2ff4d04
Pull Request resolved: #173080
def _get_eval_frame_override() -> _EvalFrameOverride:
    if torch._dynamo.config.error_on_dynamo_callback_in_fullgraph_compiled_code:
        return _EvalFrameOverride.ERROR
    return _EvalFrameOverride.SKIP
Contributor

@anijain2305 anijain2305 Jan 28, 2026

As discussed, the default should be SKIP.

Arguably, ERROR should be the default, because if Dynamo traces its own post-graph bytecode, that should be categorized as a bug.

In practice, though, there may be post-graph bytecode out there that would suddenly be categorized as a bug even though it is perfectly fine from a runtime perspective (Dynamo traces it, finds nothing useful, and the resulting bytecode is still OK).

Because this is a BC-ish break (I am not sure whether it counts as BC-breaking), I think SKIP is the better default.
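
The effect of that default can be sketched without importing torch. This is an illustration under stated assumptions: `EvalFrameOverride` and `ToyConfig` are hypothetical stand-ins for the PR's `_EvalFrameOverride` and for `torch._dynamo.config`, the enum values are invented, and the flag is assumed to default to False, which is what a SKIP default implies.

```python
from dataclasses import dataclass
from enum import Enum


class EvalFrameOverride(Enum):  # stand-in for the PR's _EvalFrameOverride
    SKIP = "skip"    # member values are assumptions, chosen for readability
    ERROR = "error"


@dataclass
class ToyConfig:  # stand-in for torch._dynamo.config
    # Assumed to default to False so that SKIP is chosen unless the user opts in.
    error_on_dynamo_callback_in_fullgraph_compiled_code: bool = False


def get_eval_frame_override(config: ToyConfig) -> EvalFrameOverride:
    # Mirrors the shape of the reviewed snippet: ERROR is opt-in, SKIP otherwise.
    if config.error_on_dynamo_callback_in_fullgraph_compiled_code:
        return EvalFrameOverride.ERROR
    return EvalFrameOverride.SKIP
```

With this shape, existing code with harmless post-graph bytecode keeps working under the default, and users who want the loud failure opt in by setting the flag.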

…llgraph=True using eval frame overrides"


[ghstack-poisoned]
williamwen42 added a commit that referenced this pull request Jan 28, 2026
…e using eval frame overrides

ghstack-source-id: 921bc09
Pull Request resolved: #173080
@williamwen42 williamwen42 requested a review from ezyang January 28, 2026 18:56
@williamwen42
Member Author

fyi @ezyang dynamo will now skip tracing recursively when running fullgraph=True compiled code, with the option to error out if a trace is attempted.

@williamwen42
Member Author

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Feb 2, 2026
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

radeksm pushed a commit to radeksm/pytorch that referenced this pull request Feb 20, 2026
…e using eval frame overrides (pytorch#173080)

Pull Request resolved: pytorch#173080
Approved by: https://github.com/anijain2305
libohao1201 pushed a commit to libohao1201/pytorch that referenced this pull request Mar 2, 2026
…e using eval frame overrides (pytorch#173080)

Pull Request resolved: pytorch#173080
Approved by: https://github.com/anijain2305
@github-actions github-actions bot deleted the gh/williamwen42/381/head branch March 5, 2026 02:21