[Dynamo] Imporve-graph-break-skip-logs#167067
[Dynamo] Imporve-graph-break-skip-logs#167067parsshar-RH wants to merge 5 commits intopytorch:mainfrom
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/167067
Note: Links to docs will display an error until the docs builds have been completed. ✅ You can merge normally! (1 Unrelated Failure)As of commit dd201aa with merge base c940b1f ( UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
|
cc @williamwen42 to help take a look at this |
williamwen42
left a comment
There was a problem hiding this comment.
Thanks for submitting! I made a few comments.
test/dynamo/test_error_messages.py
Outdated
| ) | ||
|
|
||
| @make_logging_test(graph_breaks=True) | ||
| def test_skipped_frame_no_verbose_traceback(self, records): |
There was a problem hiding this comment.
How does this test differ from test_assert_failure_in_generic_ctx_mgr?
test/dynamo/test_error_messages.py
Outdated
| def test_skipped_frame_no_verbose_traceback(self, records): | ||
| def fn(x): | ||
| with GenericCtxMgr(): | ||
| assert x is None |
There was a problem hiding this comment.
Could we just do a regular graph break here? Since there's some special handling of asserts that causes such a test to possibly fail internally (see comment above test_assert_failure_in_generic_ctx_mgr)
test/dynamo/test_error_messages.py
Outdated
|
|
||
| self.assertEqual(len(records), 1) | ||
|
|
||
| message = records[0].getMessage() |
There was a problem hiding this comment.
Prefer to use assertExpectedInline[Munged] here in order to more clearly see what the error message looks like (otherwise, can you provide a reason why this would not be a good idea?)
torch/_dynamo/convert_frame.py
Outdated
| if config.verbose: | ||
| graph_break_log.debug( | ||
| user_stack_trace, | ||
| exc_info=True, | ||
| stack_info=True, | ||
| ) | ||
| else: | ||
| graph_break_log.debug( | ||
| user_stack_trace, | ||
| exc_info=True, | ||
| stack_info=False, | ||
| ) |
There was a problem hiding this comment.
| if config.verbose: | |
| graph_break_log.debug( | |
| user_stack_trace, | |
| exc_info=True, | |
| stack_info=True, | |
| ) | |
| else: | |
| graph_break_log.debug( | |
| user_stack_trace, | |
| exc_info=True, | |
| stack_info=False, | |
| ) | |
| graph_break_log.debug( | |
| user_stack_trace, | |
| exc_info=True, | |
| stack_info=config.verbose, | |
| ) |
torch/_dynamo/symbolic_convert.py
Outdated
| frame_info = ( | ||
| f"{getattr(self.f_code, 'co_name', '<unknown>')} " | ||
| f"({getattr(self.f_code, 'co_filename', '<unknown>')} " | ||
| f"line {getattr(self.f_code, 'co_firstlineno', 0)})" | ||
| ) | ||
| msg = ( | ||
| "Skipping frame because there is a graph break in a for/while loop\n" | ||
| f"torch.compile intentionally decided to skip the frame {frame_info} and fall back to eager.\n" | ||
| f"Reason: Skipping frame because there is a graph break in a for/while loop.\n" |
There was a problem hiding this comment.
Can we refactor this code (and the above) into one function?
torch/_dynamo/symbolic_convert.py
Outdated
| f"{getattr(self.f_code, 'co_name', '<unknown>')} " | ||
| f"({getattr(self.f_code, 'co_filename', '<unknown>')} " | ||
| f"line {getattr(self.f_code, 'co_firstlineno', 0)})" |
There was a problem hiding this comment.
I'm seeing this code appear a lot - can we move it to a single function in exc.py?
|
|
||
| result = torch.compile(f2, backend="eager")(torch.randn(3)) | ||
| expected = f2(torch.randn(3)) | ||
| self.assertEqual(result.shape, expected.shape) |
There was a problem hiding this comment.
Can you check for UX here (i.e. using assertExpectedInline[Munged])? And same for the added cases below.
|
Thank you for the detailed review and valuable suggestions. Thanks again for your time and feedback. |
|
I have implemented the suggested changes. Thanks! |
torch/_dynamo/utils.py
Outdated
| functorch_subclass_name = re.sub(r"\(.*", "", repr(val)) | ||
| raise SkipFrame( | ||
| f"torch.compile cannot be run in context: {functorch_subclass_name}" | ||
| f"torch.compile intentionally decided to skip the frame and fall back to eager.\n" |
There was a problem hiding this comment.
Use format_skip_frame_message?
torch/_dynamo/variables/functions.py
Outdated
| raise SkipFrame( | ||
| f"Skip frame due to `torch._dynamo.skip_frame()`. Message: {skip_frame_msg}" | ||
| frame_info = format_frame_info(tx.f_code) | ||
| msg = ( |
There was a problem hiding this comment.
Use format_skip_frame_message?
torch/_dynamo/exc.py
Outdated
| def format_skip_frame_message(code: types.CodeType, reason: str) -> str: | ||
| frame_info = format_frame_info(code) | ||
| return ( | ||
| f"torch.compile intentionally decided to skip the frame {frame_info} and fall back to eager.\n" |
There was a problem hiding this comment.
I don't see a test that checks for this string - I think this might not be logged to the graph_breaks logging artifact (I believe it is logged to the torch._dynamo logger?). Can you make sure this is covered by the tests you added?
|
I have implemented the suggested changes. Thanks! |
|
@williamwen42 Thanks in advance! |
williamwen42
left a comment
There was a problem hiding this comment.
The lint error looks real - can you move the nested graph break tests you added to test_error_messages?
|
I have moved the nested graph break tests to test_error_messages.py |
|
Lint error is resolved. Thanks |
|
@williamwen42 |
|
@pytorchbot merge |
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
Merge failedReason: 1 jobs have failed, first few of them are: trunk / linux-jammy-cuda12.8-py3.10-gcc11 / test (default, 4, 5, lf.linux.g6.4xlarge.experimental.nvidia.gpu) Details for Dev Infra teamRaised by workflow job |
|
@pytorchbot rebase |
|
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here |
|
Successfully rebased |
683d9ec to
dd201aa
Compare
|
@pytorchbot merge |
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
Merge failedReason: 1 mandatory check(s) failed. The first few are: Dig deeper by viewing the failures on hud |
|
@pytorchbot merge |
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
Summary: Fixes #150477 ### Summary: - Added frame information (function name, file, line number) to all graph break/skip messages - Standardized message format: "torch.compile will skip tracing the frame <name> (<file> line <N>) and fall back to eager. Reason: <reason>" ### Impacts: module: dynamo X-link: pytorch/pytorch#167067 Approved by: https://github.com/williamwen42 Reviewed By: jeanschmidt Differential Revision: D87036500 fbshipit-source-id: 62281bad4609b8ea3557f7139695678bed0679cb
Fixes pytorch#150477 ### Summary: - Added frame information (function name, file, line number) to all graph break/skip messages - Standardized message format: "torch.compile will skip tracing the frame <name> (<file> line <N>) and fall back to eager. Reason: <reason>" ### Impacts: module: dynamo Pull Request resolved: pytorch#167067 Approved by: https://github.com/williamwen42
Fixes #150477
Summary:
Impacts:
module: dynamo
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng @chauhang @amjames @Lucaskabela