
Dump HLO HBM usage info #7085

Merged
JackCaoG merged 4 commits into master from JackCaoG/graph_hbm_usage
May 21, 2024

Conversation

@JackCaoG (Collaborator) commented May 20, 2024

After enabling PT_XLA_DEBUG=1, one should see:

Compilation Analysis: ================================================================================
Compilation Analysis: Compilation Cause
Compilation Analysis:   mark_step in parallel loader at step end
Compilation Analysis: Graph Info: 
Compilation Analysis:   Graph Hash: aaf3f25f7c1fa61b301aa41ebec815d0
Compilation Analysis:   Number of Graph Inputs: 325
Compilation Analysis:   Number of Graph Outputs: 485
Compilation Analysis: Python Frame Triggered Execution: 
Compilation Analysis:   mark_step (/workspaces/dk3/pytorch/xla/torch_xla/core/xla_model.py:1055)
Compilation Analysis:   next (/workspaces/dk3/pytorch/xla/torch_xla/distributed/parallel_loader.py:44)
Compilation Analysis:   __next__ (/workspaces/dk3/pytorch/xla/torch_xla/distributed/parallel_loader.py:32)
Compilation Analysis:   train_loop_fn (/workspaces/dk3/pytorch/xla/examples/train_resnet_base.py:47)
Compilation Analysis:   start_training (/workspaces/dk3/pytorch/xla/examples/train_resnet_base.py:63)
Compilation Analysis:   <module> (/workspaces/dk3/pytorch/xla/examples/train_resnet_base.py:71)
Compilation Analysis: --------------------------------------------------------------------------------
Compilation Analysis: ================================================================================

Post Compilation Analysis: ================================================================================
Post Compilation Analysis: Input size: 171MB
Post Compilation Analysis: Output size: 204MB
Post Compilation Analysis: Aliased Input size: 0MB
Post Compilation Analysis: Intermediate tensor size: 8491MB
Post Compilation Analysis: Compiled program size: 69MB
Post Compilation Analysis: --------------------------------------------------------------------------------
Post Compilation Analysis: ================================================================================
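The "Post Compilation Analysis" lines are plain text, so the HBM figures can be pulled out with a few lines of parsing. A minimal sketch (the `parse_hbm_usage` helper is hypothetical, not part of torch_xla; it only assumes the `<name> size: <N>MB` line format shown above):

```python
import re

# Hypothetical helper (not part of torch_xla): extract the per-metric
# HBM sizes from a PT_XLA_DEBUG=1 "Post Compilation Analysis" dump.
SIZE_RE = re.compile(
    r"Post Compilation Analysis:\s+(?P<name>[\w ]+?) size:\s+(?P<mb>\d+)MB")

def parse_hbm_usage(dump: str) -> dict:
    """Return a {metric: MB} mapping from a PT_XLA_DEBUG=1 dump."""
    return {m["name"]: int(m["mb"]) for m in SIZE_RE.finditer(dump)}

# The example dump from this PR description.
dump = """\
Post Compilation Analysis: Input size: 171MB
Post Compilation Analysis: Output size: 204MB
Post Compilation Analysis: Aliased Input size: 0MB
Post Compilation Analysis: Intermediate tensor size: 8491MB
Post Compilation Analysis: Compiled program size: 69MB
"""

usage = parse_hbm_usage(dump)
print(usage)
```

In this example the intermediate tensors dominate HBM usage (8491MB, versus 171MB of inputs and 204MB of outputs), which is the kind of imbalance this dump is meant to surface.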

@JackCaoG (Collaborator, Author) commented:

FYI @WoosukKwon, I decided to just have PT_XLA_DEBUG=1 print this info. It would be hard to provide it on demand, since you would need to pass in a hash for the graph. Let me know if you want a different UX.
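Since the analysis is tied to the PT_XLA_DEBUG=1 flag, a minimal sketch of enabling it from Python (an assumption here: the flag should be set before torch_xla is first imported, since the debug hooks are installed at initialization; exporting it in the shell works as well):

```python
import os

# Set PT_XLA_DEBUG before the first `import torch_xla` so the debug
# hooks (and the compilation/post-compilation analysis output) are active.
os.environ["PT_XLA_DEBUG"] = "1"

# import torch_xla  # would pick up the flag here
print(os.environ["PT_XLA_DEBUG"])
```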

@JackCaoG JackCaoG requested review from will-cromar and wonjoo-wj May 21, 2024 02:17
@JackCaoG JackCaoG marked this pull request as ready for review May 21, 2024 02:17
@wonjoo-wj (Collaborator) commented:

Seems like this unit test (https://github.com/pytorch/xla/blob/master/test/debug_tool/test_mp_pt_xla_debug.py#L35) specifically checks the number of pt-xla-debug output lines, and it fails 😆

Otherwise LGTM.

@JackCaoG JackCaoG merged commit 206f1b7 into master May 21, 2024
qihqi pushed a commit that referenced this pull request May 29, 2024
