
Dump HLO HBM usage info #7085

Merged
JackCaoG merged 4 commits into master from JackCaoG/graph_hbm_usage
May 21, 2024

Conversation

@JackCaoG (Collaborator) commented May 20, 2024

After enabling PT_XLA_DEBUG=1, one should see:

Compilation Analysis: ================================================================================
Compilation Analysis: Compilation Cause
Compilation Analysis:   mark_step in parallel loader at step end
Compilation Analysis: Graph Info: 
Compilation Analysis:   Graph Hash: aaf3f25f7c1fa61b301aa41ebec815d0
Compilation Analysis:   Number of Graph Inputs: 325
Compilation Analysis:   Number of Graph Outputs: 485
Compilation Analysis: Python Frame Triggered Execution: 
Compilation Analysis:   mark_step (/workspaces/dk3/pytorch/xla/torch_xla/core/xla_model.py:1055)
Compilation Analysis:   next (/workspaces/dk3/pytorch/xla/torch_xla/distributed/parallel_loader.py:44)
Compilation Analysis:   __next__ (/workspaces/dk3/pytorch/xla/torch_xla/distributed/parallel_loader.py:32)
Compilation Analysis:   train_loop_fn (/workspaces/dk3/pytorch/xla/examples/train_resnet_base.py:47)
Compilation Analysis:   start_training (/workspaces/dk3/pytorch/xla/examples/train_resnet_base.py:63)
Compilation Analysis:   <module> (/workspaces/dk3/pytorch/xla/examples/train_resnet_base.py:71)
Compilation Analysis: --------------------------------------------------------------------------------
Compilation Analysis: ================================================================================

Post Compilation Analysis: ================================================================================
Post Compilation Analysis: Input size: 171MB
Post Compilation Analysis: Output size: 204MB
Post Compilation Analysis: Aliased Input size: 0MB
Post Compilation Analysis: Intermediate tensor size: 8491MB
Post Compilation Analysis: Compiled program size: 69MB
Post Compilation Analysis: --------------------------------------------------------------------------------
Post Compilation Analysis: ================================================================================
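The "Post Compilation Analysis" lines are plain text, so the HBM figures can be pulled out with a few lines of parsing. A minimal sketch (the `parse_hbm_usage` helper is hypothetical, not part of torch_xla; it only assumes the `<name> size: <N>MB` line format shown above):

```python
import re

# Hypothetical helper (not part of torch_xla): extract the per-metric
# HBM sizes from a PT_XLA_DEBUG=1 "Post Compilation Analysis" dump.
SIZE_RE = re.compile(
    r"Post Compilation Analysis:\s+(?P<name>[\w ]+?) size:\s+(?P<mb>\d+)MB")

def parse_hbm_usage(dump: str) -> dict:
    """Return a {metric: MB} mapping from a PT_XLA_DEBUG=1 dump."""
    return {m["name"]: int(m["mb"]) for m in SIZE_RE.finditer(dump)}

# The example dump from this PR description.
dump = """\
Post Compilation Analysis: Input size: 171MB
Post Compilation Analysis: Output size: 204MB
Post Compilation Analysis: Aliased Input size: 0MB
Post Compilation Analysis: Intermediate tensor size: 8491MB
Post Compilation Analysis: Compiled program size: 69MB
"""

usage = parse_hbm_usage(dump)
print(usage)
```

In this example the intermediate tensors dominate HBM usage (8491MB, versus 171MB of inputs and 204MB of outputs), which is the kind of imbalance this dump is meant to surface.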

@JackCaoG (Collaborator, Author) commented:

FYI @WoosukKwon, I decided to just have PT_XLA_DEBUG=1 print this info. It would be hard to provide it on demand, since you would need to pass in a hash for the graph. Let me know if you want a different UX.
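Since the analysis is tied to the PT_XLA_DEBUG=1 flag, a minimal sketch of enabling it from Python (an assumption here: the flag should be set before torch_xla is first imported, since the debug hooks are installed at initialization; exporting it in the shell works as well):

```python
import os

# Set PT_XLA_DEBUG before the first `import torch_xla` so the debug
# hooks (and the compilation/post-compilation analysis output) are active.
os.environ["PT_XLA_DEBUG"] = "1"

# import torch_xla  # would pick up the flag here
print(os.environ["PT_XLA_DEBUG"])
```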

@JackCaoG JackCaoG requested review from will-cromar and wonjoo-wj May 21, 2024 02:17
@JackCaoG JackCaoG marked this pull request as ready for review May 21, 2024 02:17
@wonjoo-wj (Collaborator) commented:

Seems like this unit test (https://github.com/pytorch/xla/blob/master/test/debug_tool/test_mp_pt_xla_debug.py#L35) specifically checks the number of pt-xla-debug output lines, and it fails 😆

Otherwise LGTM.

@JackCaoG JackCaoG merged commit 206f1b7 into master May 21, 2024
qihqi pushed a commit that referenced this pull request May 29, 2024
