[inductor] TLParse tensor metadata logging + test #160132
skarjala wants to merge 13 commits into gh/skarjala/17/base
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/160132
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (1 Unrelated Failure) As of commit 10e7567 with merge base 01bcf9a. FLAKY - The following job failed but was likely due to flakiness present on trunk.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Starting merge as part of PR stack under #160260
torch/_inductor/debug.py (outdated diff excerpt):

```python
trace_structured(
    "artifact",
    metadata_fn=lambda: {
        "name": "inductor_tlparse_tensor_meta",
```
this should be combined with the existing runtime estimates artifact
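Merging the two artifacts could mean emitting one payload with both the runtime estimates and the tensor metadata keyed per op. A minimal sketch of that merge, assuming per-op dicts as the shape of both artifacts (the helper and field names here are hypothetical, not the PR's actual implementation):

```python
import json

def merge_op_artifacts(runtime_estimates, tensor_meta):
    # Merge two per-op artifact dicts into a single payload keyed by op
    # name; overlapping ops get the union of their fields.
    # (Hypothetical helper for illustration only.)
    merged = {}
    for source in (runtime_estimates, tensor_meta):
        for op_name, fields in source.items():
            merged.setdefault(op_name, {}).update(fields)
    return merged

runtime = {"buf0": {"estimated_runtime_ns": 1200}}
meta = {"buf0": {"outputs": [{"shape": [2], "stride": [1], "dtype": "torch.float32"}]}}
payload = json.dumps(merge_op_artifacts(runtime, meta), sort_keys=True)
```

A single artifact also means tlparse consumers only have to correlate one record per op instead of joining two streams.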
test/dynamo/test_structured_trace.py (outdated):

```python
for out in op.get("outputs", []):
    self.assertIn("shape", out)
    self.assertIn("stride", out)
    self.assertIn("dtype", out)
```
please write this test using assertExpectedInline so that we can see the artifact's outputs from the test file
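`assertExpectedInline` compares a string against an expected literal embedded in the test, so the artifact needs a deterministic serialization first. A sketch of such a normalization step (the helper is illustrative, not from the PR):

```python
import json

def stable_dump(ops):
    # Deterministic serialization so the inline expected string is
    # stable across runs: sorted keys, fixed indentation.
    return json.dumps({"ops": ops}, indent=2, sort_keys=True)

ops = [{"type": "mm", "outputs": [{"shape": [2, 2], "stride": [2, 1], "dtype": "torch.float32"}]}]
# Inside the test one would then write:
# self.assertExpectedInline(stable_dump(ops), r"""...""")
```

With the expected string inline, a reviewer can see the artifact's exact contents in the test file, and `EXPECTTEST_ACCEPT=1` can regenerate it when the format changes.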
test/dynamo/test_structured_trace.py (outdated):

```python
def fn(x):
    y = x @ x
    return y + 1
```
this is insufficient test coverage. at the very minimum, we should consider 1 case of each where shapes/stride/dtype are different between input and outputs
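One way to get that coverage in a single test function: one output that changes stride (a transposed view), one that changes dtype (a cast), and one that changes shape (a reduction). A sketch with assumed input sizes, not the PR's final test:

```python
import torch

def fn(x):
    # x: contiguous (2, 3) float32 input
    y = x.t()                 # stride differs: transpose is a strided view
    z = y.to(torch.float16)   # dtype differs from the input
    w = x.sum(dim=0)          # shape differs: (2, 3) -> (3,)
    return y, z, w

x = torch.randn(2, 3)
y, z, w = fn(x)
```

Compiling `fn` should then produce an artifact where none of the logged output shapes/strides/dtypes can be trivially copied from the input's metadata.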
test/dynamo/test_structured_trace.py (outdated):

```
"shape": [
  2
],
"stride": [
  1
]
```
we should include a test with dynamic shapes to test out the to_size_hints codepath
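With dynamic shapes enabled, sizes and strides are SymInts rather than plain ints, so the logging presumably resolves them to concrete hint values before serialization. A hypothetical stand-in for what such a conversion does (the PR's real `to_size_hints` codepath is what the test should actually exercise):

```python
def to_size_hints_sketch(sizes, fallback=None):
    # Convert a mix of plain ints and symbolic sizes to int hints.
    # Hypothetical stand-in: a real SymInt would be resolved through its
    # ShapeEnv; here anything carrying a `.hint` attribute is treated as
    # symbolic.
    hints = []
    for s in sizes:
        if isinstance(s, int):
            hints.append(s)
        elif hasattr(s, "hint"):
            hints.append(int(s.hint))
        else:
            hints.append(fallback)
    return hints

class FakeSymInt:
    def __init__(self, hint):
        self.hint = hint
```

A dynamic-shapes test would confirm the artifact contains these concrete hints (not SymInt reprs) for the dynamic dimensions.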
test/dynamo/test_structured_trace.py (outdated):

```python
self.assertParses()

@requires_tlparse
@torch._dynamo.config.patch(dynamic_shapes=True)
```
test/dynamo/test_structured_trace.py (outdated):

```python
w = z.to(torch.float16)
return w

compiled = torch.compile(f, backend="inductor", fullgraph=True)
```
either compile with dynamic=True or use mark_dynamic like in the starter tasks
test/dynamo/test_structured_trace.py (outdated):

```python
simplified_ops = []
for op in ops:
    outs = [
        {
            "shape": out.get("shape", []),
            "stride": out.get("stride", []),
            "dtype": out.get("dtype", None),
        }
        for out in op.get("outputs", [])
    ]
    if outs:
        simplified_ops.append(
            {
                "type": op.get("type", ""),
                "outputs": outs,
            }
        )

simplified = (
    {"ops": simplified_ops[-1:]} if simplified_ops else {"ops": []}
)
actual = json.dumps(simplified, indent=2, sort_keys=True)
```
you could just self.assertExpectedInline(ops, ...)
test/dynamo/test_structured_trace.py (outdated):

```python
self.assertExpectedInline(
    actual,
    r"""{
  "ops": [
```
we should really add tests with at least multiple ops
@pytorchbot merge

Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
@pytorchbot revert -m "broke lint GH job link HUD commit link. landrace with another PR that changed some had_cuda related things" -c landrace

@pytorchbot successfully started a revert job. Check the current status here.

@skarjala your PR has been successfully reverted.

This reverts commit 2603e40. Reverted #160132 on behalf of https://github.com/clee2000 due to broke lint [GH job link](https://github.com/pytorch/pytorch/actions/runs/17010600949/job/48226137423) [HUD commit link](https://hud.pytorch.org/pytorch/pytorch/commit/2603e40be5fa4a66301e6654e34a82a67f2e4913). landrace with another PR that changed some had_cuda related things ([comment](#160132 (comment)))

@pytorchbot merge -i

Merge started. Your change will be merged while ignoring the following 1 checks: pull / linux-jammy-py3.9-clang12 / test (crossref, 2, 2, lf.linux.2xlarge). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Summary:
- Add TLParse artifact logging per op with output tensor shape, stride, and dtype for cross-rank aggregation.

Testing:
- Add test to verify structure and contents of the tlparse artifact.

Pull Request resolved: pytorch#160132
Approved by: https://github.com/xmfan
ghstack dependencies: pytorch#160260
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @Lucaskabela