[DTensor] Update dtensor_dispatch_inplace instruction count benchmark by wconstab · Pull Request #177074 · pytorch/pytorch

wconstab · 2026-03-10T21:33:14Z

Stack from ghstack (oldest at bottom):

-> [DTensor] Update dtensor_dispatch_inplace instruction count benchmark #177074

The expected count for dtensor_dispatch_inplace (add_) regressed from
56530 to 58710 (~3.9%) after #175795 registered single-dim strategies
for categorized pointwise ops. The regression is on the cached dispatch
path and comes from two sources: an extra dict lookup in the C++
get_runtime_schema_info_for_op (~890 instructions), and a Python heap
layout difference in the cached OutputSharding object (~2860
instructions). Both are minor and not particularly worth fixing. While
the regression is within the 10% CI noise margin, it's better to reset
the counts so we still have our full 10% margin for the future.
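The arithmetic behind the ~3.9% figure and the "full 10% margin" argument can be checked with a short sketch. The two counts are taken from the description above. The helper names and the assumption that CI applies a symmetric relative threshold around the expected count are illustrative only, and are not taken from the benchmark's actual implementation.

```python
# Counts quoted in the PR description.
OLD_EXPECTED = 56530
NEW_EXPECTED = 58710
# Assumption: CI tolerates a symmetric 10% relative deviation from the expected count.
NOISE_MARGIN_PCT = 10


def relative_change(old: int, new: int) -> float:
    """Fractional change of `new` relative to `old` (hypothetical helper)."""
    return (new - old) / old


def upper_bound(expected: int, margin_pct: int) -> float:
    """Largest count that would still pass for a given expected value (hypothetical helper)."""
    return expected * (100 + margin_pct) / 100


regression = relative_change(OLD_EXPECTED, NEW_EXPECTED)
old_limit = upper_bound(OLD_EXPECTED, NOISE_MARGIN_PCT)
new_limit = upper_bound(NEW_EXPECTED, NOISE_MARGIN_PCT)

# Headroom above the currently measured count, before and after resetting the expectation.
headroom_before = (old_limit - NEW_EXPECTED) / NEW_EXPECTED
headroom_after = (new_limit - NEW_EXPECTED) / NEW_EXPECTED

print(f"regression vs. old expected count: {regression:.1%}")
print(f"headroom with old expected count:  {headroom_before:.1%}")
print(f"headroom with new expected count:  {headroom_after:.1%}")
```

Under these assumptions, leaving the expected count at 56530 would leave only about 5.9% of headroom above the current measurement, whereas resetting it to 58710 restores the full 10%.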

Authored with Claude.

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @kadeng @chauhang @amjames @Lucaskabela @jataylo


pytorch-bot · 2026-03-10T21:33:18Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/177074

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 3 Unrelated Failures

As of commit 8cf854e with merge base 3f60bc4:

NEW FAILURE - The following job has failed:

pull / linux-jammy-py3.10-gcc11 / test (distributed, 1, 2, lf.linux.2xlarge) (gh)
test/distributed/tensor/test_dtensor_ops.py::TestLocalDTensorOpsCPU::test_dtensor_op_db_nanmean_cpu_float32

FLAKY - The following job failed but was likely due to flakiness present on trunk:

inductor / unit-test / inductor-test / test (inductor, 2, 2, linux.g5.4xlarge.nvidia.gpu) (gh) (disabled by #137684)
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_lu_cuda_float32

BROKEN TRUNK - The following job failed but was present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

inductor / unit-test / inductor-test / test (inductor, 1, 2, linux.g5.4xlarge.nvidia.gpu) (gh) (trunk failure)
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_pca_lowrank_cuda_float32

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

inductor / inductor-cpu-test / test (cpu_inductor_torchbench, 1, 2, linux.2xlarge.amx, unstable) (gh) (#174929)
detectron2_maskrcnn_r_50_fpn

This comment was automatically generated by Dr. CI and updates every 15 minutes.

ghstack-source-id: 1e59629
Pull Request resolved: #177074

anshul-si

LGTM

wconstab · 2026-03-13T16:41:17Z

@pytorchbot merge

pytorch-bot bot added the ciflow/inductor, module: dynamo, and topic: not user facing labels on Mar 10, 2026

wconstab requested a review from anshul-si March 13, 2026 13:27

anshul-si approved these changes Mar 13, 2026

wconstab wants to merge 1 commit into gh/wconstab/563/base from gh/wconstab/563/head
