Skip to content

[dtensor] Register foreach and fused ops via single_dim_strategy#177187

Closed
anshul-si wants to merge 26 commits intogh/anshul-si/107/basefrom
gh/anshul-si/107/head
Closed

[dtensor] Register foreach and fused ops via single_dim_strategy#177187
anshul-si wants to merge 26 commits intogh/anshul-si/107/basefrom
gh/anshul-si/107/head

Conversation

@anshul-si
Copy link
Copy Markdown
Contributor

@anshul-si anshul-si commented Mar 11, 2026

Stack from ghstack (oldest at bottom):

Enable foreach and fused ops to use the single_dim_strategy path:

  • Add foreach ops to existing category lists (binary_additive_ops,
    binary_mul_ops, etc.) and the pointwise_ops list
  • Add fused ops to pointwise_ops list for unified registration
  • Add _is_list_op() helper to detect foreach/fused/amp_foreach ops
  • Modify _register_single_dim_pointwise to use needs_pytree for list
    ops and cross_mesh_indices for fused ops
  • Remove separate register_single_dim_strategy loop for fused ops

Tests cover multi-tensor foreach lists, mixed placements across list
elements, same-mesh fused adam, and cross-mesh fused adam exercising
the cross_mesh_indices code path.

Enable foreach and fused ops to use the single_dim_strategy path:

- Add foreach ops to existing category lists (binary_additive_ops,
  binary_mul_ops, etc.) and the pointwise_ops list
- Add fused ops to pointwise_ops list for unified registration
- Add _is_list_op() helper to detect foreach/fused/amp_foreach ops
- Modify _register_single_dim_pointwise to use needs_pytree for list
  ops and cross_mesh_indices for fused ops
- Remove separate register_single_dim_strategy loop for fused ops

Tests cover multi-tensor foreach lists, mixed placements across list
elements, same-mesh fused adam, and cross-mesh fused adam exercising
the cross_mesh_indices code path.

[ghstack-poisoned]
@pytorch-bot
Copy link
Copy Markdown

pytorch-bot Bot commented Mar 11, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/177187

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (2 Unrelated Failures)

As of commit 958a4d5 with merge base 549e5c7 (image):

BROKEN TRUNK - The following job failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot Bot added ciflow/inductor ciflow/torchtitan Run TorchTitan integration tests release notes: distributed (dtensor) release notes category labels Mar 11, 2026
anshul-si added a commit that referenced this pull request Mar 11, 2026
Enable foreach and fused ops to use the single_dim_strategy path:

- Add foreach ops to existing category lists (binary_additive_ops,
  binary_mul_ops, etc.) and the pointwise_ops list
- Add fused ops to pointwise_ops list for unified registration
- Add _is_list_op() helper to detect foreach/fused/amp_foreach ops
- Modify _register_single_dim_pointwise to use needs_pytree for list
  ops and cross_mesh_indices for fused ops
- Remove separate register_single_dim_strategy loop for fused ops

Tests cover multi-tensor foreach lists, mixed placements across list
elements, same-mesh fused adam, and cross-mesh fused adam exercising
the cross_mesh_indices code path.

ghstack-source-id: cdfddce
Pull Request resolved: #177187
…rategy"

Enable foreach and fused ops to use the single_dim_strategy path:

- Add foreach ops to existing category lists (binary_additive_ops,
  binary_mul_ops, etc.) and the pointwise_ops list
- Add fused ops to pointwise_ops list for unified registration
- Add _is_list_op() helper to detect foreach/fused/amp_foreach ops
- Modify _register_single_dim_pointwise to use needs_pytree for list
  ops and cross_mesh_indices for fused ops
- Remove separate register_single_dim_strategy loop for fused ops

Tests cover multi-tensor foreach lists, mixed placements across list
elements, same-mesh fused adam, and cross-mesh fused adam exercising
the cross_mesh_indices code path.

[ghstack-poisoned]
anshul-si added a commit that referenced this pull request Mar 11, 2026
Enable foreach and fused ops to use the single_dim_strategy path:

- Add foreach ops to existing category lists (binary_additive_ops,
  binary_mul_ops, etc.) and the pointwise_ops list
- Add fused ops to pointwise_ops list for unified registration
- Add _is_list_op() helper to detect foreach/fused/amp_foreach ops
- Modify _register_single_dim_pointwise to use needs_pytree for list
  ops and cross_mesh_indices for fused ops
- Remove separate register_single_dim_strategy loop for fused ops

Tests cover multi-tensor foreach lists, mixed placements across list
elements, same-mesh fused adam, and cross-mesh fused adam exercising
the cross_mesh_indices code path.

ghstack-source-id: b1cca4f
Pull Request resolved: #177187
@anshul-si anshul-si added the ciflow/trunk Trigger trunk jobs on your pull request label Mar 11, 2026
…rategy"

Enable foreach and fused ops to use the single_dim_strategy path:

- Add foreach ops to existing category lists (binary_additive_ops,
  binary_mul_ops, etc.) and the pointwise_ops list
- Add fused ops to pointwise_ops list for unified registration
- Add _is_list_op() helper to detect foreach/fused/amp_foreach ops
- Modify _register_single_dim_pointwise to use needs_pytree for list
  ops and cross_mesh_indices for fused ops
- Remove separate register_single_dim_strategy loop for fused ops

Tests cover multi-tensor foreach lists, mixed placements across list
elements, same-mesh fused adam, and cross-mesh fused adam exercising
the cross_mesh_indices code path.

[ghstack-poisoned]
anshul-si added a commit that referenced this pull request Mar 11, 2026
Enable foreach and fused ops to use the single_dim_strategy path:

- Add foreach ops to existing category lists (binary_additive_ops,
  binary_mul_ops, etc.) and the pointwise_ops list
- Add fused ops to pointwise_ops list for unified registration
- Add _is_list_op() helper to detect foreach/fused/amp_foreach ops
- Modify _register_single_dim_pointwise to use needs_pytree for list
  ops and cross_mesh_indices for fused ops
- Remove separate register_single_dim_strategy loop for fused ops

Tests cover multi-tensor foreach lists, mixed placements across list
elements, same-mesh fused adam, and cross-mesh fused adam exercising
the cross_mesh_indices code path.

ghstack-source-id: c46d594
Pull Request resolved: #177187
…rategy"

Enable foreach and fused ops to use the single_dim_strategy path:

- Add foreach ops to existing category lists (binary_additive_ops,
  binary_mul_ops, etc.) and the pointwise_ops list
- Add fused ops to pointwise_ops list for unified registration
- Add _is_list_op() helper to detect foreach/fused/amp_foreach ops
- Modify _register_single_dim_pointwise to use needs_pytree for list
  ops and cross_mesh_indices for fused ops
- Remove separate register_single_dim_strategy loop for fused ops

Tests cover multi-tensor foreach lists, mixed placements across list
elements, same-mesh fused adam, and cross-mesh fused adam exercising
the cross_mesh_indices code path.

[ghstack-poisoned]
…rategy"

Enable foreach and fused ops to use the single_dim_strategy path:

- Add foreach ops to existing category lists (binary_additive_ops,
  binary_mul_ops, etc.) and the pointwise_ops list
- Add fused ops to pointwise_ops list for unified registration
- Add _is_list_op() helper to detect foreach/fused/amp_foreach ops
- Modify _register_single_dim_pointwise to use needs_pytree for list
  ops and cross_mesh_indices for fused ops
- Remove separate register_single_dim_strategy loop for fused ops

Tests cover multi-tensor foreach lists, mixed placements across list
elements, same-mesh fused adam, and cross-mesh fused adam exercising
the cross_mesh_indices code path.

[ghstack-poisoned]
…rategy"

Enable foreach and fused ops to use the single_dim_strategy path:

- Add foreach ops to existing category lists (binary_additive_ops,
  binary_mul_ops, etc.) and the pointwise_ops list
- Add fused ops to pointwise_ops list for unified registration
- Add _is_list_op() helper to detect foreach/fused/amp_foreach ops
- Modify _register_single_dim_pointwise to use needs_pytree for list
  ops and cross_mesh_indices for fused ops
- Remove separate register_single_dim_strategy loop for fused ops

Tests cover multi-tensor foreach lists, mixed placements across list
elements, same-mesh fused adam, and cross-mesh fused adam exercising
the cross_mesh_indices code path.

[ghstack-poisoned]
…rategy"

Enable foreach and fused ops to use the single_dim_strategy path:

- Add foreach ops to existing category lists (binary_additive_ops,
  binary_mul_ops, etc.) and the pointwise_ops list
- Add fused ops to pointwise_ops list for unified registration
- Add _is_list_op() helper to detect foreach/fused/amp_foreach ops
- Modify _register_single_dim_pointwise to use needs_pytree for list
  ops and cross_mesh_indices for fused ops
- Remove separate register_single_dim_strategy loop for fused ops

Tests cover multi-tensor foreach lists, mixed placements across list
elements, same-mesh fused adam, and cross-mesh fused adam exercising
the cross_mesh_indices code path.

[ghstack-poisoned]
…rategy"

Enable foreach and fused ops to use the single_dim_strategy path:

- Add foreach ops to existing category lists (binary_additive_ops,
  binary_mul_ops, etc.) and the pointwise_ops list
- Add fused ops to pointwise_ops list for unified registration
- Add _is_list_op() helper to detect foreach/fused/amp_foreach ops
- Modify _register_single_dim_pointwise to use needs_pytree for list
  ops and cross_mesh_indices for fused ops
- Remove separate register_single_dim_strategy loop for fused ops

Tests cover multi-tensor foreach lists, mixed placements across list
elements, same-mesh fused adam, and cross-mesh fused adam exercising
the cross_mesh_indices code path.

[ghstack-poisoned]
…rategy"

Enable foreach and fused ops to use the single_dim_strategy path:

- Add foreach ops to existing category lists (binary_additive_ops,
  binary_mul_ops, etc.) and the pointwise_ops list
- Add fused ops to pointwise_ops list for unified registration
- Add _is_list_op() helper to detect foreach/fused/amp_foreach ops
- Modify _register_single_dim_pointwise to use needs_pytree for list
  ops and cross_mesh_indices for fused ops
- Remove separate register_single_dim_strategy loop for fused ops

Tests cover multi-tensor foreach lists, mixed placements across list
elements, same-mesh fused adam, and cross-mesh fused adam exercising
the cross_mesh_indices code path.

[ghstack-poisoned]
@anshul-si anshul-si requested review from pianpwk and wconstab March 12, 2026 01:06
…rategy"

Enable foreach and fused ops to use the single_dim_strategy path:

- Add foreach ops to existing category lists (binary_additive_ops,
  binary_mul_ops, etc.) and the pointwise_ops list
- Add fused ops to pointwise_ops list for unified registration
- Add _is_list_op() helper to detect foreach/fused/amp_foreach ops
- Modify _register_single_dim_pointwise to use needs_pytree for list
  ops and cross_mesh_indices for fused ops
- Remove separate register_single_dim_strategy loop for fused ops

Tests cover multi-tensor foreach lists, mixed placements across list
elements, same-mesh fused adam, and cross-mesh fused adam exercising
the cross_mesh_indices code path.

[ghstack-poisoned]
…rategy"

Enable foreach and fused ops to use the single_dim_strategy path:

- Add foreach ops to existing category lists (binary_additive_ops,
  binary_mul_ops, etc.) and the pointwise_ops list
- Add fused ops to pointwise_ops list for unified registration
- Add _is_list_op() helper to detect foreach/fused/amp_foreach ops
- Modify _register_single_dim_pointwise to use needs_pytree for list
  ops and cross_mesh_indices for fused ops
- Remove separate register_single_dim_strategy loop for fused ops

Tests cover multi-tensor foreach lists, mixed placements across list
elements, same-mesh fused adam, and cross-mesh fused adam exercising
the cross_mesh_indices code path.

[ghstack-poisoned]
…rategy"

Enable foreach and fused ops to use the single_dim_strategy path:

- Add foreach ops to existing category lists (binary_additive_ops,
  binary_mul_ops, etc.) and the pointwise_ops list
- Add fused ops to pointwise_ops list for unified registration
- Add _is_list_op() helper to detect foreach/fused/amp_foreach ops
- Modify _register_single_dim_pointwise to use needs_pytree for list
  ops and cross_mesh_indices for fused ops
- Remove separate register_single_dim_strategy loop for fused ops

Tests cover multi-tensor foreach lists, mixed placements across list
elements, same-mesh fused adam, and cross-mesh fused adam exercising
the cross_mesh_indices code path.

[ghstack-poisoned]
…rategy"

Enable foreach and fused ops to use the single_dim_strategy path:

- Add foreach ops to existing category lists (binary_additive_ops,
  binary_mul_ops, etc.) and the pointwise_ops list
- Add fused ops to pointwise_ops list for unified registration
- Add _is_list_op() helper to detect foreach/fused/amp_foreach ops
- Modify _register_single_dim_pointwise to use needs_pytree for list
  ops and cross_mesh_indices for fused ops
- Remove separate register_single_dim_strategy loop for fused ops

Tests cover multi-tensor foreach lists, mixed placements across list
elements, same-mesh fused adam, and cross-mesh fused adam exercising
the cross_mesh_indices code path.

[ghstack-poisoned]
sandy-gags pushed a commit to sandy-gags/pytorch that referenced this pull request Mar 12, 2026
Enable foreach and fused ops to use the single_dim_strategy path:

- Add foreach ops to existing category lists (binary_additive_ops,
  binary_mul_ops, etc.) and the pointwise_ops list
- Add fused ops to pointwise_ops list for unified registration
- Add _is_list_op() helper to detect foreach/fused/amp_foreach ops
- Modify _register_single_dim_pointwise to use needs_pytree for list
  ops and cross_mesh_indices for fused ops
- Remove separate register_single_dim_strategy loop for fused ops

Tests cover multi-tensor foreach lists, mixed placements across list
elements, same-mesh fused adam, and cross-mesh fused adam exercising
the cross_mesh_indices code path.

ghstack-source-id: 61151ad
Pull Request resolved: pytorch/pytorch#177187
Comment thread test/distributed/tensor/test_pointwise_ops.py Outdated
…rategy"

Enable foreach and fused ops to use the single_dim_strategy path:

- Add foreach ops to existing category lists (binary_additive_ops,
  binary_mul_ops, etc.) and the pointwise_ops list
- Add fused ops to pointwise_ops list for unified registration
- Add _is_list_op() helper to detect foreach/fused/amp_foreach ops
- Modify _register_single_dim_pointwise to use needs_pytree for list
  ops and cross_mesh_indices for fused ops
- Remove separate register_single_dim_strategy loop for fused ops

Tests cover multi-tensor foreach lists, mixed placements across list
elements, same-mesh fused adam, and cross-mesh fused adam exercising
the cross_mesh_indices code path.

[ghstack-poisoned]
Merge test_fused_adam, test_fused_adamw_strided_shard_cross_mesh, and
test_fused_adamw_tensor_lr into a single test_fused_optimizer test that
parameterizes over the variant. Reduces code duplication while preserving
all five test scenarios (same mesh, cross mesh, cross mesh with amsgrad,
strided shard cross mesh, and tensor lr).
@pytorch-bot pytorch-bot Bot added the ciflow/dtensor Run DTensor specific tests label Mar 15, 2026
…rategy"

Enable foreach and fused ops to use the single_dim_strategy path:

- Add foreach ops to existing category lists (binary_additive_ops,
  binary_mul_ops, etc.) and the pointwise_ops list
- Add fused ops to pointwise_ops list for unified registration
- Add _is_list_op() helper to detect foreach/fused/amp_foreach ops
- Modify _register_single_dim_pointwise to use needs_pytree for list
  ops and cross_mesh_indices for fused ops
- Remove separate register_single_dim_strategy loop for fused ops

Tests cover multi-tensor foreach lists, mixed placements across list
elements, same-mesh fused adam, and cross-mesh fused adam exercising
the cross_mesh_indices code path.

[ghstack-poisoned]
…rategy"

Enable foreach and fused ops to use the single_dim_strategy path:

- Add foreach ops to existing category lists (binary_additive_ops,
  binary_mul_ops, etc.) and the pointwise_ops list
- Add fused ops to pointwise_ops list for unified registration
- Add _is_list_op() helper to detect foreach/fused/amp_foreach ops
- Modify _register_single_dim_pointwise to use needs_pytree for list
  ops and cross_mesh_indices for fused ops
- Remove separate register_single_dim_strategy loop for fused ops

Tests cover multi-tensor foreach lists, mixed placements across list
elements, same-mesh fused adam, and cross-mesh fused adam exercising
the cross_mesh_indices code path.

[ghstack-poisoned]
…rategy"

Enable foreach and fused ops to use the single_dim_strategy path:

- Add foreach ops to existing category lists (binary_additive_ops,
  binary_mul_ops, etc.) and the pointwise_ops list
- Add fused ops to pointwise_ops list for unified registration
- Add _is_list_op() helper to detect foreach/fused/amp_foreach ops
- Modify _register_single_dim_pointwise to use needs_pytree for list
  ops and cross_mesh_indices for fused ops
- Remove separate register_single_dim_strategy loop for fused ops

Tests cover multi-tensor foreach lists, mixed placements across list
elements, same-mesh fused adam, and cross-mesh fused adam exercising
the cross_mesh_indices code path.

[ghstack-poisoned]
…rategy"

Enable foreach and fused ops to use the single_dim_strategy path:

- Add foreach ops to existing category lists (binary_additive_ops,
  binary_mul_ops, etc.) and the pointwise_ops list
- Add fused ops to pointwise_ops list for unified registration
- Add _is_list_op() helper to detect foreach/fused/amp_foreach ops
- Modify _register_single_dim_pointwise to use needs_pytree for list
  ops and cross_mesh_indices for fused ops
- Remove separate register_single_dim_strategy loop for fused ops

Tests cover multi-tensor foreach lists, mixed placements across list
elements, same-mesh fused adam, and cross-mesh fused adam exercising
the cross_mesh_indices code path.

[ghstack-poisoned]
…rategy"

Enable foreach and fused ops to use the single_dim_strategy path:

- Add foreach ops to existing category lists (binary_additive_ops,
  binary_mul_ops, etc.) and the pointwise_ops list
- Add fused ops to pointwise_ops list for unified registration
- Add _is_list_op() helper to detect foreach/fused/amp_foreach ops
- Modify _register_single_dim_pointwise to use needs_pytree for list
  ops and cross_mesh_indices for fused ops
- Remove separate register_single_dim_strategy loop for fused ops

Tests cover multi-tensor foreach lists, mixed placements across list
elements, same-mesh fused adam, and cross-mesh fused adam exercising
the cross_mesh_indices code path.

[ghstack-poisoned]
…rategy"

Enable foreach and fused ops to use the single_dim_strategy path:

- Add foreach ops to existing category lists (binary_additive_ops,
  binary_mul_ops, etc.) and the pointwise_ops list
- Add fused ops to pointwise_ops list for unified registration
- Add _is_list_op() helper to detect foreach/fused/amp_foreach ops
- Modify _register_single_dim_pointwise to use needs_pytree for list
  ops and cross_mesh_indices for fused ops
- Remove separate register_single_dim_strategy loop for fused ops

Tests cover multi-tensor foreach lists, mixed placements across list
elements, same-mesh fused adam, and cross-mesh fused adam exercising
the cross_mesh_indices code path.

[ghstack-poisoned]
…rategy"

Enable foreach and fused ops to use the single_dim_strategy path:

- Add foreach ops to existing category lists (binary_additive_ops,
  binary_mul_ops, etc.) and the pointwise_ops list
- Add fused ops to pointwise_ops list for unified registration
- Add _is_list_op() helper to detect foreach/fused/amp_foreach ops
- Modify _register_single_dim_pointwise to use needs_pytree for list
  ops and cross_mesh_indices for fused ops
- Remove separate register_single_dim_strategy loop for fused ops

Tests cover multi-tensor foreach lists, mixed placements across list
elements, same-mesh fused adam, and cross-mesh fused adam exercising
the cross_mesh_indices code path.

[ghstack-poisoned]
…rategy"

Enable foreach and fused ops to use the single_dim_strategy path:

- Add foreach ops to existing category lists (binary_additive_ops,
  binary_mul_ops, etc.) and the pointwise_ops list
- Add fused ops to pointwise_ops list for unified registration
- Add _is_list_op() helper to detect foreach/fused/amp_foreach ops
- Modify _register_single_dim_pointwise to use needs_pytree for list
  ops and cross_mesh_indices for fused ops
- Remove separate register_single_dim_strategy loop for fused ops

Tests cover multi-tensor foreach lists, mixed placements across list
elements, same-mesh fused adam, and cross-mesh fused adam exercising
the cross_mesh_indices code path.

[ghstack-poisoned]
@wconstab
Copy link
Copy Markdown
Contributor

@claude review this PR

@claude
Copy link
Copy Markdown

claude Bot commented Mar 20, 2026

Claude finished @wconstab's task in 7m 4s —— View job


PR Review: [dtensor] Register foreach and fused ops via single_dim_strategy

  • Gather context and read changed files
  • Analyze the production code changes
  • Analyze the test code changes
  • Post review feedback

Summary

This PR unifies foreach and fused optimizer ops to use the single_dim_strategy registration path instead of the older register_op_strategy path. Foreach ops are added to the appropriate category lists (binary_additive_ops, binary_mul_ops, etc.) and the pointwise_ops list, while fused ops go into pointwise_ops. The old registration loops for for_each_ops, for_each_linearity_ops, and fused_ops are removed.

The approach is sound — the single_dim_strategy path is the modern path with better support for Dijkstra-based expansion and concrete sharding rules.


Issues Found

1. Dead code: neg_ops defined twice (minor)

neg_ops is defined at line 279 with only 2 entries, then redefined at line 381 with the full list (5 entries including neg.out and foreach variants). The first definition at line 279 is immediately dead.

torch/distributed/tensor/_ops/_pointwise_ops.py:279

Fix this →

2. Dead code: linear_pointwise_ops defined twice (minor)

linear_pointwise_ops is defined at line 482 and then identically at line 832. The first definition is dead. The second one correctly uses the redefined neg_ops (from line 381), so semantically the first one is also slightly wrong (it uses the smaller neg_ops from line 279, missing neg.out and foreach variants), though this doesn't matter since it's overwritten.

torch/distributed/tensor/_ops/_pointwise_ops.py:482

Fix this →

3. Dead code: for_each_ops, for_each_linearity_ops lists, and related functions (cleanup opportunity)

The registration loops for for_each_ops (old register_op_strategy calls) and for_each_linearity_ops were removed, but the lists themselves remain at lines 1194-1253. These lists are no longer referenced anywhere. Similarly:

  • list_pointwise_strategy (line 1256) and list_linear_pointwise_strategy (line 1327) are no longer called
  • fused_ops (line 1334) is only referenced inside list_pointwise_strategy (dead)

These could be cleaned up in a follow-up, though keeping them temporarily is acceptable if there's a rollback concern.

torch/distributed/tensor/_ops/_pointwise_ops.py:1194-1343

4. Tests should be more compact (agree with @wconstab's review comment)

The fused adam/adamw tests (test_fused_adam, test_fused_adamw_strided_shard_cross_mesh, test_fused_adamw_tensor_lr) share a lot of boilerplate — each test creates the same d_params, d_grads, d_exp_avgs, d_exp_avg_sqs, d_state_steps pattern. A shared helper or parameterized test body would reduce ~200 lines of test code significantly.

For example, a helper like _make_fused_adam_inputs(mesh, placements, step_mesh=None) could return all the DTensor lists, and the tests could parameterize over (op_fn, mesh_type, amsgrad, placements).


Design Questions

5. _FUSED_OP_SCALAR_IDX = 5 is fragile

The hardcoded index 5 for state_steps in fused ops relies on the op schema argument ordering not changing. This is inherited from the old code, but worth noting — if the fused adam/adamw schema ever adds or reorders arguments, this silently breaks. A comment noting which argument it refers to would help (there's already a good comment explaining why, just not which arg).

6. _is_list_op prefix matching is broad

_is_list_op (line 123) matches any op starting with aten::_foreach_, aten::_amp_foreach_, or aten::_fused_. This works today because all such ops in the codebase are indeed list ops, but it couples the check to naming conventions rather than schema introspection. Low risk but worth being aware of.


What Looks Good

  • The categorization of foreach ops into the correct partial-placement rule groups (additive, mul, div, scalar_linear, non_decreasing, neg, monotonic_max_preserving) is correct.
  • The strategy_fn wrapper for tensor kwargs (handling both out and lr) is cleaner than the old _out_wrapper.
  • The needs_pytree=True vs static_argnum branching based on _is_list_op is correct.
  • The different_mesh_args for fused ops correctly preserves cross-mesh behavior.

Verdict

The production code changes are well-structured and correct. The main feedback is on the test compactness (per @wconstab's comment) and cleaning up the dead code artifacts from the migration. None of the issues are blocking.

Copy link
Copy Markdown
Contributor

@wconstab wconstab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dead code is removed in the next PR i think, or if any is missed, please remove it

…rategy"

Enable foreach and fused ops to use the single_dim_strategy path:

- Add foreach ops to existing category lists (binary_additive_ops,
  binary_mul_ops, etc.) and the pointwise_ops list
- Add fused ops to pointwise_ops list for unified registration
- Add _is_list_op() helper to detect foreach/fused/amp_foreach ops
- Modify _register_single_dim_pointwise to use needs_pytree for list
  ops and cross_mesh_indices for fused ops
- Remove separate register_single_dim_strategy loop for fused ops

Tests cover multi-tensor foreach lists, mixed placements across list
elements, same-mesh fused adam, and cross-mesh fused adam exercising
the cross_mesh_indices code path.

[ghstack-poisoned]
@anshul-si
Copy link
Copy Markdown
Contributor Author

@pytorchbot merge

@pytorchmergebot
Copy link
Copy Markdown
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

pytorchmergebot pushed a commit that referenced this pull request Mar 24, 2026
The old strategy registration path (register_op_strategy-based functions
and data structures) was superseded by the register_single_dim_strategy
infrastructure but never cleaned up. This removes ~550 lines of dead
code including: pointwise_strategy, linear_pointwise_strategy,
copy_strategy, common_pointwise_strategy, single_mesh_dim_* strategy
functions, list_pointwise_strategy, list_linear_pointwise_strategy,
for_each_ops/for_each_linearity_ops/fused_ops lists, and their
associated helper sets and unused imports.

Authored with Claude.
Pull Request resolved: #177208
Approved by: https://github.com/Skylion007
ghstack dependencies: #177186, #177187
anshul-si added a commit that referenced this pull request Mar 31, 2026
The old strategy registration path (register_op_strategy-based functions
and data structures) was superseded by the register_single_dim_strategy
infrastructure but never cleaned up. This removes ~550 lines of dead
code including: pointwise_strategy, linear_pointwise_strategy,
copy_strategy, common_pointwise_strategy, single_mesh_dim_* strategy
functions, list_pointwise_strategy, list_linear_pointwise_strategy,
for_each_ops/for_each_linearity_ops/fused_ops lists, and their
associated helper sets and unused imports.

Authored with Claude.
Pull Request resolved: #177208
Approved by: https://github.com/Skylion007
ghstack dependencies: #177186, #177187
ghstack-source-id: f005cbc
anshul-si added a commit that referenced this pull request Mar 31, 2026
The old strategy registration path (register_op_strategy-based functions
and data structures) was superseded by the register_single_dim_strategy
infrastructure but never cleaned up. This removes ~550 lines of dead
code including: pointwise_strategy, linear_pointwise_strategy,
copy_strategy, common_pointwise_strategy, single_mesh_dim_* strategy
functions, list_pointwise_strategy, list_linear_pointwise_strategy,
for_each_ops/for_each_linearity_ops/fused_ops lists, and their
associated helper sets and unused imports.

Authored with Claude.
Pull Request resolved: #177208
Approved by: https://github.com/Skylion007
ghstack dependencies: #177186, #177187
ghstack-source-id: f4e0b82
AaronWang04 pushed a commit to AaronWang04/pytorch that referenced this pull request Mar 31, 2026
…orch#177187)

Enable foreach and fused ops to use the single_dim_strategy path:

- Add foreach ops to existing category lists (binary_additive_ops,
  binary_mul_ops, etc.) and the pointwise_ops list
- Add fused ops to pointwise_ops list for unified registration
- Add _is_list_op() helper to detect foreach/fused/amp_foreach ops
- Modify _register_single_dim_pointwise to use needs_pytree for list
  ops and cross_mesh_indices for fused ops
- Remove separate register_single_dim_strategy loop for fused ops

Tests cover multi-tensor foreach lists, mixed placements across list
elements, same-mesh fused adam, and cross-mesh fused adam exercising
the cross_mesh_indices code path.
Pull Request resolved: pytorch#177187
Approved by: https://github.com/wconstab
ghstack dependencies: pytorch#177186
AaronWang04 pushed a commit to AaronWang04/pytorch that referenced this pull request Mar 31, 2026
The old strategy registration path (register_op_strategy-based functions
and data structures) was superseded by the register_single_dim_strategy
infrastructure but never cleaned up. This removes ~550 lines of dead
code including: pointwise_strategy, linear_pointwise_strategy,
copy_strategy, common_pointwise_strategy, single_mesh_dim_* strategy
functions, list_pointwise_strategy, list_linear_pointwise_strategy,
for_each_ops/for_each_linearity_ops/fused_ops lists, and their
associated helper sets and unused imports.

Authored with Claude.
Pull Request resolved: pytorch#177208
Approved by: https://github.com/Skylion007
ghstack dependencies: pytorch#177186, pytorch#177187
anshul-si added a commit that referenced this pull request Mar 31, 2026
The old strategy registration path (register_op_strategy-based functions
and data structures) was superseded by the register_single_dim_strategy
infrastructure but never cleaned up. This removes ~550 lines of dead
code including: pointwise_strategy, linear_pointwise_strategy,
copy_strategy, common_pointwise_strategy, single_mesh_dim_* strategy
functions, list_pointwise_strategy, list_linear_pointwise_strategy,
for_each_ops/for_each_linearity_ops/fused_ops lists, and their
associated helper sets and unused imports.

Authored with Claude.
Pull Request resolved: #177208
Approved by: https://github.com/Skylion007
ghstack dependencies: #177186, #177187
ghstack-source-id: 5a0cf25
nklshy-aws pushed a commit to nklshy-aws/pytorch that referenced this pull request Apr 7, 2026
…orch#177187)

Enable foreach and fused ops to use the single_dim_strategy path:

- Add foreach ops to existing category lists (binary_additive_ops,
  binary_mul_ops, etc.) and the pointwise_ops list
- Add fused ops to pointwise_ops list for unified registration
- Add _is_list_op() helper to detect foreach/fused/amp_foreach ops
- Modify _register_single_dim_pointwise to use needs_pytree for list
  ops and cross_mesh_indices for fused ops
- Remove separate register_single_dim_strategy loop for fused ops

Tests cover multi-tensor foreach lists, mixed placements across list
elements, same-mesh fused adam, and cross-mesh fused adam exercising
the cross_mesh_indices code path.
Pull Request resolved: pytorch#177187
Approved by: https://github.com/wconstab
ghstack dependencies: pytorch#177186
nklshy-aws pushed a commit to nklshy-aws/pytorch that referenced this pull request Apr 7, 2026
The old strategy registration path (register_op_strategy-based functions
and data structures) was superseded by the register_single_dim_strategy
infrastructure but never cleaned up. This removes ~550 lines of dead
code including: pointwise_strategy, linear_pointwise_strategy,
copy_strategy, common_pointwise_strategy, single_mesh_dim_* strategy
functions, list_pointwise_strategy, list_linear_pointwise_strategy,
for_each_ops/for_each_linearity_ops/fused_ops lists, and their
associated helper sets and unused imports.

Authored with Claude.
Pull Request resolved: pytorch#177208
Approved by: https://github.com/Skylion007
ghstack dependencies: pytorch#177186, pytorch#177187
@github-actions github-actions Bot deleted the gh/anshul-si/107/head branch April 23, 2026 02:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ciflow/dtensor Run DTensor specific tests ciflow/inductor ciflow/torchtitan Run TorchTitan integration tests ciflow/trunk Trigger trunk jobs on your pull request Merged release notes: distributed (dtensor) release notes category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants