Refactor dispatch and redistribute to expose local tensor APIs#476
Conversation
Since `make_fx` cannot yet handle tensor subclasses correctly, refactor dispatch code to expose APIs that take local tensors, and then trace these APIs instead.
aazzolini
left a comment
Could you provide pseudo-code for the intended use of the exposed functions?
spmd/tensor/dispatch.py
Outdated

```python
def operator_dispatch(
def prepare_inputs(
```
Is the goal to use prepare_inputs directly, and bypass operator_dispatch, when using make_fx?
If this is the case, then we will probably miss some operators' implementations that directly use DTensor.
Could you provide a quick pseudo-code of the intended call sequence?
This is how it is used in the PR on top
We use make_fx to trace two things: 1. redistributed inputs, 2. the local op. IIUC, we don't redistribute outputs at the moment in DT? If we do that in the future, we will also need to trace that step.
Ideally, I wanted to trace dispatch_with_local_tensors(local_args, arg_specs, output_specs), which would trigger redistribution of inputs, the local compute op, and redistribution of outputs.
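The intended call sequence, as rough pseudo-code (only `dispatch_with_local_tensors` and its argument names come from the comment above; the helper functions are illustrative placeholders, not APIs from this PR):

```python
# Pseudo-code sketch of the intended tracing flow. Helper names marked
# "placeholder" are illustrative, not actual APIs in this PR.
from torch.fx.experimental.proxy_tensor import make_fx

def dispatch_with_local_tensors(local_args, arg_specs, output_specs):
    # 1. redistribute inputs: collectives run on plain local tensors
    redistributed_args = redistribute_inputs(local_args, arg_specs)  # placeholder
    # 2. run the local compute op on the redistributed local tensors
    local_out = local_op(*redistributed_args)  # placeholder
    # 3. (future) redistribute outputs if output_specs require it
    return redistribute_outputs(local_out, output_specs)  # placeholder

# Everything above operates on plain torch.Tensor values, never on the
# DTensor subclass, so make_fx can trace it directly:
gm = make_fx(dispatch_with_local_tensors)(local_args, arg_specs, output_specs)
```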
The issue I see is that if we do it this way we will be missing 1) the decompositions; 2) the custom op implementations (those that don't have a propagation rule, but that require a direct DTensor-aware implementation). I think for some ops we will actually need those.
Would it be possible to work around make_fx limitations somehow and still be able to trace the implementation of DTensor at some level?
spmd/tensor/redistribute.py
Outdated

```python
    return new_local_tensor

def redistribute_spmd_tensor(
```
can we merge this with the Redistribute autograd function?
In that case, I assume pack_args_kwargs_with_local_tensor will then also need to call into the Redistribute autograd function, which happens in the __torch_dispatch__ function under no_grad mode?
Oh, I mean we should probably also change pack_args_kwargs_with_local_tensor to call into _redistribute_with_local_tensor directly, so that we could safely delete the redistribute_spmd_tensor API?
sure, let me update that.
Hey @wanchaol, I tried updating pack_args_kwargs_with_local_tensor and removing redistribute_spmd_tensor, but the code becomes a bit verbose, as _redistribute_with_local_tensor takes more arguments and requires one additional DTensor wrapping. Let me know if you prefer to get rid of redistribute_spmd_tensor; I can do that in a follow-up PR.
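For reference, a minimal sketch of the verbosity concern (pseudo-code only; the signatures are illustrative guesses, not the actual ones in this PR):

```python
# Pseudo-code: signatures below are illustrative, not the actual APIs.

# With the convenience wrapper: one call per DTensor argument.
new_local = redistribute_spmd_tensor(dtensor, target_spec)._local_tensor

# Calling the lower-level API directly: it needs the current/target spec
# pair, plus an extra DTensor wrapping of the plain local tensor first.
wrapped = make_dtensor_from_local(local_tensor, current_spec)  # placeholder
new_local = _redistribute_with_local_tensor(wrapped, current_spec, target_spec)
```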
Stack from ghstack (oldest at bottom):

Since `make_fx` cannot yet handle tensor subclasses correctly, refactor dispatch code to expose APIs that take local tensors, and then trace these APIs instead.