Add torch.nn.init.uniform_ operator to ShardedTensor. #63997
… to mimic [torch.nn.init.normal_, uniform_, kaiming_uniform_]
Summary:
Note: the _sharded_tensor module is a temporary home for these utilities.
Ideally we want something like torch.nn.init.{funcs}(ShardedTensor, ...) to ensure a consistent UX with Tensor.
To support that, we need either:
a) Update torch/nn/init.py so that normal_(ShardedTensor, ...), uniform_(ShardedTensor, ...), and kaiming_uniform_(ShardedTensor, ...) are supported, or
b) Add torch.nn.init.{funcs} to the __torch_function__ dispatchers (currently __torch_function__ does not handle these functions).
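Option (b) can be illustrated with a toy, stdlib-only sketch. It is not the PyTorch implementation and does not import torch: `ToyShardedTensor`, `sharded_op_impl`, `_SHARDED_OPS`, and the free function `uniform_` are hypothetical stand-ins, and the hook below only mirrors the general shape of the `__torch_function__` protocol.

```python
import random
from typing import Callable, Dict, List

# Hypothetical registry: maps a public init function to its sharded override.
_SHARDED_OPS: Dict[Callable, Callable] = {}


def sharded_op_impl(func: Callable) -> Callable:
    """Register the decorated function as the sharded override for `func`."""
    def decorator(impl: Callable) -> Callable:
        _SHARDED_OPS[func] = impl
        return impl
    return decorator


class ToyShardedTensor:
    """Stand-in for ShardedTensor: holds only this rank's local shards."""

    def __init__(self, local_shards: List[List[float]]):
        self.local_shards = local_shards

    # Simplified hook with the same argument layout as __torch_function__.
    @classmethod
    def __torch_function__(cls, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        if func in _SHARDED_OPS:
            return _SHARDED_OPS[func](*args, **kwargs)
        raise RuntimeError(f"{func.__name__} is not supported for {cls.__name__}")


def uniform_(tensor, a: float = 0.0, b: float = 1.0):
    """Toy init function: dispatch to the hook if the argument's type has one."""
    hook = getattr(type(tensor), "__torch_function__", None)
    if hook is not None:
        return hook(uniform_, (type(tensor),), (tensor,), {"a": a, "b": b})
    # Plain path: fill a flat list in place.
    for i in range(len(tensor)):
        tensor[i] = random.uniform(a, b)
    return tensor


@sharded_op_impl(uniform_)
def _sharded_uniform_(sharded: ToyShardedTensor, a: float = 0.0, b: float = 1.0):
    """Sharded override: initialize each local shard through the plain path."""
    for shard in sharded.local_shards:
        uniform_(shard, a, b)
    return sharded


# The caller uses the same entry point for plain and sharded inputs.
st = ToyShardedTensor([[0.0] * 4, [0.0] * 3])
uniform_(st, a=-1.0, b=1.0)
```

The point of this design is that callers keep a single entry point, while support for each additional init function is added by registering one more override rather than editing the init function itself.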
Test Plan:
(pytorch) ... $ python test/distributed/_sharded_tensor/test_sharded_tensor.py TestShardedTensorNNInit --v
Reviewers:
Subscribers:
Tasks:
Tags:
[ghstack-poisoned]
@bowangbj has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
ghstack-source-id: c7712e4
Pull Request resolved: #63997
…nsor."
Summary:
Use __torch_function__ to extend torch.nn.init.uniform_
The init is done in SPMD fashion. Ideally we would aggregate the shards into a global tensor, initialize it, and reshard. Running it SPMD is fine here because uniform samples are i.i.d. (independent and identically distributed), so each rank can initialize its local shards on its own.
Also enable the test_linear.py unit test in OSS testing.
Test Plan:
a) Unit Test
(pytorch) ... $ python test/distributed/_sharded_tensor/ops/test_init.py TestShardedTensorNNInit --v
(pytorch) ... $ python test/distributed/_sharded_tensor/ops/test_linear.py --v (before this change, this command was a no-op)
or b) Manual run: Instruction here: https://docs.google.com/document/d/1_m1Hdo5w51-hhPlZ_F8Y6PIWrN7UgJZqiSpARYvhsaE/edit#
Reviewers:
Subscribers:
Tasks:
Tags:
Differential Revision: [D30563017](https://our.internmc.facebook.com/intern/diff/D30563017)
[ghstack-poisoned]
Thanks Wanchao and Pritam, I resolved all the comments and this is ready to submit. I will follow up on the args / kwargs issue in a follow-up PR.
ghstack-source-id: 99e45f7
Pull Request resolved: #63997
This pull request has been merged in b6df043.
…hardedTensor
Summary:
Extend ShardedTensor with the torch.nn.init.[normal_, kaiming_uniform_] ops
Follow up from #63997
Test Plan:
a) Unit Test
(pytorch) ... $ python test/distributed/_sharded_tensor/ops/test_init.py TestShardedTensorNNInit --v
or b) Manual run: Instruction here: https://docs.google.com/document/d/1_m1Hdo5w51-hhPlZ_F8Y6PIWrN7UgJZqiSpARYvhsaE/edit# (substitute normal_ or kaiming_uniform_ for uniform_)
Reviewers:
Subscribers:
Tasks:
Tags:
[ghstack-poisoned]
Differential Revision: [D31845654](https://our.internmc.facebook.com/intern/diff/D31845654)
ghstack-source-id: 012f671
Pull Request resolved: #67057
ghstack-source-id: 54ce4ba
Pull Request resolved: #67057
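For the kaiming_uniform_ follow-up, one detail worth spelling out is that the sampling bound depends on the fan of the weight, so a sharded version presumably has to derive it from the global shape rather than from a local shard's shape. The sketch below is an assumption-laden illustration, not the follow-up PR's code: the helper names are hypothetical, and it reproduces only the standard Kaiming uniform bound for the leaky_relu gain, `bound = sqrt(3) * gain / sqrt(fan)`.

```python
import math
from typing import Sequence, Tuple


def fan_in_and_fan_out(shape: Sequence[int]) -> Tuple[int, int]:
    """Fan-in/fan-out for a weight laid out as (out, in, *receptive_field)."""
    if len(shape) < 2:
        raise ValueError("fan is undefined for tensors with fewer than 2 dimensions")
    receptive_field = 1
    for size in shape[2:]:
        receptive_field *= size
    return shape[1] * receptive_field, shape[0] * receptive_field


def kaiming_uniform_bound(shape: Sequence[int], a: float = 0.0, mode: str = "fan_in") -> float:
    """Bound of the uniform distribution U(-bound, bound) for Kaiming init."""
    fan_in, fan_out = fan_in_and_fan_out(shape)
    fan = fan_in if mode == "fan_in" else fan_out
    gain = math.sqrt(2.0 / (1.0 + a ** 2))  # leaky_relu gain with negative slope a
    std = gain / math.sqrt(fan)
    return math.sqrt(3.0) * std


# Hypothetical example: an (8, 4) weight sharded along dim 1 into (8, 1) pieces.
global_bound = kaiming_uniform_bound((8, 4))
local_bound = kaiming_uniform_bound((8, 1))
# The two bounds differ, so every rank should sample with global_bound.
```

Sharding along dim 0 would leave fan_in unchanged in this example, which is why a bug of this kind can go unnoticed until the sharding dimension changes.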
Stack from ghstack:
Summary:
Use __torch_function__ to extend torch.nn.init.uniform_
The init is done in SPMD fashion. Ideally we would aggregate the shards into a global tensor, initialize it, and reshard. Running it SPMD is fine here because uniform samples are i.i.d. (independent and identically distributed), so each rank can initialize its local shards on its own.
Also enable the test_linear.py unit test in OSS testing.
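The SPMD argument can be sketched without torch. The snippet below simulates the ranks in a single process: each "rank" fills only its own contiguous chunk, and no gather or reshard step is needed because every element is drawn independently from the same distribution. The function names, the chunking scheme, and the per-rank seeding are assumptions made for illustration, not details taken from this PR.

```python
import random
from typing import List, Tuple


def chunk_ranges(length: int, world_size: int) -> List[Tuple[int, int]]:
    """Split range(length) into contiguous per-rank [start, end) chunks."""
    chunk = -(-length // world_size)  # ceiling division
    return [
        (min(rank * chunk, length), min((rank + 1) * chunk, length))
        for rank in range(world_size)
    ]


def spmd_uniform_init(
    length: int, world_size: int, a: float, b: float, base_seed: int = 0
) -> List[List[float]]:
    """Simulate every rank initializing only its local shard."""
    shards = []
    for rank, (start, end) in enumerate(chunk_ranges(length, world_size)):
        # Assumed seeding scheme: a distinct RNG stream per rank.
        rng = random.Random(base_seed + rank)
        shards.append([rng.uniform(a, b) for _ in range(end - start)])
    return shards


shards = spmd_uniform_init(length=10, world_size=4, a=-0.5, b=0.5)
# Concatenating the shards in rank order gives the logical global tensor.
global_view = [value for shard in shards for value in shard]
```

One caveat with this approach: if every rank drew from an identically seeded generator, the shards would repeat the same values and the elements would no longer be independent, so the per-rank random streams need to differ.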
Test Plan:
a) Unit Test
(pytorch) ... $ python test/distributed/_sharded_tensor/ops/test_init.py TestShardedTensorNNInit --v
(pytorch) ... $ python test/distributed/_sharded_tensor/ops/test_linear.py --v (before this change, this command was a no-op)
or b) Manual run: Instruction here: https://docs.google.com/document/d/1_m1Hdo5w51-hhPlZ_F8Y6PIWrN7UgJZqiSpARYvhsaE/edit#
Reviewers:
Subscribers:
Tasks:
Tags:
Differential Revision: D30563017
cc @pietern @mrshenli @pritamdamania87 @zhaojuanmao @satgera @rohan-varma @gqchen @aazzolini @osalpekar @jiayisuse @SciPioneer @H-Huang