This repository was archived by the owner on Mar 20, 2026. It is now read-only.

using foreach to reduce kernel#4904

Merged
000Justin000 merged 11 commits into main from foreach
Dec 17, 2022

Conversation

@000Justin000 (Contributor) commented Dec 12, 2022

Before submitting

  • Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
    No
  • Did you read the contributor guideline?
    Yes
  • Did you make sure to update the docs?
    Yes
  • Did you write any new necessary tests?
    No

What does this PR do?

The existing Adam optimizer used by speech models is not efficient: it launches many small operators (as shown in the picture below). Since the speech team does not use the latest PyTorch Adam optimizer with multi_tensor (foreach fusion) support, this PR adds a multi_tensor Adam implementation, fusing many small operators into a smaller number of operators via the PyTorch foreach APIs, to the customized adam_sam optimizer used by the speech team.
(image: profiler trace showing the many small operator kernels launched by the existing optimizer)
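To make the fusion concrete, here is a minimal pure-Python sketch (no torch; `launch` is a hypothetical stand-in for dispatching a GPU kernel) contrasting the per-parameter path with a single foreach-style call:

```python
# Illustrative sketch only (plain Python, no torch): `launch` is a
# hypothetical stand-in for dispatching one GPU kernel, so we can
# count launches under each strategy.
launch_count = 0

def launch(fn, *args):
    """Pretend to dispatch one GPU kernel, then run fn."""
    global launch_count
    launch_count += 1
    return fn(*args)

c = 0.5
grads = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]  # three parameter grads

# Per-parameter path (old optimizer): one launch per tensor.
for g in grads:
    launch(lambda t: [x * c for x in t], g)
per_param_launches = launch_count

# foreach-style path: the whole list is processed by one fused launch.
launch_count = 0
launch(lambda ts: [[x * c for x in t] for t in ts], grads)
foreach_launches = launch_count

print(per_param_launches, foreach_launches)  # 3 1
```

With real tensors the per-parameter path dispatches one kernel per gradient, while `torch._foreach_mul_` handles the whole list with far fewer launches, which is where the speedup comes from.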

PR review

Anyone in the community is free to review the PR once the tests have passed.
If your PR wasn't discussed in a GitHub issue, there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

Comment thread fairseq/optim/fairseq_optimizer.py (outdated):

- p.grad.data.mul_(c)
+ params_with_grad.append(p.grad.data)
+ # foreach reduces gpu kernel launch
+ torch._foreach_mul_(params_with_grad, c)
Contributor:
Are we safe with _foreach_mul_ here? As far as I know it is not part of PyTorch's public API.
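If the private-API concern matters, one common mitigation is a guarded fallback: use the fused op when it exists, otherwise loop per tensor. A minimal sketch, assuming hypothetical names (`FakeTensor`, `scale_grads_`) and avoiding a torch import so it runs standalone; with real torch you would pass the torch module itself:

```python
# Hedged sketch of a version-safe fallback. FakeTensor and scale_grads_
# are hypothetical names; no torch is imported, so the example runs
# standalone.
from types import SimpleNamespace

class FakeTensor:
    """Minimal stand-in for a tensor with an in-place multiply."""
    def __init__(self, data):
        self.data = list(data)
    def mul_(self, c):
        self.data = [x * c for x in self.data]
        return self

def scale_grads_(grads, c, torch_module=None):
    """Scale each grad by c in place, fused when foreach is available."""
    foreach_mul_ = getattr(torch_module, "_foreach_mul_", None)
    if foreach_mul_ is not None:
        foreach_mul_(grads, c)      # single fused call
    else:
        for g in grads:             # fallback: one call per tensor
            g.mul_(c)

# Fallback path (no foreach available).
grads = [FakeTensor([1.0, 2.0]), FakeTensor([4.0])]
scale_grads_(grads, 0.5)
print(grads[0].data, grads[1].data)  # [0.5, 1.0] [2.0]

# Fused path, simulated with a fake module exposing _foreach_mul_.
fake_torch = SimpleNamespace(_foreach_mul_=lambda ts, k: [t.mul_(k) for t in ts])
grads2 = [FakeTensor([2.0])]
scale_grads_(grads2, 2.0, torch_module=fake_torch)
print(grads2[0].data)  # [4.0]
```

The `getattr` guard keeps the optimizer working on older PyTorch builds while still taking the fused path when the underscore-prefixed op is present.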


@cbalioglu cbalioglu left a comment


Per our offline discussion with Junteng, approving it.

@000Justin000 000Justin000 merged commit 3c1abb5 into main Dec 17, 2022
@cbalioglu cbalioglu deleted the foreach branch December 20, 2022 16:41
cbalioglu pushed a commit that referenced this pull request Feb 23, 2023
* fix imports referencing moved metrics.py file

* Make representation computation branchless in TransformerEncoderBase (#4818)

Summary:
We want to make the computation branchless here because fairseq code may be
exported and traced for deployment purposes, and tracing mechanisms can
break the correctness for a captured program if it's dependent on input data.
In this diff we try to rewrite the code to remove one branch so that tracer
can proceed here and preserve the correct semantics of the model.

Test Plan:
CI

Reviewers:

Subscribers:

Tasks:

Tags:

* Fix Torchscript typing in transformer_encoder.py (#4847)

* Add Generative Spoken Dialogue Language Modeling (#4879)

* Update deprecated torch.qr in glow.py example (#4685)

torch.qr has been deprecated for a long time and is being removed by pytorch/pytorch#70989.

This PR makes the example compatible with new and old PyTorch versions.

* Emotion Conversion Paper Open Source (#4895)

* data2vec v2.0 (#4903)

data2vec 2.0
Co-authored-by: Arun Babu <arbabu@fb.com>
Co-authored-by: Wei-Ning Hsu <wnhsu@csail.mit.edu>

* remove missing config entries when loading task from checkpoint (#4905)

* make apex optional (#4906)

* Add file to generate manifests for stop dataset. (#4891)

* Update STOP dataset README to include proper link. (#4892)

* Update README.md (#4893)

* using foreach to reduce kernel (#4904)

* using foreach to reduce kernel

* set reproducibility to looser threshold

* revert optimizer

* update

* update

* update

* update

* update

* update

* update

Co-authored-by: juntengjia <juntengjia@fb.com>

* Update README.md to add data2vec blog post (#4913)

* Update README.md

* Update config to fix circleci failure (#4949)

https://app.circleci.com/pipelines/github/fairinternal/fairseq-py/12635/workflows/3befbae2-79c4-458d-9fc4-aad4484183b4/jobs/26767

* Generative Spoken Dialogue Language Modeling Paper Open Source (#4957)

* wav2vec2_laser (#4968)

* ASR BLEU tool copied from ust branch into main (#4914)

* Add transcript option for asr-bleu (#4981)

---------

Co-authored-by: zhxchen17 <zhxchen17@outlook.com>
Co-authored-by: zhxchen17 <zhxchen17@fb.com>
Co-authored-by: Nguyen Tu Anh <nguyentuanh208@gmail.com>
Co-authored-by: Sergii Dymchenko <kit1980@gmail.com>
Co-authored-by: Felix Kreuk <felixkreuk@gmail.com>
Co-authored-by: Alexei Baevski <alexei.b@gmail.com>
Co-authored-by: padentomasello <pdtomasello@gmail.com>
Co-authored-by: Junteng Jia <juntengjia@hotmail.com>
Co-authored-by: juntengjia <juntengjia@fb.com>
Co-authored-by: arbabu123 <arbabu@fb.com>
Co-authored-by: dianaml0 <82468439+dianaml0@users.noreply.github.com>
Co-authored-by: Pierre Andrews <mortimer@fb.com>
Co-authored-by: Ilia Kulikov <kulikov@cs.nyu.edu>
Co-authored-by: Xutai Ma <xutaima@gmail.com>
lwb2099 pushed a commit to lwb2099/fairseq that referenced this pull request Apr 26, 2023
lwb2099 pushed a commit to lwb2099/fairseq that referenced this pull request Apr 26, 2023


3 participants