This repository was archived by the owner on Mar 20, 2026. It is now read-only.

Make representation computation branchless in TransformerEncoderBase#4818

Merged
alexeib merged 2 commits into facebookresearch:main from zhxchen17:main
Nov 2, 2022
Conversation

@zhxchen17
Contributor

@zhxchen17 zhxchen17 commented Oct 20, 2022

Summary:
We want to make the computation branchless here because fairseq code may be exported and traced for deployment, and tracing can break the correctness of a captured program when its control flow depends on input data. This diff rewrites the code to remove one such branch so that the tracer can proceed and the captured program preserves the model's semantics.
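As a rough illustration of the idea (a minimal sketch, not the exact fairseq code): `torch.jit.trace` records only the branch taken for the example input, so replacing a data-dependent `if` with an always-executed masked multiply keeps the captured graph correct for all inputs.

```python
import torch

def forward_branchy(x, pad_mask):
    # Data-dependent branch: torch.jit.trace records only the side
    # taken for the example input, so the traced graph can be wrong
    # for inputs with a different padding pattern.
    if pad_mask.any():
        x = x * (1 - pad_mask.unsqueeze(-1).type_as(x))
    return x

def forward_branchless(x, pad_mask):
    # Branchless equivalent: the multiply is a no-op when there is no
    # padding, so no control flow depends on the input data.
    return x * (1 - pad_mask.unsqueeze(-1).type_as(x))
```

Both functions compute the same values; only the branchless one traces safely.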

Test Plan:
CI

Reviewers:

Subscribers:

Tasks:

Tags:

Before submitting

  • Was this discussed/approved via a Github issue? (not needed for typos or doc improvements)
  • Did you read the contributor guideline?
  • Did you make sure to update the docs?
  • Did you write any new necessary tests?

What does this PR do?

Fixes # (issue).

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

@zhxchen17
Contributor Author

cc @suo

@suo
Contributor

suo commented Oct 20, 2022

@dianaml0 do you mind taking a look at this?

@tugsbayasgalan

cc @alexeib

Contributor


isn't has_pads a boolean? 'bool' object doesn't have 'type_as' attribute!

Contributor Author


has_pads is a Tensor holding a bool scalar, which can be converted with the type_as method on Tensor. That's my understanding.
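For illustration, this minimal sketch shows the distinction the two comments are circling: a 0-dim bool Tensor has `type_as`, while a plain Python bool does not.

```python
import torch

x = torch.ones(3, 2)                      # float32 reference tensor

flag_tensor = torch.ones(3).eq(0).any()   # 0-dim bool Tensor
print(type(flag_tensor))                  # <class 'torch.Tensor'>
print(flag_tensor.type_as(x))             # works: a float32 scalar tensor

flag_bool = True                          # plain Python bool
# flag_bool.type_as(x) would raise:
# AttributeError: 'bool' object has no attribute 'type_as'
```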

Contributor Author


actually, if src_tokens is on an "xla" device, it would be a plain bool, which is rare but possible.

Let me update the PR to handle that.
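To illustrate the edge case above: depending on the backend, the flag may arrive as a 0-dim Tensor or as a plain Python bool. Here is a hedged sketch of a hypothetical normalizing helper (`as_flag_tensor` is an illustrative name, not code from this PR):

```python
import torch

def as_flag_tensor(flag, ref):
    # `flag` may already be a 0-dim Tensor (mask.any() on most devices)
    # or a plain Python bool (as noted above for "xla"); normalize it
    # to a tensor with the same dtype as `ref` either way.
    if isinstance(flag, torch.Tensor):
        return flag.type_as(ref)
    return torch.tensor(flag).type_as(ref)
```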

Contributor Author


@moslehpour updated. mind looking again?

Contributor

@moslehpour moslehpour Nov 1, 2022


Great. This should work!
Just one last comment: can you confirm whether TorchScript supports type casting with type_as?

Contributor Author


@moslehpour sure, I just tried that:

import torch

@torch.jit.script
def foo(x, y):
    return x.type_as(y)

a = foo(torch.tensor(True), torch.ones(3, 2))
print(a)

and it seems TorchScript gives the correct result:

tensor(1.)

Contributor


Perfect. Thanks for checking it.


@alexeib alexeib merged commit 59d966a into facebookresearch:main Nov 2, 2022
@alexeib
Contributor

alexeib commented Nov 2, 2022

thanks!

@alexeib
Contributor

alexeib commented Nov 2, 2022

the tests are failing now because of this change:

Variable 'has_pads' is annotated with type Tensor but is being assigned to a value of type bool:
File "/opt/hostedtoolcache/Python/3.9.15/x64/lib/python3.9/site-packages/fairseq/models/transformer/transformer_encoder.py", line 205
# compute padding mask
encoder_padding_mask = src_tokens.eq(self.padding_idx)
has_pads: Tensor = (
~~~~~~~~ <--- HERE
torch.tensor(src_tokens.device.type == "xla") or encoder_padding_mask.any()
)
'TransformerEncoderBase.forward_scriptable' is being compiled since it was called from 'TransformerEncoderBase.forward'
File "/opt/hostedtoolcache/Python/3.9.15/x64/lib/python3.9/site-packages/fairseq/models/transformer/transformer_encoder.py", line 165
Only populated if return_all_hiddens is True.
"""
return self.forward_scriptable(
~~~~~~~~~~~~~~~~~~~~~~~~
src_tokens, src_lengths, return_all_hiddens, token_embeddings
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
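For context on the failure: Python's `or` short-circuits via `__bool__` and yields a plain `bool`, which TorchScript rejects against the `Tensor` annotation. A hedged sketch of one possible tensor-valued rewrite using bitwise `|` (not necessarily what the follow-up fix did):

```python
import torch
from torch import Tensor

def compute_has_pads(src_tokens: Tensor, encoder_padding_mask: Tensor) -> Tensor:
    # `or` would call __bool__ and return a Python bool, breaking the
    # `Tensor` annotation under TorchScript. Bitwise `|` on bool
    # tensors keeps the whole expression a tensor op.
    has_pads: Tensor = (
        torch.tensor(src_tokens.device.type == "xla") | encoder_padding_mask.any()
    )
    return has_pads
```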

@zhxchen17
Contributor Author


@alexeib I'm looking into this issue, and I'll try to send in a fix real quick.

cbalioglu pushed a commit that referenced this pull request Feb 23, 2023
* fix imports referencing moved metrics.py file

* Make representation computation branchless in TransformerEncoderBase (#4818)

Summary:
We want to make the computation branchless here because fairseq code may be
exported and traced for deployment purposes, and tracing mechanisms can
break the correctness for a captured program if it's dependent on input data.
In this diff we try to rewrite the code to remove one branch so that tracer
can proceed here and preserve the correct semantics of the model.

Test Plan:
CI

Reviewers:

Subscribers:

Tasks:

Tags:

* Fix Torchscript typing in transformer_encoder.py (#4847)

* Add Generative Spoken Dialogue Language Modeling (#4879)

* Update deprecated torch.qr in glow.py example (#4685)

torch.qr has been deprecated for a long time and is being removed by pytorch/pytorch#70989.

This PR makes the example compatible with new and old PyTorch versions.

* Emotion Conversion Paper Open Source (#4895)

* data2vec v2.0 (#4903)

data2vec 2.0
Co-authored-by: Arun Babu <arbabu@fb.com>
Co-authored-by: Wei-Ning Hsu <wnhsu@csail.mit.edu>

* remove missing config entries when loading task from checkpoint (#4905)

* make apex optional (#4906)

* Add file to generate manifests for stop dataset. (#4891)

* Update STOP dataset README to include proper link. (#4892)

* Update README.md (#4893)

* using foreach to reduce kernel (#4904)

* using foreach to reduce kernel

* set reproducibility to looser threshold

* revert optimizer

* update

* update

* update

* update

* update

* update

* update

Co-authored-by: juntengjia <juntengjia@fb.com>

* Update README.md to add data2vec blog post (#4913)

* Update README.md

* Update config to fix circleci failure (#4949)

https://app.circleci.com/pipelines/github/fairinternal/fairseq-py/12635/workflows/3befbae2-79c4-458d-9fc4-aad4484183b4/jobs/26767

* Generative Spoken Dialogue Language Modeling Paper Open Source (#4957)

* wav2vec2_laser (#4968)

* ASR BLEU tool copied from ust branch into main (#4914)

* Add transcript option for asr-bleu (#4981)

---------

Co-authored-by: zhxchen17 <zhxchen17@outlook.com>
Co-authored-by: zhxchen17 <zhxchen17@fb.com>
Co-authored-by: Nguyen Tu Anh <nguyentuanh208@gmail.com>
Co-authored-by: Sergii Dymchenko <kit1980@gmail.com>
Co-authored-by: Felix Kreuk <felixkreuk@gmail.com>
Co-authored-by: Alexei Baevski <alexei.b@gmail.com>
Co-authored-by: padentomasello <pdtomasello@gmail.com>
Co-authored-by: Junteng Jia <juntengjia@hotmail.com>
Co-authored-by: juntengjia <juntengjia@fb.com>
Co-authored-by: arbabu123 <arbabu@fb.com>
Co-authored-by: dianaml0 <82468439+dianaml0@users.noreply.github.com>
Co-authored-by: Pierre Andrews <mortimer@fb.com>
Co-authored-by: Ilia Kulikov <kulikov@cs.nyu.edu>
Co-authored-by: Xutai Ma <xutaima@gmail.com>
lwb2099 pushed a commit to lwb2099/fairseq that referenced this pull request Apr 26, 2023
lwb2099 pushed a commit to lwb2099/fairseq that referenced this pull request Apr 26, 2023
