This repository was archived by the owner on Aug 1, 2023. It is now read-only.
Stop copying tensors to CPU for torch.unique() in vocab reduction#537
Closed
theweiho wants to merge 1 commit into pytorch:master from
Conversation
theweiho force-pushed from b29dff7 to e2c9d87
theweiho added a commit to theweiho/translate that referenced this pull request on May 16, 2019
…torch#537)

Summary: Pull Request resolved: pytorch#537. pytorch/pytorch#8899 added CUDA support for `torch.unique()`; pytorch/pytorch#16145 has some timing stats that could be relevant.

Experiment results: https://fb.quip.com/olQOA853j0mb
Words per second (`gpu-unique_wps_avg_vs_base`): 1.046x
Total train time (`gpu-unique_total_train_time_vs_base`; excl. ar_AR-fr_XX): 0.987x

Even though the train-time reduction is minimal (probably overshadowed by random variance, scheduling delay, etc.), WPS does seem to be ~5% faster, so might as well land this. Training time for ar_AR-fr_XX increased significantly, but that's because it trained for many more updates (`gpu-unique_num_updates_avg_vs_base`) and also ended up with +1.43 BLEU. I think this is probably just an anomaly?

Differential Revision: D15073468
fbshipit-source-id: 713288fc7c77f582840f270dd2e343a3b63f8fe5
theweiho force-pushed from e2c9d87 to 50b04b8
This pull request has been merged in 2abcc08.
Summary:
pytorch/pytorch#8899 added CUDA support for `torch.unique()`. pytorch/pytorch#16145 has some timing stats that could be relevant.

Experiment results: https://fb.quip.com/olQOA853j0mb

Words per second (`gpu-unique_wps_avg_vs_base`): 1.046x
Total train time (`gpu-unique_total_train_time_vs_base`; excl. ar_AR-fr_XX): 0.987x

Even though the train-time reduction is minimal (probably overshadowed by random variance, scheduling delay, etc.), WPS does seem to be ~5% faster, so might as well land this.

Training time for ar_AR-fr_XX increased significantly, but that's because it trained for many more updates (`gpu-unique_num_updates_avg_vs_base`) and also ended up with +1.43 BLEU. I think this is probably just an anomaly?

Differential Revision: D15073468
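The change this PR lands is simple in shape: once `torch.unique()` gained a CUDA kernel, the vocab-reduction code no longer needs to copy candidate-token tensors to the CPU, dedupe them there, and copy the result back. A minimal sketch of the before/after pattern (the function and variable names here are hypothetical, not the actual translate codebase):

```python
import torch

def unique_vocab_ids(candidates: torch.Tensor) -> torch.Tensor:
    # Old pattern (pre pytorch/pytorch#8899): force a round-trip through
    # host memory because torch.unique() only ran on CPU:
    #     candidates.cpu().unique().to(candidates.device)
    # New pattern: torch.unique() runs on whatever device the tensor is on,
    # so the GPU copy (and the implicit synchronization) goes away.
    return torch.unique(candidates)  # sorted unique ids, same device as input

# Works identically on CPU; moves to GPU only if one is available.
ids = torch.tensor([3, 1, 3, 7, 1, 0])
if torch.cuda.is_available():
    ids = ids.cuda()
print(unique_vocab_ids(ids))  # unique ids in sorted order, on ids.device
```

The ~5% WPS gain reported above is consistent with removing a device-to-host copy plus the synchronization it forces on every batch, rather than with `unique` itself being dramatically faster.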