Migrate exp and exp_ from the TH to Aten (CUDA) #36652

kshitij12345 wants to merge 5 commits into pytorch:master
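For context on what such a port involves: moving a unary op like `exp` off the legacy THC backend means replacing the old THC kernel with a TensorIterator-based kernel in ATen's CUDA unary-ops code. The following is a minimal sketch of that pattern, assuming the usual ATen conventions of this era; it is illustrative, not the literal diff of this PR.

```cpp
// Sketch of a TensorIterator-based CUDA kernel for exp (illustrative).
// AT_DISPATCH_FLOATING_TYPES_AND_HALF instantiates the lambda once per
// dtype; gpu_kernel handles launch configuration and element iteration.
#include <ATen/Dispatch.h>
#include <ATen/native/TensorIterator.h>
#include <ATen/native/UnaryOps.h>
#include <ATen/native/cuda/Loops.cuh>

namespace at { namespace native {

void exp_kernel_cuda(TensorIterator& iter) {
  AT_DISPATCH_FLOATING_TYPES_AND_HALF(iter.dtype(), "exp_cuda", [&]() {
    gpu_kernel(iter, [] GPU_LAMBDA (scalar_t a) -> scalar_t {
      return ::exp(a);
    });
  });
}

// Binds the CUDA kernel to the device-dispatched stub for exp.
REGISTER_DISPATCH(exp_stub, &exp_kernel_cuda);

}} // namespace at::native
```

Because the same lambda is stamped out for every dispatched dtype, half, float, and double all share one code path, which is also why the benchmark below exercises all three.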
Conversation
@VitalyFedyunin @ifedan @xuhdev Please review.
💊 CI failures summary and remediations

As of commit a5a29be (more details on the Dr. CI page):

ci.pytorch.org: 1 failed

This comment was automatically generated by Dr. CI and has been revised 34 times.
@pytorchbot retest this please
kshitij12345 force-pushed from 13c447e to ab045be
facebook-github-bot left a comment:

@VitalyFedyunin has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Can we not merge this until #37582 merges? Thanks!
Sure. Will rebase once #37582 is in.
kshitij12345 force-pushed from ab045be to ec0df58
facebook-github-bot left a comment:

@VitalyFedyunin has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@VitalyFedyunin Gentle ping for reminder :)
@kshitij12345 I think you should rebase? Since @VitalyFedyunin has approved, the bot will do the merge for you.
@cloudhan Thanks, will do that.
kshitij12345 force-pushed from ec0df58 to a5a29be
facebook-github-bot left a comment:

@VitalyFedyunin has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@VitalyFedyunin merged this pull request in d86de91.
Summary:

Closes pytorch#24561

Benchmark with the same build settings on the same system.

gcc: 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
CUDA: 10.1
GPU: 1050 Ti

```python
import timeit

for n, t in [(10_000, 20000), (100_000, 20000)]:
    for dtype in ('torch.half', 'torch.float', 'torch.double'):
        print(f'torch.exp(a) a.numel() == {n} for {t} times {dtype}')
        print(timeit.timeit('torch.exp(a); torch.cuda.synchronize()',
                            setup=f'import torch; a=torch.arange({n}, dtype={dtype}, device="cuda")',
                            number=t))
```

Before:

```
torch.exp(a) a.numel() == 10000 for 20000 times torch.half
0.3001665159999902
torch.exp(a) a.numel() == 10000 for 20000 times torch.float
0.28265794499998265
torch.exp(a) a.numel() == 10000 for 20000 times torch.double
0.3432170909998149
torch.exp(a) a.numel() == 100000 for 20000 times torch.half
0.32273333800003456
torch.exp(a) a.numel() == 100000 for 20000 times torch.float
0.31498759600003723
torch.exp(a) a.numel() == 100000 for 20000 times torch.double
1.079708754999956
```

After:

```
torch.exp(a) a.numel() == 10000 for 20000 times torch.half
0.27996097300092515
torch.exp(a) a.numel() == 10000 for 20000 times torch.float
0.2774473429999489
torch.exp(a) a.numel() == 10000 for 20000 times torch.double
0.33066844799941464
torch.exp(a) a.numel() == 100000 for 20000 times torch.half
0.27641824200145493
torch.exp(a) a.numel() == 100000 for 20000 times torch.float
0.27805968599932385
torch.exp(a) a.numel() == 100000 for 20000 times torch.double
1.0644143180015817
```

Pull Request resolved: pytorch#36652
Differential Revision: D21164653
Pulled By: VitalyFedyunin
fbshipit-source-id: 42c7b24b0d85ff1d390231f1457968a8869b8db3
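Note that the benchmark calls `torch.cuda.synchronize()` after each op, so the timings measure kernel completion rather than just launch overhead. On the C++ side, `exp`, `exp_`, and `exp_out` all funnel into a single dispatch stub, which is why the port only needs the one CUDA kernel sketched above. Roughly, as a sketch of ATen's unary-op plumbing from this era rather than the exact source:

```cpp
// Sketch of ATen's native unary-op wiring (illustrative, not the exact
// code). exp_stub is the per-device dispatch point; the CUDA build
// registers its kernel against it via REGISTER_DISPATCH.
Tensor& exp_out(Tensor& result, const Tensor& self) {
  auto iter = TensorIterator::unary_op(result, self);
  exp_stub(iter.device_type(), iter);  // dispatches to CPU or CUDA kernel
  return result;
}

Tensor exp(const Tensor& self) {
  Tensor result = at::empty({0}, self.options());  // out-of-place: fresh output
  return at::exp_out(result, self);
}

Tensor& exp_(Tensor& self) {
  return at::exp_out(self, self);  // in-place: output aliases input
}
```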