
Migrate exp and exp_ from the TH to Aten (CUDA) #36652

Closed
kshitij12345 wants to merge 5 commits into pytorch:master from kshitij12345:migrate/exp_cuda

Conversation

@kshitij12345
Collaborator

Closes #24561

Benchmarked with the same build settings on the same system.
gcc : version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
CUDA : 10.1
GPU : 1050ti

```python
import timeit

for n, t in [(10_000, 20000),
             (100_000, 20000)]:
    for dtype in ('torch.half', 'torch.float', 'torch.double'):
        print(f'torch.exp(a) a.numel() == {n} for {t} times {dtype}')
        print(timeit.timeit('torch.exp(a); torch.cuda.synchronize()',
                            setup=f'import torch; a=torch.arange({n}, dtype={dtype}, device="cuda")',
                            number=t))
```

Before:

```
torch.exp(a) a.numel() == 10000 for 20000 times torch.half
0.3001665159999902
torch.exp(a) a.numel() == 10000 for 20000 times torch.float
0.28265794499998265
torch.exp(a) a.numel() == 10000 for 20000 times torch.double
0.3432170909998149
torch.exp(a) a.numel() == 100000 for 20000 times torch.half
0.32273333800003456
torch.exp(a) a.numel() == 100000 for 20000 times torch.float
0.31498759600003723
torch.exp(a) a.numel() == 100000 for 20000 times torch.double
1.079708754999956
```

After:

```
torch.exp(a) a.numel() == 10000 for 20000 times torch.half
0.27996097300092515
torch.exp(a) a.numel() == 10000 for 20000 times torch.float
0.2774473429999489
torch.exp(a) a.numel() == 10000 for 20000 times torch.double
0.33066844799941464
torch.exp(a) a.numel() == 100000 for 20000 times torch.half
0.27641824200145493
torch.exp(a) a.numel() == 100000 for 20000 times torch.float
0.27805968599932385
torch.exp(a) a.numel() == 100000 for 20000 times torch.double
1.0644143180015817
```
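For context, the relative improvement can be computed directly from the timings above. This is just arithmetic over the reported before/after totals (seconds for 20000 calls), not a new measurement:

```python
# Relative speedup of the ATen CUDA kernel over the TH one, computed
# from the timeit totals reported above (total seconds for 20000 calls).
timings = {
    # (numel, dtype): (before, after)
    (10_000, "half"):    (0.3001665159999902, 0.27996097300092515),
    (10_000, "float"):   (0.28265794499998265, 0.2774473429999489),
    (10_000, "double"):  (0.3432170909998149, 0.33066844799941464),
    (100_000, "half"):   (0.32273333800003456, 0.27641824200145493),
    (100_000, "float"):  (0.31498759600003723, 0.27805968599932385),
    (100_000, "double"): (1.079708754999956, 1.0644143180015817),
}

for (n, dtype), (before, after) in timings.items():
    gain = (before - after) / before * 100  # percent reduction in runtime
    print(f"numel={n:>6} {dtype:>6}: {gain:5.2f}% faster")
```

The ATen kernel is faster in every configuration, with the largest gain (about 14%) for half precision at 100k elements.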

@kshitij12345
Collaborator Author

@VitalyFedyunin @ifedan @xuhdev Please review.

@dr-ci

dr-ci Bot commented Apr 15, 2020

💊 CI failures summary and remediations

As of commit a5a29be (more details on the Dr. CI page):


  • 1/1 failures possibly* introduced in this PR
    • 1/1 non-CircleCI failure(s)

ci.pytorch.org: 1 failed


This comment was automatically generated by Dr. CI.

(Outdated comment thread on aten/src/THCUNN/THCHalfAutoNumerics.cuh)
@kshitij12345
Collaborator Author

@pytorchbot retest this please

@kshitij12345 kshitij12345 force-pushed the migrate/exp_cuda branch 2 times, most recently from 13c447e to ab045be Compare April 17, 2020 14:11
@ngimel ngimel requested a review from VitalyFedyunin April 18, 2020 06:08
@ngimel ngimel added the triaged label (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) Apr 18, 2020
Contributor

@facebook-github-bot facebook-github-bot left a comment


@VitalyFedyunin has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@gchanan
Contributor

gchanan commented May 5, 2020

Can we not merge this until #37582 merges? Thanks!

@kshitij12345
Collaborator Author

Sure. Will rebase once #37582 is in.

Contributor

@facebook-github-bot facebook-github-bot left a comment


@VitalyFedyunin has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@kshitij12345
Collaborator Author

@VitalyFedyunin Gentle ping for reminder :)

@cloudhan
Contributor

@kshitij12345 I think you should rebase? Since @VitalyFedyunin has approved, the bot will do the merge for you.

@kshitij12345
Collaborator Author

@cloudhan Thanks, will do that.

Contributor

@facebook-github-bot facebook-github-bot left a comment


@VitalyFedyunin has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@VitalyFedyunin merged this pull request in d86de91.

@kshitij12345 kshitij12345 deleted the migrate/exp_cuda branch May 27, 2020 15:47
laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 24, 2026
Summary:
Closes pytorch#24561 (benchmark details as in the PR description above).
Pull Request resolved: pytorch#36652

Differential Revision: D21164653

Pulled By: VitalyFedyunin

fbshipit-source-id: 42c7b24b0d85ff1d390231f1457968a8869b8db3

Labels

Merged · open source · triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)


9 participants