
Port equal from THC to ATen (CUDA) #36483

Closed
Baranowski wants to merge 1 commit into pytorch:master from Quansight:wbaranowski-equal-24557

Conversation

@Baranowski
Contributor

@Baranowski Baranowski commented Apr 13, 2020

Fixes #24557

ASV benchmark:

import torch

sizes = [
    (10**6,),
    (1000, 1000),
    (10, 10),
    (1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
]

class EqualTrue:
    params = range(len(sizes))

    def setup(self, n):
        dims = sizes[n]
        self.a = torch.rand(dims, device='cuda')
        self.b = self.a.clone()

    def time_equal(self, n):
        torch.equal(self.a, self.b)

class EqualFalse:
    params = range(len(sizes))

    def setup(self, n):
        dims = sizes[n]
        self.a = torch.rand(dims, device='cuda')
        self.b = torch.rand(dims, device='cuda')

    def time_equal(self, n):
        torch.equal(self.a, self.b)

Old results:

[ 75.00%] ··· equal.EqualFalse.time_equal
[ 75.00%] ··· ======== ============
               param1
              -------- ------------
                 0       67.7±7μs
                 1       74.0±2μs
                 2      24.4±0.1μs
                 3      135±0.2μs
              ======== ============

[100.00%] ··· equal.EqualTrue.time_equal
[100.00%] ··· ======== ============
               param1
              -------- ------------
                 0      59.8±0.2μs
                 1      59.9±0.3μs
                 2      25.0±0.5μs
                 3      136±0.2μs
              ======== ============

New results:

[ 75.00%] ··· equal.EqualFalse.time_equal
[ 75.00%] ··· ======== ============
               param1              
              -------- ------------
                 0      44.4±0.2μs 
                 1      44.5±0.4μs 
                 2      31.3±0.3μs 
                 3      96.6±0.5μs 
              ======== ============

[100.00%] ··· equal.EqualTrue.time_equal
[100.00%] ··· ======== ============
               param1              
              -------- ------------
                 0      44.2±0.2μs 
                 1      44.6±0.2μs 
                 2      30.8±0.3μs 
                 3      97.3±0.2μs 
              ======== ============

@dr-ci

dr-ci Bot commented Apr 13, 2020

💊 CI failures summary and remediations

As of commit b80a23b (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI.

@Baranowski
Contributor Author

I don't see anything for equal in the benchmarks/ folder. Any suggestions on what usage patterns I should benchmark for?

@mruberry added the triaged label (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module) on Apr 18, 2020
@Baranowski
Contributor Author

@VitalyFedyunin should I benchmark this change?

@VitalyFedyunin
Contributor

Hi! Yes, a simple timeit benchmark with different tensor shapes (sizes) will suffice.
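A minimal sketch of the kind of timeit benchmark suggested here. The real target is `torch.equal` on CUDA tensors of various shapes; a pure-Python stand-in comparison is timed below so the pattern runs without a GPU. To benchmark the actual change, swap in `torch.rand(dims, device='cuda')` and `torch.equal`.

```python
import timeit

# Shapes to benchmark, mirroring the ASV script in the PR description.
sizes = [(10**6,), (1000, 1000), (10, 10)]

def num_elements(dims):
    # Total element count for a given shape tuple.
    n = 1
    for d in dims:
        n *= d
    return n

def compare(a, b):
    # Stand-in for torch.equal(a, b); replace with the real call
    # when running on a machine with CUDA available.
    return a == b

for dims in sizes:
    a = list(range(num_elements(dims)))
    b = list(a)
    per_call = timeit.timeit(lambda: compare(a, b), number=10) / 10
    print(f"{dims}: {per_call:.6f}s per call")
```

For the CUDA case, remember that kernel launches are asynchronous, so a realistic timing loop should synchronize (e.g. `torch.cuda.synchronize()`) before reading the clock.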

VitalyFedyunin
VitalyFedyunin previously approved these changes Apr 21, 2020
@Baranowski
Contributor Author

I wasn't expecting this in such a simple change, but I'm seeing a perf regression. I will need to dig into this.

@VitalyFedyunin VitalyFedyunin dismissed their stale review April 22, 2020 17:43

Please ping me when you are ready for next review.

@Baranowski Baranowski force-pushed the wbaranowski-equal-24557 branch from 7973bd6 to df748f0 Compare April 22, 2020 18:48
Contributor Author

@Baranowski Baranowski left a comment


@VitalyFedyunin this is ready for a review.

I have put the benchmark results and the benchmark itself in the PR description. Turns out that wrapping the value in a scalar_tensor and then extracting it from there was taking too long.

Contributor Author


I'm not sure that calling this function from ATen (as opposed to TH) is ok.

Contributor

@VitalyFedyunin VitalyFedyunin left a comment


Code looks good, but you need to rebase.

@Baranowski Baranowski force-pushed the wbaranowski-equal-24557 branch from df748f0 to 18fd4ea Compare April 30, 2020 07:09
@Baranowski
Contributor Author

@VitalyFedyunin this is rebased and should be ready to be merged.

Contributor

@facebook-github-bot facebook-github-bot left a comment


@VitalyFedyunin has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Contributor


It is safer if you call: at::native::eq(self, src).min()

Contributor Author


That leads to a serious performance regression.

Contributor


Sorry for going in circles, but can you please check with at::native::eq(self, src).all()? https://pytorch.org/docs/master/tensors.html#torch.BoolTensor.all
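The suggestion here amounts to implementing equal as an elementwise eq followed by an all() reduction. A minimal pure-Python sketch of those semantics (not the actual ATen code; flat lists stand in for tensors):

```python
def tensors_equal(a_shape, a, b_shape, b):
    """Sketch of torch.equal semantics: the shapes must match exactly,
    and then every corresponding element must compare equal,
    mirroring eq(self, src).all()."""
    if a_shape != b_shape:
        return False
    # eq(...) -> elementwise comparison; .all() -> single bool reduction
    return all(x == y for x, y in zip(a, b))

print(tensors_equal((2, 2), [1, 2, 3, 4], (2, 2), [1, 2, 3, 4]))  # True
print(tensors_equal((2, 2), [1, 2, 3, 4], (4,),  [1, 2, 3, 4]))  # False
```

On CUDA the advantage of the all() reduction is that it produces a single boolean result in one fused reduction kernel, rather than materializing an intermediate scalar tensor and extracting it on the host.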

Contributor Author


I'm shocked. Not only is there no regression, it seems to be significantly faster now. (Updated the benchmarks in the PR message.)

@Baranowski Baranowski force-pushed the wbaranowski-equal-24557 branch from 81237c6 to a86fbf4 Compare May 13, 2020 19:26
Comment thread: aten/src/ATen/native/cuda/ReduceLogicKernel.cu (outdated)
Contributor

@facebook-github-bot facebook-github-bot left a comment


@VitalyFedyunin has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Contributor

@facebook-github-bot facebook-github-bot left a comment


@VitalyFedyunin is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Contributor

@facebook-github-bot facebook-github-bot left a comment


@VitalyFedyunin has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@VitalyFedyunin
Contributor

Sorry, this one got lost in notifications. Can you please rebase so I can land it?

@Baranowski Baranowski force-pushed the wbaranowski-equal-24557 branch from e414a86 to f8d1304 Compare June 23, 2020 05:55
Contributor

@facebook-github-bot facebook-github-bot left a comment


@VitalyFedyunin has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@Baranowski Baranowski force-pushed the wbaranowski-equal-24557 branch from f8d1304 to b80a23b Compare July 2, 2020 06:48
@Baranowski
Contributor Author

@VitalyFedyunin I have rebased again. This is ready for merge.

@Baranowski
Contributor Author

Just a heads up: I won't have time to work on this beyond two weeks from now. I think this should be ready to merge.

Contributor

@facebook-github-bot facebook-github-bot left a comment


@VitalyFedyunin has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@VitalyFedyunin merged this pull request in a780244.

csarofeen pushed a commit to csarofeen/pytorch that referenced this pull request Jul 7, 2020
Summary:
Fixes pytorch#24557

ASV benchmark:

```
import torch

sizes = [
    (10**6,),
    (1000, 1000),
    (10, 10),
    (1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
]

class EqualTrue:
    params = range(len(sizes))

    def setup(self, n):
        dims = sizes[n]
        self.a = torch.rand(dims, device='cuda')
        self.b = self.a.clone()

    def time_equal(self, n):
        torch.equal(self.a, self.b)

class EqualFalse:
    params = range(len(sizes))

    def setup(self, n):
        dims = sizes[n]
        self.a = torch.rand(dims, device='cuda')
        self.b = torch.rand(dims, device='cuda')

    def time_equal(self, n):
        torch.equal(self.a, self.b)
```

Old results:
```
[ 75.00%] ··· equal.EqualFalse.time_equal
[ 75.00%] ··· ======== ============
               param1
              -------- ------------
                 0       67.7±7μs
                 1       74.0±2μs
                 2      24.4±0.1μs
                 3      135±0.2μs
              ======== ============

[100.00%] ··· equal.EqualTrue.time_equal
[100.00%] ··· ======== ============
               param1
              -------- ------------
                 0      59.8±0.2μs
                 1      59.9±0.3μs
                 2      25.0±0.5μs
                 3      136±0.2μs
              ======== ============
```

New results:
```
[ 75.00%] ··· equal.EqualFalse.time_equal
[ 75.00%] ··· ======== ============
               param1
              -------- ------------
                 0      44.4±0.2μs
                 1      44.5±0.4μs
                 2      31.3±0.3μs
                 3      96.6±0.5μs
              ======== ============

[100.00%] ··· equal.EqualTrue.time_equal
[100.00%] ··· ======== ============
               param1
              -------- ------------
                 0      44.2±0.2μs
                 1      44.6±0.2μs
                 2      30.8±0.3μs
                 3      97.3±0.2μs
              ======== ============
```

Pull Request resolved: pytorch#36483

Differential Revision: D21451829

Pulled By: VitalyFedyunin

fbshipit-source-id: 033e8060192c54f139310aeafe8ba784bab94ded
csarofeen added a commit to csarofeen/pytorch that referenced this pull request Aug 16, 2020
laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 24, 2026

Labels

Merged · open source · triaged (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module)


Development

Successfully merging this pull request may close these issues.

Migrate equal from the TH to Aten (CUDA)

5 participants