Remove SSE-only code and convolve5x5 by cpuhrsch · Pull Request #12109 · pytorch/pytorch

cpuhrsch · 2018-09-26T18:49:52Z

Performance-oriented code will use AVX/AVX2, so we don't need SSE-specific code anymore. This will also reduce the probability of running into an error on legacy CPUs.

On top of this, convolve is covered by modern libraries such as MKLDNN, which are much more performant and which we now build against by default (even for builds from source).

cpuhrsch · 2018-09-26T20:00:52Z

This is the original PR that got convolve into TH: torch/torch7#241

facebook-github-bot

cpuhrsch has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

colesbury

I think you need to change cmake/Dependencies.cmake, which uses FindSSE.cmake.

It would also be good to change C_AVX_FOUND, etc. to check whether the compiler supports AVX rather than whether the system can run AVX instructions.
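To illustrate the distinction drawn here, the following is a minimal C++ sketch, not code from this PR. It assumes GCC or Clang on x86-64, and the names `add_avx`, `add_scalar`, `add`, and `SKETCH_COMPILER_HAS_AVX` are hypothetical. The AVX kernel only requires that the compiler can emit AVX instructions (via the `target("avx")` function attribute), while the decision to execute it is made at run time with `__builtin_cpu_supports`, so the build machine itself never has to run an AVX instruction.

```cpp
#include <cstddef>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define SKETCH_COMPILER_HAS_AVX 1  // hypothetical macro: compiler can emit AVX

// AVX is enabled for this function only; the rest of the translation unit
// keeps the baseline instruction set, so it stays safe on older CPUs.
__attribute__((target("avx")))
static void add_avx(const float* a, const float* b, float* out, std::size_t n) {
  std::size_t i = 0;
  for (; i + 8 <= n; i += 8) {
    __m256 va = _mm256_loadu_ps(a + i);
    __m256 vb = _mm256_loadu_ps(b + i);
    _mm256_storeu_ps(out + i, _mm256_add_ps(va, vb));
  }
  for (; i < n; ++i) out[i] = a[i] + b[i];  // remaining elements
}
#endif

// Portable fallback that runs on any CPU.
static void add_scalar(const float* a, const float* b, float* out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];
}

// Dispatcher: compiler support was decided at build time (the #ifdef),
// CPU support is decided here at run time.
void add(const float* a, const float* b, float* out, std::size_t n) {
#ifdef SKETCH_COMPILER_HAS_AVX
  if (__builtin_cpu_supports("avx")) {
    add_avx(a, b, out, n);
    return;
  }
#endif
  add_scalar(a, b, out, n);
}
```

With this split, a build-time check only has to answer "can the compiler generate AVX code?", which is the question the suggestion above asks CMake to answer.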

colesbury

Can you change the message "AVX found" in cmake/Dependencies.cmake to something like "AVX compiler support found"?

cpuhrsch · 2018-09-26T21:21:24Z

@colesbury in addition to "COMPILER_SUPPORTS_AVX2" or "CXX_HAS_AVX2_2" or "CAFFE2_COMPILER_SUPPORTS_AVX2_EXTENSIONS" ;)

facebook-github-bot

colesbury has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Previously, we were only enabling Flush-To-Zero (FTZ) and Denormals-Are-Zero (DAZ) when compiling with SSE3 enabled. After Christian's patch (pytorch#12109), we won't be compiling core files with SSE3 or SSE4 enabled, to better support older AMD processors. This moves the FTZ and DAZ code behind a runtime CPU check in preparation for that change.

Summary: Previously, we were only enabling Flush-To-Zero (FTZ) and Denormals-Are-Zero (DAZ) when compiling with SSE3 enabled. After Christian's patch (#12109), we won't be compiling core files with SSE3 or SSE4 enabled, to better support older AMD processors. This moves the FTZ and DAZ code behind a runtime CPU check in preparation for that change.

Pull Request resolved: #12386
Differential Revision: D10222237
Pulled By: colesbury
fbshipit-source-id: 7ffe32561ab965e1e5f9eb6e679602bbf4775538
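The idea of guarding FTZ/DAZ with a runtime CPU check rather than a compile-time SSE3 flag can be sketched as follows. This is a hedged illustration, not the implementation from #12386: the helper name `try_set_flush_denormal` and the macro `SKETCH_X86_64` are hypothetical, the sketch assumes GCC or Clang on x86-64, and it uses SSE3 availability as a stand-in for DAZ support, which is an assumption of this example.

```cpp
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <xmmintrin.h>  // _mm_getcsr, _mm_setcsr
#define SKETCH_X86_64 1  // hypothetical macro for this sketch
#endif

// Hypothetical helper: turns FTZ and DAZ on or off for the calling thread.
// Returns true if the mode was applied, false if the CPU or platform is
// not supported by this sketch.
bool try_set_flush_denormal(bool on) {
#ifdef SKETCH_X86_64
  // Runtime check replaces a compile-time "#ifdef __SSE3__" guard, so the
  // file itself can be built without SSE3 enabled.
  // Assumption: SSE3 support is used here as a proxy for DAZ support.
  if (!__builtin_cpu_supports("sse3")) {
    return false;
  }
  const unsigned int kFlushToZero = 0x8000;       // MXCSR bit 15 (FTZ)
  const unsigned int kDenormalsAreZero = 0x0040;  // MXCSR bit 6 (DAZ)
  unsigned int csr = _mm_getcsr();
  if (on) {
    csr |= (kFlushToZero | kDenormalsAreZero);
  } else {
    csr &= ~(kFlushToZero | kDenormalsAreZero);
  }
  _mm_setcsr(csr);
  return true;
#else
  (void)on;
  return false;
#endif
}
```

MXCSR is per-thread state, so a caller would need to invoke such a helper on each thread whose floating-point behavior should change.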

Summary: Performance-oriented code will use AVX/AVX2, so we don't need SSE-specific code anymore. This will also reduce the probability of running into an error on legacy CPUs. On top of this, convolve is covered by modern libraries such as MKLDNN, which are much more performant and which we now build against by default (even for builds from source).

Pull Request resolved: pytorch/pytorch#12109
Differential Revision: D10055134
Pulled By: colesbury
fbshipit-source-id: 789b8a34d5936d9c144bcde410c30f7eb1c776fa

cpuhrsch requested review from apaszke, colesbury, ezyang, gchanan, soumith and zdevito as code owners September 26, 2018 18:49

cpuhrsch changed the title ~~Getting ride of SSE-only code~~ [WIP] Getting ride of SSE-only code Sep 26, 2018

cpuhrsch changed the title ~~[WIP] Getting ride of SSE-only code~~ [WIP] Getting ride of SSE-only code and convolve5x5 Sep 26, 2018

cpuhrsch force-pushed the nonvolve1 branch from 650cd96 to 32b3fda Compare September 26, 2018 19:41

cpuhrsch force-pushed the nonvolve1 branch from 32b3fda to 7cccb20 Compare September 26, 2018 20:10

facebook-github-bot reviewed Sep 26, 2018

colesbury reviewed Sep 26, 2018

cpuhrsch force-pushed the nonvolve1 branch from 7cccb20 to f44d5e3 Compare September 26, 2018 21:06

colesbury approved these changes Sep 26, 2018

cpuhrsch mentioned this pull request Sep 26, 2018

massive test failures with SIGILL on old P6200 CPU with pytorch-0.4.1 built from source #11988

Closed

Getting ride of SSE-only code and convolve5x5

891a3f5

cpuhrsch force-pushed the nonvolve1 branch from f44d5e3 to 891a3f5 Compare September 26, 2018 21:30

facebook-github-bot reviewed Sep 27, 2018

colesbury added 2 commits October 5, 2018 08:25

Merge branch 'master' into nonvolve1

6e86b59

Update AVX message

22924d6

facebook-github-bot reviewed Oct 5, 2018

colesbury changed the title ~~[WIP] Getting ride of SSE-only code and convolve5x5~~ Remove SSE-only code and convolve5x5 Oct 5, 2018

colesbury mentioned this pull request Oct 5, 2018

Illegal instruction (core dumped) for Cuda in 1.0.0.dev #12300

Closed

colesbury mentioned this pull request Oct 5, 2018

Guard Denormals-Are-Zero with runtime CPU check #12386

Closed

Merge branch 'master' into nonvolve1

a39de8e

facebook-github-bot reviewed Oct 8, 2018

Remove dead code

9df4f18

facebook-github-bot reviewed Oct 8, 2018

facebook-github-bot closed this in f564163 Oct 9, 2018

ezyang added the merged label Jun 26, 2019

cpuhrsch wants to merge 5 commits into pytorch:master from cpuhrsch:nonvolve1

4 participants
