ATen Unary Ops #6030
a4d60ce to 7df4b46
Running the perf test 10x longer (10x more counts).
Nice speed-ups!
The GPU perf test result can be ignored (the margin of error was too small).
e83934a to 9e9af7c
Implements a few unary operations for which there are AVX intrinsics. The perf comparison script is here: https://paste.fedoraproject.org/paste/f1adcJhpGtzDNWImS34XzQ
EDIT: Appended the last column of "This branch" timings for readability
EDIT: Removed clone from this PR; it will become part of a larger effort around copy
EDIT: I will also add log, exp, cos, sin to this PR
EDIT: Strided case and log, exp, cos, sin will be handled in another PR
EDIT: Includes better formatting
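For readers unfamiliar with the approach, here is a minimal sketch of what a contiguous unary op built on an AVX intrinsic can look like. It is not the code from this PR: `sqrt` is picked only as an illustration, the names `sqrt_contiguous`, `sqrt_avx`, and `sqrt_scalar` are hypothetical, and the sketch assumes GCC or Clang (it relies on `__attribute__((target("avx")))` and `__builtin_cpu_supports`). It covers only the contiguous case and falls back to a scalar loop when AVX is unavailable.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Only pull in the x86 intrinsics header on x86 targets.
#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#define SKETCH_HAS_X86 1
#else
#define SKETCH_HAS_X86 0
#endif

// Scalar reference path: out[i] = sqrt(in[i]).
static void sqrt_scalar(const float* in, float* out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    out[i] = std::sqrt(in[i]);
  }
}

#if SKETCH_HAS_X86
// AVX path: processes 8 floats per iteration with unaligned loads and
// stores, then finishes the remaining (n % 8) elements with scalar code.
// The target attribute lets this compile without passing -mavx globally.
__attribute__((target("avx")))
static void sqrt_avx(const float* in, float* out, std::size_t n) {
  std::size_t i = 0;
  for (; i + 8 <= n; i += 8) {
    __m256 v = _mm256_loadu_ps(in + i);
    _mm256_storeu_ps(out + i, _mm256_sqrt_ps(v));
  }
  for (; i < n; ++i) {
    out[i] = std::sqrt(in[i]);
  }
}
#endif

// Hypothetical entry point for a contiguous float buffer: picks the AVX
// kernel at runtime if the CPU supports it, otherwise the scalar loop.
void sqrt_contiguous(const float* in, float* out, std::size_t n) {
  assert(in != nullptr && out != nullptr);
#if SKETCH_HAS_X86
  if (__builtin_cpu_supports("avx")) {
    sqrt_avx(in, out, n);
    return;
  }
#endif
  sqrt_scalar(in, out, n);
}
```

Calling `sqrt_contiguous(in, out, n)` on two non-overlapping `float` buffers of length `n` fills `out` with the element-wise square roots. A strided tensor would need a different loop structure, which this sketch deliberately leaves out.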
Command - single core
Compiled with gcc 5.4.0 for Python 2.7
Master
Command - manycore (20 cores)
Master