ARROW-9331: [C++] Improve the performance of Tensor-to-SparseTensor conversion #7643
Conversation
@mrkn would you like someone to look at this?
@wesm Yes, I've currently almost finished the work for SparseCOOTensor. I think merging this before finishing all the sparse formats is better than having nothing merged before 1.0.
This change improves the conversion speed in all cases for row-major and column-major tensors. For strided tensors, all cases are improved except the combinations of an int16 value type with an index type of 32 bits or smaller. The result of the `archery benchmark diff` command is below; the baseline is commit 8f96d1d (before merging apache#7539) and the contender is this commit:

```
    benchmark                                                                      baseline       contender      change %
43  Int16StridedTensorConversionFixture<Int32Type>/ConvertToSparseCOOTensorInt32   141564.498765  182313.374077   28.785
10  Int16StridedTensorConversionFixture<Int16Type>/ConvertToSparseCOOTensorInt16   140420.265077  153715.618715    9.468
42  Int16StridedTensorConversionFixture<Int8Type>/ConvertToSparseCOOTensorInt8     167601.944005  170626.538009    1.805
37  Int16StridedTensorConversionFixture<Int64Type>/ConvertToSparseCOOTensorInt64   143722.048451  141928.779114   -1.248
27  Int8StridedTensorConversionFixture<Int16Type>/ConvertToSparseCOOTensorInt16    169947.630903  164423.055008   -3.251
24  Int8StridedTensorConversionFixture<Int8Type>/ConvertToSparseCOOTensorInt8      170153.324442  163898.373534   -3.676
45  Int8StridedTensorConversionFixture<Int32Type>/ConvertToSparseCOOTensorInt32    170883.542468  164131.618700   -3.951
35  Int8StridedTensorConversionFixture<Int64Type>/ConvertToSparseCOOTensorInt64    171015.028153  163516.191034   -4.385
9   DoubleStridedTensorConversionFixture<Int8Type>/ConvertToSparseCOOTensorInt8    200974.675587  191956.688874   -4.487
18  FloatStridedTensorConversionFixture<Int8Type>/ConvertToSparseCOOTensorInt8     192320.819787  182941.130595   -4.877
12  DoubleStridedTensorConversionFixture<Int64Type>/ConvertToSparseCOOTensorInt64  175198.892973  166417.452194   -5.012
30  FloatStridedTensorConversionFixture<Int32Type>/ConvertToSparseCOOTensorInt32   167174.764713  151431.022906   -9.418
29  DoubleStridedTensorConversionFixture<Int16Type>/ConvertToSparseCOOTensorInt16  173925.990981  157142.110096   -9.650
16  FloatStridedTensorConversionFixture<Int16Type>/ConvertToSparseCOOTensorInt16   167877.497573  151666.610814   -9.656
26  FloatStridedTensorConversionFixture<Int64Type>/ConvertToSparseCOOTensorInt64   169705.312801  151885.952803  -10.500
6   DoubleStridedTensorConversionFixture<Int32Type>/ConvertToSparseCOOTensorInt32  177394.661870  156019.301906  -12.050
5   Int16RowMajorTensorConversionFixture<Int16Type>/ConvertToSparseCOOTensorInt16  107592.839089   66069.770737  -38.593
41  Int16ColumnMajorTensorConversionFixture<Int16Type>/ConvertToSparseCOOTensorInt16  114841.700196  68707.073774  -40.172
47  Int16RowMajorTensorConversionFixture<Int32Type>/ConvertToSparseCOOTensorInt32  107304.436017   63922.898636  -40.428
4   FloatRowMajorTensorConversionFixture<Int16Type>/ConvertToSparseCOOTensorInt16  112315.965200   66577.854744  -40.723
21  Int16ColumnMajorTensorConversionFixture<Int32Type>/ConvertToSparseCOOTensorInt32  115090.317912  66527.852021  -42.195
17  FloatColumnMajorTensorConversionFixture<Int16Type>/ConvertToSparseCOOTensorInt16  121583.540341  70025.614174  -42.405
3   DoubleRowMajorTensorConversionFixture<Int16Type>/ConvertToSparseCOOTensorInt16  116946.572632   66411.338694  -43.212
15  FloatRowMajorTensorConversionFixture<Int32Type>/ConvertToSparseCOOTensorInt32  112275.805149   63264.226406  -43.653
13  FloatColumnMajorTensorConversionFixture<Int32Type>/ConvertToSparseCOOTensorInt32  122085.596559  66569.027159  -45.473
34  Int16RowMajorTensorConversionFixture<Int64Type>/ConvertToSparseCOOTensorInt64  109888.801628   58860.826009  -46.436
20  Int16ColumnMajorTensorConversionFixture<Int64Type>/ConvertToSparseCOOTensorInt64  117648.480324  62574.709433  -46.812
19  Int8ColumnMajorTensorConversionFixture<Int8Type>/ConvertToSparseCOOTensorInt8  137444.576787   71969.132261  -47.638
28  DoubleRowMajorTensorConversionFixture<Int64Type>/ConvertToSparseCOOTensorInt64  119527.435615   61405.371141  -48.627
40  FloatRowMajorTensorConversionFixture<Int64Type>/ConvertToSparseCOOTensorInt64  115130.821188   58664.779831  -49.045
39  Int8ColumnMajorTensorConversionFixture<Int16Type>/ConvertToSparseCOOTensorInt16  137053.503574  69755.112894  -49.104
22  Int8RowMajorTensorConversionFixture<Int8Type>/ConvertToSparseCOOTensorInt8     136645.576795   69303.266896  -49.282
23  FloatColumnMajorTensorConversionFixture<Int64Type>/ConvertToSparseCOOTensorInt64  124100.575779  61723.051518  -50.264
31  DoubleColumnMajorTensorConversionFixture<Int16Type>/ConvertToSparseCOOTensorInt16  140278.467902  69584.530347  -50.395
1   Int16RowMajorTensorConversionFixture<Int8Type>/ConvertToSparseCOOTensorInt8    135770.669563   67151.922438  -50.540
44  Int16ColumnMajorTensorConversionFixture<Int8Type>/ConvertToSparseCOOTensorInt8  142625.928542   70315.759868  -50.699
2   Int8ColumnMajorTensorConversionFixture<Int32Type>/ConvertToSparseCOOTensorInt32  137443.030096  67752.813535  -50.705
46  Int8RowMajorTensorConversionFixture<Int16Type>/ConvertToSparseCOOTensorInt16   135961.160225   66613.351871  -51.006
11  DoubleColumnMajorTensorConversionFixture<Int32Type>/ConvertToSparseCOOTensorInt32  138857.793332  67315.714410  -51.522
8   FloatRowMajorTensorConversionFixture<Int8Type>/ConvertToSparseCOOTensorInt8    138992.703542   66847.061004  -51.906
7   Int8RowMajorTensorConversionFixture<Int32Type>/ConvertToSparseCOOTensorInt32   136298.424804   64520.497064  -52.662
36  FloatColumnMajorTensorConversionFixture<Int8Type>/ConvertToSparseCOOTensorInt8  149706.883716   69805.958679  -53.372
33  DoubleRowMajorTensorConversionFixture<Int8Type>/ConvertToSparseCOOTensorInt8   143460.582904   66870.585026  -53.387
38  DoubleColumnMajorTensorConversionFixture<Int64Type>/ConvertToSparseCOOTensorInt64  138220.367601  64425.776453  -53.389
14  DoubleRowMajorTensorConversionFixture<Int32Type>/ConvertToSparseCOOTensorInt32  136707.421042   63624.050357  -53.460
25  Int8ColumnMajorTensorConversionFixture<Int64Type>/ConvertToSparseCOOTensorInt64  137303.219403   62528.740787  -54.459
32  Int8RowMajorTensorConversionFixture<Int64Type>/ConvertToSparseCOOTensorInt64   136551.052565   58743.141699  -56.981
0   DoubleColumnMajorTensorConversionFixture<Int8Type>/ConvertToSparseCOOTensorInt8  162895.437265   69676.279783  -57.226
```
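For readers unfamiliar with the operation being benchmarked: Tensor-to-SparseCOOTensor conversion scans a dense buffer and records the multidimensional index and value of every non-zero element. A minimal standalone sketch of that idea (this is illustrative only, not Arrow's actual API or implementation; the `CooMatrix`/`DenseToCoo` names are hypothetical):

```cpp
#include <cstdint>
#include <vector>

// Hypothetical COO representation: parallel arrays of row indices,
// column indices, and the non-zero values themselves.
struct CooMatrix {
  std::vector<int64_t> row_indices;
  std::vector<int64_t> col_indices;
  std::vector<double> values;
};

// Convert a dense row-major matrix into COO form by scanning every
// element and keeping only the non-zeros.
CooMatrix DenseToCoo(const std::vector<double>& dense, int64_t nrows,
                     int64_t ncols) {
  CooMatrix out;
  for (int64_t i = 0; i < nrows; ++i) {
    for (int64_t j = 0; j < ncols; ++j) {
      double v = dense[i * ncols + j];  // row-major addressing
      if (v != 0.0) {
        out.row_indices.push_back(i);
        out.col_indices.push_back(j);
        out.values.push_back(v);
      }
    }
  }
  return out;
}
```

The benchmark fixtures above vary the value type (`Int8` through `Double`), the index type (`Int8Type` through `Int64Type`), and the memory layout (row-major, column-major, strided), which is why the scan order and index arithmetic of this loop are the performance-sensitive parts.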
@wesm I'm working on fixing the problems with VC++. Please wait a moment.
wesm
left a comment
+1, benchmarks look good
https://gist.github.com/wesm/31425511bc6b787d9acc44b82127397f
I also checked that this doesn't significantly affect the binary size.
This pull request cancels the conversion slowdown introduced in #7539, and in some cases the conversion speed is now better than it was before #7539.
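One common source of slowdown in this kind of conversion is recovering each element's multidimensional index with a division and modulo per element. A hedged sketch of an alternative that such an optimization might use (illustrative only; `NonZeroIndices` is a hypothetical name, not a function in Arrow): carry an index vector and increment it like an odometer while scanning the dense buffer in row-major order, so no division is needed.

```cpp
#include <cstdint>
#include <vector>

// Collect the multidimensional indices of all non-zero elements of a
// row-major dense buffer, without any per-element division/modulo.
std::vector<std::vector<int64_t>> NonZeroIndices(
    const std::vector<double>& dense, const std::vector<int64_t>& shape) {
  std::vector<std::vector<int64_t>> result;
  std::vector<int64_t> index(shape.size(), 0);
  for (double v : dense) {
    if (v != 0.0) result.push_back(index);
    // Odometer increment: bump the last coordinate; on overflow, reset
    // it to zero and carry into the next-outer dimension.
    for (int64_t d = static_cast<int64_t>(shape.size()) - 1; d >= 0; --d) {
      if (++index[d] < shape[d]) break;
      index[d] = 0;
    }
  }
  return result;
}
```

The trade-off is one small inner loop per element (which usually terminates after a single iteration) instead of one `div`/`mod` pair per dimension per element; for strided or column-major layouts the increment order would have to follow the strides instead.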