ARROW-9331: [C++] Improve the performance of Tensor-to-SparseTensor conversion #7643
Conversation
@mrkn would you like someone to look at this?
@wesm Yes, I've currently almost finished the work for SparseCOOTensor. I think merging this before finishing all the sparse formats is better than having nothing merged before 1.0.
This change improves the conversion speed in all cases for row-major and column-major tensors. For strided tensors, all cases are improved except the combinations of an int16 value type with an index type of 32 bits or smaller. The result of the `archery benchmark diff` command is below; the baseline is commit 8f96d1d (before merging apache#7539) and the contender is this commit:

```
    benchmark                                                                      baseline       contender      change %
43  Int16StridedTensorConversionFixture<Int32Type>/ConvertToSparseCOOTensorInt32   141564.498765  182313.374077   28.785
10  Int16StridedTensorConversionFixture<Int16Type>/ConvertToSparseCOOTensorInt16   140420.265077  153715.618715    9.468
42  Int16StridedTensorConversionFixture<Int8Type>/ConvertToSparseCOOTensorInt8     167601.944005  170626.538009    1.805
37  Int16StridedTensorConversionFixture<Int64Type>/ConvertToSparseCOOTensorInt64   143722.048451  141928.779114   -1.248
27  Int8StridedTensorConversionFixture<Int16Type>/ConvertToSparseCOOTensorInt16    169947.630903  164423.055008   -3.251
24  Int8StridedTensorConversionFixture<Int8Type>/ConvertToSparseCOOTensorInt8      170153.324442  163898.373534   -3.676
45  Int8StridedTensorConversionFixture<Int32Type>/ConvertToSparseCOOTensorInt32    170883.542468  164131.618700   -3.951
35  Int8StridedTensorConversionFixture<Int64Type>/ConvertToSparseCOOTensorInt64    171015.028153  163516.191034   -4.385
9   DoubleStridedTensorConversionFixture<Int8Type>/ConvertToSparseCOOTensorInt8    200974.675587  191956.688874   -4.487
18  FloatStridedTensorConversionFixture<Int8Type>/ConvertToSparseCOOTensorInt8     192320.819787  182941.130595   -4.877
12  DoubleStridedTensorConversionFixture<Int64Type>/ConvertToSparseCOOTensorInt64  175198.892973  166417.452194   -5.012
30  FloatStridedTensorConversionFixture<Int32Type>/ConvertToSparseCOOTensorInt32   167174.764713  151431.022906   -9.418
29  DoubleStridedTensorConversionFixture<Int16Type>/ConvertToSparseCOOTensorInt16  173925.990981  157142.110096   -9.650
16  FloatStridedTensorConversionFixture<Int16Type>/ConvertToSparseCOOTensorInt16   167877.497573  151666.610814   -9.656
26  FloatStridedTensorConversionFixture<Int64Type>/ConvertToSparseCOOTensorInt64   169705.312801  151885.952803  -10.500
6   DoubleStridedTensorConversionFixture<Int32Type>/ConvertToSparseCOOTensorInt32  177394.661870  156019.301906  -12.050
5   Int16RowMajorTensorConversionFixture<Int16Type>/ConvertToSparseCOOTensorInt16  107592.839089   66069.770737  -38.593
41  Int16ColumnMajorTensorConversionFixture<Int16Type>/ConvertToSparseCOOTensorInt16  114841.700196  68707.073774  -40.172
47  Int16RowMajorTensorConversionFixture<Int32Type>/ConvertToSparseCOOTensorInt32  107304.436017   63922.898636  -40.428
4   FloatRowMajorTensorConversionFixture<Int16Type>/ConvertToSparseCOOTensorInt16  112315.965200   66577.854744  -40.723
21  Int16ColumnMajorTensorConversionFixture<Int32Type>/ConvertToSparseCOOTensorInt32  115090.317912  66527.852021  -42.195
17  FloatColumnMajorTensorConversionFixture<Int16Type>/ConvertToSparseCOOTensorInt16  121583.540341  70025.614174  -42.405
3   DoubleRowMajorTensorConversionFixture<Int16Type>/ConvertToSparseCOOTensorInt16  116946.572632   66411.338694  -43.212
15  FloatRowMajorTensorConversionFixture<Int32Type>/ConvertToSparseCOOTensorInt32  112275.805149   63264.226406  -43.653
13  FloatColumnMajorTensorConversionFixture<Int32Type>/ConvertToSparseCOOTensorInt32  122085.596559  66569.027159  -45.473
34  Int16RowMajorTensorConversionFixture<Int64Type>/ConvertToSparseCOOTensorInt64  109888.801628   58860.826009  -46.436
20  Int16ColumnMajorTensorConversionFixture<Int64Type>/ConvertToSparseCOOTensorInt64  117648.480324  62574.709433  -46.812
19  Int8ColumnMajorTensorConversionFixture<Int8Type>/ConvertToSparseCOOTensorInt8  137444.576787   71969.132261  -47.638
28  DoubleRowMajorTensorConversionFixture<Int64Type>/ConvertToSparseCOOTensorInt64  119527.435615   61405.371141  -48.627
40  FloatRowMajorTensorConversionFixture<Int64Type>/ConvertToSparseCOOTensorInt64  115130.821188   58664.779831  -49.045
39  Int8ColumnMajorTensorConversionFixture<Int16Type>/ConvertToSparseCOOTensorInt16  137053.503574  69755.112894  -49.104
22  Int8RowMajorTensorConversionFixture<Int8Type>/ConvertToSparseCOOTensorInt8     136645.576795   69303.266896  -49.282
23  FloatColumnMajorTensorConversionFixture<Int64Type>/ConvertToSparseCOOTensorInt64  124100.575779  61723.051518  -50.264
31  DoubleColumnMajorTensorConversionFixture<Int16Type>/ConvertToSparseCOOTensorInt16  140278.467902  69584.530347  -50.395
1   Int16RowMajorTensorConversionFixture<Int8Type>/ConvertToSparseCOOTensorInt8    135770.669563   67151.922438  -50.540
44  Int16ColumnMajorTensorConversionFixture<Int8Type>/ConvertToSparseCOOTensorInt8  142625.928542   70315.759868  -50.699
2   Int8ColumnMajorTensorConversionFixture<Int32Type>/ConvertToSparseCOOTensorInt32  137443.030096  67752.813535  -50.705
46  Int8RowMajorTensorConversionFixture<Int16Type>/ConvertToSparseCOOTensorInt16   135961.160225   66613.351871  -51.006
11  DoubleColumnMajorTensorConversionFixture<Int32Type>/ConvertToSparseCOOTensorInt32  138857.793332  67315.714410  -51.522
8   FloatRowMajorTensorConversionFixture<Int8Type>/ConvertToSparseCOOTensorInt8    138992.703542   66847.061004  -51.906
7   Int8RowMajorTensorConversionFixture<Int32Type>/ConvertToSparseCOOTensorInt32   136298.424804   64520.497064  -52.662
36  FloatColumnMajorTensorConversionFixture<Int8Type>/ConvertToSparseCOOTensorInt8  149706.883716   69805.958679  -53.372
33  DoubleRowMajorTensorConversionFixture<Int8Type>/ConvertToSparseCOOTensorInt8   143460.582904   66870.585026  -53.387
38  DoubleColumnMajorTensorConversionFixture<Int64Type>/ConvertToSparseCOOTensorInt64  138220.367601  64425.776453  -53.389
14  DoubleRowMajorTensorConversionFixture<Int32Type>/ConvertToSparseCOOTensorInt32  136707.421042   63624.050357  -53.460
25  Int8ColumnMajorTensorConversionFixture<Int64Type>/ConvertToSparseCOOTensorInt64  137303.219403   62528.740787  -54.459
32  Int8RowMajorTensorConversionFixture<Int64Type>/ConvertToSparseCOOTensorInt64   136551.052565   58743.141699  -56.981
0   DoubleColumnMajorTensorConversionFixture<Int8Type>/ConvertToSparseCOOTensorInt8  162895.437265   69676.279783  -57.226
```
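For readers unfamiliar with the operation being benchmarked: Tensor-to-SparseCOOTensor conversion scans a dense buffer and records the multidimensional index and value of every non-zero element. A minimal standalone sketch of that idea (this is illustrative only, not Arrow's actual API or implementation; the `CooMatrix`/`DenseToCoo` names are hypothetical):

```cpp
#include <cstdint>
#include <vector>

// Hypothetical COO representation: parallel arrays of row indices,
// column indices, and the non-zero values themselves.
struct CooMatrix {
  std::vector<int64_t> row_indices;
  std::vector<int64_t> col_indices;
  std::vector<double> values;
};

// Convert a dense row-major matrix into COO form by scanning every
// element and keeping only the non-zeros.
CooMatrix DenseToCoo(const std::vector<double>& dense, int64_t nrows,
                     int64_t ncols) {
  CooMatrix out;
  for (int64_t i = 0; i < nrows; ++i) {
    for (int64_t j = 0; j < ncols; ++j) {
      double v = dense[i * ncols + j];  // row-major addressing
      if (v != 0.0) {
        out.row_indices.push_back(i);
        out.col_indices.push_back(j);
        out.values.push_back(v);
      }
    }
  }
  return out;
}
```

The benchmark fixtures above vary the value type (`Int8` through `Double`), the index type (`Int8Type` through `Int64Type`), and the memory layout (row-major, column-major, strided), which is why the scan order and index arithmetic of this loop are the performance-sensitive parts.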
@wesm I'm working on fixing the problems with VC++. Please wait a moment.
wesm
left a comment
+1, benchmarks look good
https://gist.github.com/wesm/31425511bc6b787d9acc44b82127397f
I also checked that this doesn't significantly affect the binary size.
This pull request cancels the conversion slowdown introduced in #7539, and in some cases the conversion speed is now better than it was before #7539.
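One common source of slowdown in this kind of conversion is recovering each element's multidimensional index with a division and modulo per element. A hedged sketch of an alternative that such an optimization might use (illustrative only; `NonZeroIndices` is a hypothetical name, not a function in Arrow): carry an index vector and increment it like an odometer while scanning the dense buffer in row-major order, so no division is needed.

```cpp
#include <cstdint>
#include <vector>

// Collect the multidimensional indices of all non-zero elements of a
// row-major dense buffer, without any per-element division/modulo.
std::vector<std::vector<int64_t>> NonZeroIndices(
    const std::vector<double>& dense, const std::vector<int64_t>& shape) {
  std::vector<std::vector<int64_t>> result;
  std::vector<int64_t> index(shape.size(), 0);
  for (double v : dense) {
    if (v != 0.0) result.push_back(index);
    // Odometer increment: bump the last coordinate; on overflow, reset
    // it to zero and carry into the next-outer dimension.
    for (int64_t d = static_cast<int64_t>(shape.size()) - 1; d >= 0; --d) {
      if (++index[d] < shape[d]) break;
      index[d] = 0;
    }
  }
  return result;
}
```

The trade-off is one small inner loop per element (which usually terminates after a single iteration) instead of one `div`/`mod` pair per dimension per element; for strided or column-major layouts the increment order would have to follow the strides instead.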