Sparse benchmarking utils #48397
💊 CI failures summary and remediations: as of commit ed680bd (more details on the Dr. CI page):
❄️ 1 failure tentatively classified as flaky, but reruns have not yet been triggered to confirm.
Codecov Report

```
@@            Coverage Diff             @@
##           master   #48397      +/-   ##
==========================================
- Coverage   80.77%   80.69%   -0.08%
==========================================
  Files        1867     1860       -7
  Lines      201583   200202    -1381
==========================================
- Hits       162819   161558    -1261
+ Misses      38764    38644     -120
```
ngimel left a comment
I feel like the reusable parts of this PR are BinaryOpSparseFuzzer and UnaryOpSparseFuzzer; compare.py and fuzzer.py don't really add context compared to their non-sparse counterparts. For those, you probably want to specify densities rather than nnzs, though.
If you want to use fuzzers to benchmark matmuls, you will probably need to create a matmul sparse fuzzer. In the near term, since we are probably not going to focus on benchmarking unary and binary ops, working on the infrastructure for matmuls is more valuable.
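(For illustration only: a minimal sketch of deriving `nnz` from a target density when constructing a COO tensor. This is not the PR's `FuzzedSparseTensor` implementation; the helper name and defaults below are made up.)

```python
import torch

def random_sparse(size, density, coalesced=True, dtype=torch.float32):
    """Hypothetical helper: build a COO tensor whose nnz follows a target density."""
    numel = 1
    for s in size:
        numel *= s
    nnz = max(1, int(numel * density))
    indices = torch.stack([torch.randint(0, s, (nnz,)) for s in size])
    values = torch.randn(nnz, dtype=dtype)
    t = torch.sparse_coo_tensor(indices, values, size)
    # coalesce() merges duplicate indices, so the realized nnz can fall
    # slightly below the target, especially at high densities.
    return t.coalesce() if coalesced else t

x = random_sparse((1024, 1024), density=0.01)
print(x._nnz(), x.is_coalesced())
```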
It looks like compare.py and fuzzer.py are not adding a lot to the existing dense examples of the same name.
fuzzer.py is just an example of how to use the new FuzzedSparseTensor. On the other hand, compare.py is an example of how to benchmark the matmul operation over sparse matrices. A more complete benchmark reference for matmul operations was submitted previously in a different PR: https://github.com/pytorch/pytorch/blob/master/benchmarks/sparse/matmul_dlmc_bench.py. That benchmark uses a popular, real sparse matrix dataset: https://github.com/google-research/google-research/tree/master/sgk.
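(For context, and assuming nothing about compare.py's actual contents: the kind of measurement it refers to can be sketched with `torch.utils.benchmark.Timer`, e.g. timing a sparse-dense matmul against its dense counterpart.)

```python
import torch
from torch.utils.benchmark import Timer

d = torch.randn(1024, 1024)
s = d.clone()
s[torch.rand_like(s) < 0.9] = 0          # roughly 90% of entries zeroed out
sp = s.to_sparse().coalesce()

dense = Timer("d @ d", globals={"d": d}, description="dense@dense")
sparse = Timer("torch.sparse.mm(sp, d)",
               globals={"torch": torch, "sp": sp, "d": d},
               description="sparse@dense")

print(dense.blocked_autorange())
print(sparse.blocked_autorange())
```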
I addressed the review comments.
Regarding this request: I previously worked on sparse matmul benchmarks using a real sparse matrix dataset. Do you think it is necessary to implement a matmul sparse fuzzer with a synthetic dataset instead of using a real dataset?
A real dataset would be great!
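(Side note on the synthetic option under discussion: scipy can generate matrices at a prescribed density, which can then be converted to torch sparse COO tensors. A minimal sketch, not tied to any file in this PR:)

```python
import numpy as np
import scipy.sparse
import torch

# Generate a synthetic sparse matrix at a prescribed density (scipy COO format).
m = scipy.sparse.random(1024, 1024, density=0.05, format="coo", dtype=np.float32)

# Convert it to a torch sparse COO tensor.
indices = torch.tensor(np.vstack([m.row, m.col]), dtype=torch.int64)
t = torch.sparse_coo_tensor(indices, torch.from_numpy(m.data), m.shape).coalesce()
print(t._nnz())
```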
facebook-github-bot left a comment
@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary: Minor PR following up the previous PR about sparse benchmarking utils #48397.

Fixes #44634: performance benchmarks for matrix-matrix and matrix-vector ops (dense-sparse, sparse-sparse, and a comparison to dense-dense).

I ran all benchmarks on a 2xRTX8000 machine with an AMD 2970WX 24-core CPU, using the `DLMC/magnitude_pruning` dataset at different sparsity levels.

<details><summary> forward tests (expand for details) </summary>

- `sparse@sparse`

```
[------------------------------- cpu:matmul-forward -------------------------------]
                      |   0.5  |   0.7  |   0.8  |   0.9  |  0.95  |  0.98
1 threads: -------------------------------------------------------------------------
  torch:dense@dense   |  108.1 |  100.5 |  101.3 |  108.4 |   98.4 |  187.4
  torch:sparse@sparse |  659.1 |  368.8 |  156.5 |   53.3 |   26.8 |   14.9
  scipy:sparse@sparse |  565.1 |  233.9 |  130.2 |   23.1 |   21.6 |   15.2

Times are in milliseconds (ms).

[----------------------------------- cuda:matmul-forward -----------------------------------]
                      |   0.5   |   0.7   |   0.8  |   0.9  |  0.95  |  0.98
1 threads: ----------------------------------------------------------------------------------
  torch:dense@dense   |  2243.5 |  4392.5 | 4419.8 | 2272.3 | 4433.9 | 8920.1
  torch:sparse@sparse | 21369.2 | 11877.6 | 7339.2 | 1787.2 | 1335.1 |  845.7

Times are in microseconds (us).
```

- `sparse@dense`

```
[------------------------------- cpu:matmul-forward -------------------------------]
                      |   0.5  |   0.7  |   0.8  |   0.9  |  0.95  |  0.98
1 threads: -------------------------------------------------------------------------
  torch:dense@dense   |  105.8 |  103.8 |  103.0 |  104.4 |  104.4 |  197.0
  torch:sparse@dense  |  119.9 |  102.4 |   84.0 |   19.7 |   16.8 |   11.6
  scipy:sparse@dense  |  906.5 |  799.6 |  697.8 |  182.2 |  165.5 |  135.4

Times are in milliseconds (ms).

[------------------------- cuda:matmul-forward --------------------------]
                      |  0.5 |  0.7 |  0.8 |  0.9 | 0.95 | 0.98
1 threads: ---------------------------------------------------------------
  torch:dense@dense   |  2.2 |  4.4 |  4.4 |  2.3 |  4.5 |  2.3
  torch:sparse@dense  |  5.7 |  6.6 |  4.5 |  1.4 |  1.4 |  1.3

Times are in milliseconds (ms).
```

- `sparse@vector`

```
[----------------------------------- cpu:matmul-forward ----------------------------------]
                      |   0.5   |   0.7  |   0.8  |   0.9  |  0.95  |  0.98
1 threads: --------------------------------------------------------------------------------
  torch:dense@vector  |   510.6 |  505.8 |  759.6 |  782.1 |  682.4 |  764.6
  torch:sparse@vector | 10122.8 | 6241.1 | 7935.6 | 2076.3 | 1049.5 |  826.3
  scipy:sparse@vector |  1756.7 | 1033.9 |  678.2 |  343.5 |  168.5 |   65.4

Times are in microseconds (us).

[-------------------------------- cuda:matmul-forward --------------------------------]
                      |   0.5  |   0.7  |   0.8  |   0.9  |  0.95  |  0.98
1 threads: ----------------------------------------------------------------------------
  torch:dense@vector  |   36.1 |   21.5 |   21.6 |   21.5 |   21.6 |   21.5
  torch:sparse@vector | 1099.2 | 1289.4 |  775.7 |  327.1 |  285.4 |  274.0

Times are in microseconds (us).
```

</details>

<details><summary> backward tests (expand for details) </summary>

- `sparse@sparse`

```
[--------------------------------- cpu:matmul-backward ---------------------------------]
                      |   0.5  |   0.7  |   0.8  |   0.9  |  0.95  |  0.98
1 threads: ------------------------------------------------------------------------------
  torch:dense@dense   |  246.1 |  315.0 |  306.9 |  168.6 |  290.6 |  146.9
  torch:sparse@sparse | 6417.5 | 4393.7 | 3012.7 | 1029.4 |  908.0 |  650.7

Times are in microseconds (us).

[----------------------------- cuda:matmul-backward -----------------------------]
                      |   0.5  |   0.7  |   0.8  |   0.9  |  0.95  |  0.98
1 threads: -----------------------------------------------------------------------
  torch:dense@dense   |    6.7 |   13.3 |   13.3 |    6.9 |   13.5 |    6.9
  torch:sparse@sparse |  143.7 |  143.4 |  119.6 |   29.5 |   29.1 |   10.9

Times are in microseconds (us).
```

- `sparse@dense`

```
[------------------------------ cpu:matmul-backward -------------------------------]
                      |   0.5  |   0.7  |   0.8  |   0.9  |  0.95  |  0.98
1 threads: -------------------------------------------------------------------------
  torch:dense@dense   |  185.9 |  304.8 |  305.8 |  169.9 |  308.7 |  168.4
  torch:sparse@dense  |  407.9 |  345.8 |  274.6 |  114.2 |  163.6 |  230.5

Times are in milliseconds (ms).

[--------------------------- cuda:matmul-backward --------------------------]
                      |   0.5  |   0.7  |   0.8  |   0.9  |  0.95  |  0.98
1 threads: ------------------------------------------------------------------
  torch:dense@dense   |    6.7 |   13.3 |   13.3 |    6.9 |   13.4 |    6.9
  torch:sparse@dense  |   16.7 |   19.0 |   15.1 |    6.3 |    8.2 |   12.7

Times are in milliseconds (ms).
```

</details>

Kindly review this PR. cc mruberry, ngimel

Pull Request resolved: #51647
Reviewed By: albanD
Differential Revision: D27007809
Pulled By: mruberry
fbshipit-source-id: 8c1922cb3280027ca5e3eef31bfa20500c548cfd
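(The tables above are the output format of `torch.utils.benchmark.Compare`. Below is a much smaller sketch of producing a table in that shape, with synthetic matrices standing in for the DLMC data.)

```python
import torch
from torch.utils.benchmark import Timer, Compare

results = []
for sparsity in (0.5, 0.8, 0.95):
    d = torch.randn(512, 512)
    s = d.clone()
    s[torch.rand_like(s) < sparsity] = 0      # zero out ~sparsity fraction of entries
    sp = s.to_sparse().coalesce()
    common = dict(label="cpu:matmul-forward", description=str(sparsity), num_threads=1)
    results.append(Timer("d @ d", globals={"d": d},
                         sub_label="torch:dense@dense", **common).blocked_autorange())
    results.append(Timer("torch.sparse.mm(sp, d)",
                         globals={"torch": torch, "sp": sp, "d": d},
                         sub_label="torch:sparse@dense", **common).blocked_autorange())

# Rows come from sub_label, columns from description, the header from label.
compare = Compare(results)
compare.trim_significant_figures()
compare.print()
```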
This is benchmarking tooling for working with sparse tensors. To implement it, we extended the benchmarking utils from PR #38338 (https://github.com/pytorch/pytorch/pull/38338) to sparse tensors. The `FuzzedTensor` class was extended by creating the new `FuzzedSparseTensor` class, and two new operator fuzzer classes were added, `UnaryOpSparseFuzzer` and `BinaryOpSparseFuzzer`.

The `FuzzedSparseTensor` class adds new parameters to the constructor:

1. `sparse_dim`: the number of sparse dimensions in the sparse tensor.
2. `nnz`: the number of non-zero elements in the sparse tensor.
3. `density`: the density of the sparse tensor.
4. `coalesced`: whether the tensor is coalesced or uncoalesced (the sparse tensor format permits both).

It removes `probability_contiguous`, `max_allocation_bytes`, `roll_parameter`, and `tensor_constructor`, as those parameters are specific to dense tensors.

In addition, I've extended `torch.utils.benchmark.examples` to work with the new classes `FuzzedSparseTensor`, `UnaryOpSparseFuzzer`, and `BinaryOpSparseFuzzer`. Hopefully this tooling and these examples will help with benchmarks in other PRs. Looking forward to your thoughts and feedback. cc @robieta, @mruberry, @ngimel
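(For context on how these fuzzers are consumed: the existing dense `UnaryOpFuzzer` is driven roughly as below, and the new sparse classes are intended to slot into the same loop. The module path and the `tensors["x"]` key are taken from the dense fuzzer; how far the sparse counterparts mirror them is an assumption, not confirmed from this PR.)

```python
import torch
from torch.utils.benchmark import Timer
from torch.utils.benchmark.op_fuzzers.unary import UnaryOpFuzzer

# Draw a few fuzzed inputs and time a unary op on each of them.
for tensors, tensor_params, params in UnaryOpFuzzer(seed=0).take(3):
    x = tensors["x"]
    t = Timer(
        "torch.relu(x)",
        globals={"torch": torch, "x": x},
        label="unary",
        description=f"shape={tuple(x.shape)}",
    )
    print(t.blocked_autorange(min_run_time=0.2))
```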