Faster TensorProductOperator #59

Merged
ChrisRackauckas merged 11 commits into SciML:master from vpuri3:permutedims
Jun 18, 2022
@vpuri3
Member

@vpuri3 vpuri3 commented Jun 17, 2022

fixes #58

using SciMLOperators, LinearAlgebra
using BenchmarkTools

A = TensorProductOperator(rand(12,12), rand(12,12), rand(12,12))

u = rand(12^3, 100)
v = rand(12^3, 100)

A = cache_operator(A, u)

mul!(v, A, u) # dummy call to trigger compilation
@btime mul!($v, $A, $u); # 4.510 ms (17 allocations: 31.36 KiB)
julia> versioninfo()
Julia Version 1.8.0-rc1
Commit 6368fdc6565 (2022-05-27 18:33 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin21.4.0)
  CPU: 4 × Intel(R) Core(TM) i5-5257U CPU @ 2.70GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, broadwell)
  Threads: 4 on 4 virtual cores
Environment:
  JULIA_NUM_PRECOMPILE_TASKS = 4
  JULIA_DEPOT_PATH = /Users/vp/.julia
  JULIA_NUM_THREADS = 4

This is on par with LinearMaps.jl in terms of speed, and miles ahead in terms of allocations. Ref #58 (comment)

@codecov

codecov bot commented Jun 17, 2022

Codecov Report

Merging #59 (2eaede7) into master (746c0d5) will not change coverage.
The diff coverage is 0.00%.

@@          Coverage Diff           @@
##           master     #59   +/-   ##
======================================
  Coverage    0.00%   0.00%           
======================================
  Files           6       6           
  Lines         792     822   +30     
======================================
- Misses        792     822   +30     
Impacted Files Coverage Δ
src/basic.jl 0.00% <0.00%> (ø)
src/sciml.jl 0.00% <0.00%> (ø)
src/utils.jl 0.00% <ø> (ø)

@vpuri3 vpuri3 changed the title [WIP] Faster TensorProductOperator Faster TensorProductOperator Jun 17, 2022
@vpuri3
Member Author

vpuri3 commented Jun 17, 2022

@ChrisRackauckas good to go

@vpuri3
Member Author

vpuri3 commented Jun 18, 2022

For reference, a single-component matvec from the above example takes ~600 μs.

using LinearAlgebra, BenchmarkTools
u = rand(12, 12*12*100)
v = rand(12, 12*12*100)
A = rand(12,12)
mul!(v, A, u) # dummy
@btime mul!($v, $A, $u) # 616.410 μs (0 allocations: 0 bytes)

The entire tensor product takes about 7x that time (4.5 ms), yet the three matvecs account for only ~2 ms (3 × ~616 μs) and each permutedims takes only ~200 μs. So there is surely some performance left to find. @chriselrod @ChrisRackauckas
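
The cost breakdown above follows from the reshape / matmul / permutedims pattern. A minimal sketch of that pattern is given here for three square factors applied to a single vector. It is an illustration under those assumptions, not the code merged in this PR: `kron3_apply` is a hypothetical name, and unlike the cached operator it allocates freely.

```julia
using LinearAlgebra

# Hypothetical helper (not part of SciMLOperators): computes kron(A, B, C) * u
# using three small matmuls and three dimension rotations, without ever
# forming the Kronecker product. Assumes A, B, C are square.
function kron3_apply(A, B, C, u::AbstractVector)
    nA, nB, nC = size(A, 1), size(B, 1), size(C, 1)
    # In kron(A, B, C) the index belonging to C varies fastest, so it is
    # the leading dimension of the column-major 3-D view of u.
    X = reshape(copy(u), nC, nB, nA)
    for M in (C, B, A)
        n1 = size(X, 1)
        Y = M * reshape(X, n1, :)        # one "component matvec" on the leading dim
        Y = reshape(Y, size(X))          # same shape, since M is square
        X = permutedims(Y, (2, 3, 1))    # rotate the next dimension to the front
    end
    # After three rotations the dimensions are back in (nC, nB, nA) order.
    return vec(X)
end

A, B, C = rand(3, 3), rand(4, 4), rand(5, 5)
u = rand(3 * 4 * 5)
@assert kron3_apply(A, B, C, u) ≈ kron(A, B, C) * u
```

Each loop iteration does one matmul and one permutation, so the total is three of each, which is where the rough "3 matvecs + 3 permutedims" estimate comes from.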

@ChrisRackauckas ChrisRackauckas merged commit c948888 into SciML:master Jun 18, 2022
@vpuri3 vpuri3 deleted the permutedims branch June 18, 2022 03:14

Development

Successfully merging this pull request may close these issues.

make tensor products faster

2 participants