Skip to content

add specialized copy! for sparse matrices#15104

Merged
mbauman merged 1 commit intoJuliaLang:masterfrom
KristofferC:kc/sparse_copy
Feb 18, 2016
Merged

add specialized copy! for sparse matrices#15104
mbauman merged 1 commit intoJuliaLang:masterfrom
KristofferC:kc/sparse_copy

Conversation

@KristofferC
Copy link
Copy Markdown
Member

When the matrices have the same size we can get away with resizing + copy of internal fields.

Ex:

A = sprand(10^4, 10^4, 0.01);
B = sprand(10^4, 10^4, 0.01);

# PR:
@time copy!(A,B)
  0.002139 seconds (4 allocations: 160 bytes)

# Master:
# Didn't finish in + 10 seconds

The use case I had for this was in a nonlinear solver package where I needed to update the input matrix in the gradient function. Since I always created my matrix from a triplet format I used copy! to do the updating but noticed this was slow.

@tkelman tkelman added the sparse Sparse arrays label Feb 16, 2016
@KristofferC
Copy link
Copy Markdown
Member Author

I guess the potentially controversial thing can be the invoke but this seems like a good time to use it.

copy!(A.rowval, B.rowval)
copy!(A.nzval, B.nzval)
else
invoke(Base.copy!, Tuple{AbstractMatrix{TvA}, AbstractMatrix{TvB}}, A, B)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is always an error, right? Why not put an error message here?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No error if prod(size(A)) > prod(size(B)) e.g

julia> a = rand(5,5); b = rand(2,2);

julia> copy!(a, b);

julia> a[1:4] == b[1:4]
true

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I should look if the prods are equal instead of the sizes since we overwrite every value in A then anyway.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow. I thought the fallback would check that the sizes match or, at least, the lengths. I'm wondering if anyone is actually relying on copying a smaller array into a larger. I'd prefer an error if the lengths are not equal.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also give offsets to both arrays

julia> a = rand(5); b = rand(3);

julia> copy!(a, 2, b, 2);

julia> a[2:3] == b[2:end]
true

I don't think it is too uncommon to want to copy a chunk of an array into some other place and for now I think copy! is the most efficient.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this is mainly reasonable behavior for Arrays. For AbstractArrays it would more sense to require equal sizes and thereby the use of SubArrays for copying parts of an array.

Do you think this branch is useful in practice?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just didn't want to change any current behavior.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point. If necessary, it can be handled in a different PR.

@KristofferC
Copy link
Copy Markdown
Member Author

Updated to check for equality in number of elements instead of size.

# elements in A will be overwritten and we can simply copy the
# internal fields of B to A.
if prod(size(A)) == prod(size(B))
copy!(A.colptr, B.colptr)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may need to resize colptr if number of columns does not match, and length should be more efficient than prod(size(A))

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thanks.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should there be a test for this situation where size(A) != size(B) but length(A) == length(B), just to be on the safe side?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or given @mbauman's catch below, maybe not worth the trouble to do the specialized path when size(A) != size(B)

@KristofferC KristofferC force-pushed the kc/sparse_copy branch 2 times, most recently from fef892e to b32e971 Compare February 16, 2016 19:53
@KristofferC KristofferC changed the title RFC: add specialized copy! for sparse matrices add specialized copy! for sparse matrices Feb 16, 2016
@mbauman
Copy link
Copy Markdown
Member

mbauman commented Feb 16, 2016

You'll have to do some extra reshaping work on the indices if the matrices aren't the same size. This currently leaves the destination matrix in a bad state:

julia> A = sprandn(4,1,0.0);
       B = sprandn(2,2,1.0);
       copy!(A, B)
4x1 sparse matrix with 4 Float64 entries:
    [1, 1]  =  -0.40374
    [2, 1]  =  -1.01472

@KristofferC
Copy link
Copy Markdown
Member Author

Thanks @mbauman

I changed back to only check for size equality. The only reason I checked for length is because I incorrectly assumed it would be trivial. I don't feel like messing around with reshaping the indices vectors etc, I just wanted something fast for when the matrices are the same size (which I think will be the vast majority of cases).

@KristofferC KristofferC force-pushed the kc/sparse_copy branch 2 times, most recently from 3e596db to be3cf62 Compare February 17, 2016 14:51
when the matrices have the same size we can get away with simple copying of the internal fields.
@KristofferC
Copy link
Copy Markdown
Member Author

This is good to go from my p.o.v.

@tkelman
Copy link
Copy Markdown
Contributor

tkelman commented Feb 18, 2016

Unless anyone cringes in horror every time we add a usage of invoke I agree this is good to merge.

@mbauman
Copy link
Copy Markdown
Member

mbauman commented Feb 18, 2016

LGTM — this solves a performance problem many orders of magnitude worse than invoke. That branch can always be improved later.

mbauman added a commit that referenced this pull request Feb 18, 2016
add specialized copy! for sparse matrices
@mbauman mbauman merged commit 84e0a18 into JuliaLang:master Feb 18, 2016
@KristofferC KristofferC deleted the kc/sparse_copy branch February 18, 2016 13:39
@KristofferC
Copy link
Copy Markdown
Member Author

Danke! 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

sparse Sparse arrays

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants