Avoid introducing local variable (and GC frame store) in `unsafe_setindex!` #13461
|
................. The extra allocation is due to #13359 ...... |
|
Note that the code generated for this PR is still better than the one for #13463 since there's no gc frame at all. |
2d39883 to ec3d2f1
|
@yuyichao it would be good to fix the underlying problem but this looks fine, the original code wasn't really idiomatic anyways. |
…r is confused about the return point.
ec3d2f1 to bb247cf
|
Interesting. I didn't realize there was a penalty for |
|
I did a |
|
This is awesome. Looks like it speeds up the non-scalar array perf tests by 5-15%. I agree that we should get a better fix for this eventually, but this is great for now. |
Avoid introducing local variable (and GC frame store) in `unsafe_setindex!`
|
backported in adb832a |
This is a workaround to avoid the generation of a GC frame store when `unsafe_setindex!` is called directly.
The problem seems to be that type inference is not able to figure out the exit point of `@inbounds return ...`. (Or more precisely, the second return is never reachable and the result is not used anyway....)
The code above generates a pointer local variable and disables SIMD. There are many fixes/optimizations that we could make at multiple levels to avoid the store in the loop, but I think this workaround should be the safest to backport to 0.4 (not necessarily 0.4.0) with minimal side effects.
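For readers without the diff at hand, here is a minimal sketch of the two shapes being contrasted. It assumes the workaround amounts to keeping `@inbounds` on the store and returning in a separate statement; the function names are hypothetical and are not the definitions in Base. On current Julia versions both forms may well compile to the same code, so this only illustrates the source-level difference.

```julia
# Hypothetical names for illustration; these are NOT the Base definitions.

# Shape described above: the `return` sits inside the `@inbounds` expression.
function store_macro_return!(A::Vector{Float64}, v::Float64, i::Int)
    @inbounds return setindex!(A, v, i)
end

# Assumed workaround shape: `@inbounds` covers only the store,
# and the array is returned by a plain statement afterwards.
function store_plain_return!(A::Vector{Float64}, v::Float64, i::Int)
    @inbounds A[i] = v
    return A
end

A = zeros(4)
store_macro_return!(A, 1.0, 2)
store_plain_return!(A, 2.0, 3)
println(A)
```

Comparing `@code_llvm` output for the two functions is one way to check whether a given Julia version still treats them differently.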
This is an alternative solution to what's in #13459.
It seems that `A[:] = 0.` still generates one allocation while `fill!` doesn't, so #13459 might still be good to have (or we should find out why `A[:] = 0.` allocates and fix that in general). (edit: It's the splatting penalty, see #13461 (comment))
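A rough way to check an allocation difference like this is `@allocated`. The sketch below makes two assumptions: the helper names are hypothetical, and because Julia 1.x no longer accepts the scalar form `A[:] = 0.`, it uses the broadcasting form `A[:] .= 0.0` instead, which is not the same code path as in 0.4. The numbers it prints depend on the Julia version, so no particular result is claimed.

```julia
# Hypothetical helpers for illustration only.
zero_with_fill!(A) = fill!(A, 0.0)

# Julia 1.x spelling; the 0.4-era `A[:] = 0.` form now throws an ArgumentError.
zero_with_indexing!(A) = (A[:] .= 0.0; A)

A = ones(1000)

# Warm up both functions so compilation is not counted in the measurement.
zero_with_fill!(A)
zero_with_indexing!(A)

bytes_fill = @allocated zero_with_fill!(A)
bytes_index = @allocated zero_with_indexing!(A)
println("fill!: ", bytes_fill, " bytes; indexed assignment: ", bytes_index, " bytes")
```

Wrapping each operation in a function keeps the measurement away from global-scope effects as much as possible, although measuring at top level can still report a few bytes of its own.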