fix map! undefined value exposure#56673
Conversation
|
pretty sure build failures are unrelated |
|
This looks like a simple typo: Line 3466 in a17db2b That's not how |
|
typo aside, it's checking the wrong thing. that's checking that literally all the indices are equal.
but for a correct implementation to match the docstring, I think this is one of those cases where Julia skipped directly to "make it fast" before "make it right" |
LilithHafner
left a comment
There was a problem hiding this comment.
Thanks for addressing this significant issue. Much appreciated. A related issue that you might want to be aware of, but don't need to fix in this PR is #56696.
I agree that it is acceptable to take some performance regressions in exchange for fixing this bug. I have some refactoring suggestions to make it a bit simpler.
I'm planning to run benchmarks once we agree on an implementation, but if you want me to run them right away instead I'd be happy to do that.
base/abstractarray.jl
Outdated
| idxs1 = eachindex(LinearIndices(As[1])) | ||
| @boundscheck begin | ||
| idxs1 = intersect(map(eachindex ∘ LinearIndices, As)...) | ||
| checkbounds(dest, idxs1) | ||
| end | ||
| for i = idxs1 |
There was a problem hiding this comment.
| idxs1 = eachindex(LinearIndices(As[1])) | |
| @boundscheck begin | |
| idxs1 = intersect(map(eachindex ∘ LinearIndices, As)...) | |
| checkbounds(dest, idxs1) | |
| end | |
| for i = idxs1 | |
| idxs = mapreduce(LinearIndices, intersect, As) | |
| checkbounds(dest, inds) | |
| for i = idxs |
- I don't see why we need
eachindexis here - We should only include bounds-check elision if benchmarks reveal that it makes a difference. In this case, we would also need to add
@propagate_inboundstomap!to make it take effect. - I think that
mapreduce(LinearIndices, intersect, As)is more elegant thanintersect(map(LinearIndices, As)...), unless there is some performance reason to use the latter. - If always compute the intersection of indices then
indsis a better name thaninds1
There was a problem hiding this comment.
thank you for the comments!
the eachindex I added for the following reason
julia> mapreduce(LinearIndices, intersect, [A, B])
3-element Vector{Int64}:
1
2
3
julia> mapreduce(eachindex ∘ LinearIndices, intersect, [A, B])
Base.OneTo(3)
so I figured that if we can hit fast-path intersection for ranges rather than collecting the indices we should. Another solution might be to add a fast intersect(::LinearIndices, ::LinearIndices) ?
the mapreduce formulation indeed looks nicer to me as well.
- If always compute the intersection of indices then
indsis a better name thaninds1
Agreed! but I suppose we should decide if in the first place @inbounds should allow skipping the check and iterate through the indices of the first argument? I think this is a decision on semantics rather than implementation that I am not empowered to make. the docs should probably be updated if this is intended
There was a problem hiding this comment.
the
eachindexI added for the following reason
Good reason.
I suppose we should decide if in the first place
@inboundsshould allow skipping the check and iterate through the indices of the first argument?
@inbounds should never expose documented semantics that are otherwise not present. This is because you can't count on bounds checks being removed. For exmaple, Julia could be run with -check-bound=yes or the boundschecking might not be inlined into the callsite where @inbounds is present. In either case, the bounds checks will remain. Consequently I think the answer to that question is no: @inbounds should not change the semantics other than to skip bounds checks.
test/abstractarray.jl
Outdated
| @test_throws ArgumentError map!(-, [1]) | ||
|
|
||
| # Issue #30624 | ||
| @test map!(+, [0,0,0], [1,2], [10,20,30], [100]) == [111,0,0] |
There was a problem hiding this comment.
Would you also add a test for #36235 (comment), please?
…eted_inplace_map_idxs
|
LilithHafner
left a comment
There was a problem hiding this comment.
Previously, map!(+, ones(1), ones(2, 2)) and map!(+, ones(1), ones(2, 2), ones(2, 2)) worked and this makes them throw (as the docstring indicates they should).
Performance is, at a glance, sometimes slightly better and sometimes slightly worse than it was on master. Correctness first, performance optimization second so I'm not too concerned.
I like that the implementation for map_n! is a bit simpler now and that it clearly aligns with the implementations of 1-arg and 2-arg map.
Unfortunately we have to deal with them before merging because otherwise existing, valid, code that uses OffsetArrays will break. For example, julia> R = fill(0, -4:-1);
julia> B = OffsetArray(reshape(1:12, 4, 3), -5, 6);
julia> minimum!(R, B)
4-element OffsetArray(::Vector{Int64}, -4:-1) with eltype Int64 with indices -4:-1:
1
2
3
4This is a test failure from CI which fails because the implementation of julia> x = fill(0, 1:3);
julia> y = fill(1, 2:4);
julia> map!(identity, x, y)
3-element OffsetArray(::Vector{Int64}, 1:3) with eltype Int64 with indices 1:3:
1
1
1@nanosoldier |
|
I guess we'll have to iterate through destination idxs separately to fix the I could be wrong but this seems more likely to hurt performance. Would there be a meaningful difference here between iterating through |
|
The package evaluation job you requested has completed - possible new issues were detected. |
|
Oh, wow! Nanosoldier looks clean. Rerunning failures to be sure: @nanosoldier |
Given how nice PkgEval is looking, another option would be to fix the
I doubt it, but hopefully that won't be necessary. If we can implement it in a way that isn't practically breaking, I'd love to make |
|
I'm finding all the indirection in but I guess the "real" issue is in in your example, we have so then on |
|
The package evaluation job you requested has completed - possible new issues were detected. |
|
at the very least, this PR would swap "bad" bugs (segfaults, undefined behavior) for "regular" bugs (something throwing a |
in since a more comprehensive refactor to dimensional reductions appears to be still a ways away, I'd suggest leniency for "ugly" bugfixes like this. |
|
@nanosoldier |
|
The package evaluation job you requested has completed - possible new issues were detected. Report summary❗ Packages that crashed6 packages crashed on the previous version too. ✖ Packages that failed23 packages failed only on the current version.
1536 packages failed on the previous version too. ✔ Packages that passed tests27 packages passed tests only on the current version.
5121 packages passed tests on the previous version too. ~ Packages that at least loaded8 packages successfully loaded only on the current version.
2855 packages successfully loaded on the previous version too. ➖ Packages that were skipped altogether905 packages were skipped on the previous version too. |
|
ok, I narrowed the scope. only as you note, this would leave open the question of input validation when the docstring is violated, but at least it fixes the segfaults and uninitialized value exposure (which was my main goal) |
hopefully more targeted / less disruptive than #50647 , would (at least partially) fix #36235
may need extra documentation explaining that with this implementation,
@inbounds map!(dest, As)will always iterate through the indices ofAs[1]regardless of the other sizestbh, the existing implementation is pretty fundamentally incorrect and does not match the behavior explicitly described in the docstring, so I think that this PR (or #50647 , or something else similar) should merge regardless of benchmarks impact. however, I would appreciate if someone could trigger a benchmark run just so we can see the impact :)