Fix `extrema(x; dims)` for inputs with `NaN/missing` by N5N3 · Pull Request #43604 · JuliaLang/julia

N5N3 · 2021-12-30T11:26:43Z

The original goal of this PR was to fix #43599, but I found more cases whose results are inconsistent with the definition.
The behaviour changes are summarized as follows:

- `extrema(x; dims)` now respects the sign of zero, `-0.0`/`0.0` (i.e. `extrema([-0.0;0.0]; dims = 1) == [(-0.0, 0.0)]`)
- `extrema(x; dims)` now respects `NaN`
- `extrema(x; dims)` can handle inputs with `missing`
- `extrema(x; dims)` disallows empty reductions, like `minimum` and `maximum`
- `minimum(x)` and `maximum(x)` give the correct result if the input contains both `NaN` and `missing`

The last one is used in the tests, so I fixed it in this PR by limiting the related optimization to Float-only cases (see 76637c7).
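
The behaviour changes listed above can be checked with a few small inputs. This is a minimal sketch, assuming Julia 1.8 or later (where the changes from this PR should be available); the input arrays are made up for illustration and are not taken from the PR's test suite.

```julia
# Assumes Julia 1.8 or later; the arrays below are illustrative only.

# Signed zeros: the minimum keeps -0.0 and the maximum keeps 0.0.
z = extrema([-0.0; 0.0]; dims = 1)
@show z

# NaN propagates into both elements of the slice that contains it.
n = extrema([1.0 NaN; 2.0 3.0]; dims = 1)
@show n

# missing propagates the same way.
m = extrema([1 missing; 2 3]; dims = 1)
@show m

# Reducing over an empty dimension throws instead of returning a result.
threw = try
    extrema(zeros(0, 2); dims = 1)
    false
catch
    true
end
@show threw
```

Note that `==` cannot distinguish `-0.0` from `0.0` and returns `false` or `missing` for `NaN` and `missing`, so `isequal` is the more reliable way to compare such results.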

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

Co-authored-by: Tim Holy <tim.holy@gmail.com>

N5N3 · 2022-01-05T02:08:38Z

@tkf is there a strong objection to #36265?
I think it would be good to merge that first and then rebase this one.

JeffBezanson · 2022-01-05T20:08:34Z

I like #36265, but it doesn't seem to solve the issue with missing?

N5N3 · 2022-01-06T01:26:15Z

> I like #36265, but it doesn't seem to solve the issue with missing?

I missed the following:

julia/base/multidimensional.jl

Lines 1754 to 1762 in 5669d63

    function _extrema_dims(f::F, A::AbstractArray, dims, ::_InitialValue) where {F}
        sz = size(A)
        for d in dims
            sz = setindex(sz, 1, d)
        end
        T = promote_op(f, eltype(A))
        B = Array{Tuple{T,T}}(undef, sz...)
        return extrema!(f, B, A)
    end

So it seems that PR also didn't solve NaN.
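
To make the quoted method easier to follow, here is a small sketch of how it sizes and types the output buffer. It only reproduces the shape and element-type computation, uses `identity` in place of `f`, and calls `Base.setindex` and `Base.promote_op`, which are unexported internals whose behaviour may change between Julia versions.

```julia
# Sketch of the buffer setup only; `A` and `dims` are illustrative values.
A = rand(3, 4)
dims = (1,)

# Collapse every reduced dimension to length 1.
sz = size(A)
for d in dims
    sz = Base.setindex(sz, 1, d)
end

# Infer the mapped element type and allocate one (min, max) pair per slice.
T = Base.promote_op(identity, eltype(A))
B = Array{Tuple{T,T}}(undef, sz...)

@show size(B) eltype(B)
```

The buffer is left uninitialized here, so how the reduction into `B` treats `NaN` and `missing` depends entirely on the `extrema!` call that follows.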

But the _DupY(f) design looks good to me, and it seems reasonable to use it in this PR:

julia> @btime extrema($a)
  1.222 μs (0 allocations: 0 bytes)
(-3.8158492176948755, 4.389900419602923)

julia> @btime extrema(Float64,$a)
  616.500 μs (12298 allocations: 320.52 KiB)
(-3.8158492176948755, 4.389900419602923)
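
For readers unfamiliar with the idea of defining `extrema` through `mapreduce`, the following is a minimal sketch of the general technique, assuming that each element is first duplicated into a `(min, max)` pair and pairs are then combined. The names `dup`, `combine` and `my_extrema` are hypothetical and this is not the code from either PR.

```julia
# Hypothetical names; a sketch of the technique, not Base's implementation.

# Map each element to a (min, max) pair holding the same value twice.
dup(x) = (x, x)

# Combine two pairs by taking the smaller minimum and the larger maximum.
combine((mn1, mx1), (mn2, mx2)) = (min(mn1, mn2), max(mx1, mx2))

# Works for non-empty collections; an empty one would need an `init` value.
my_extrema(itr) = mapreduce(dup, combine, itr)

@show my_extrema([3, 1, 2])
@show my_extrema([1.0, NaN])
```

Because `min` and `max` already propagate `NaN` and `missing`, a reduction written this way inherits that behaviour without special cases.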

tkf

LGTM! (other than one nitpick and that we need to update SparseArrays)

base/reducedim.jl

tkf · 2022-01-17T01:37:42Z

I followed https://julialang.github.io/BumpStdlibs.jl/dev/usage/ and triggered the bump https://github.com/JuliaLang/BumpStdlibs.jl/runs/4835402987?check_suite_focus=true

It looks like the magic worked #43833

N5N3 · 2022-01-17T01:47:16Z

Local test shows no problem

$ make -C test SparseArrays
make: Entering directory '/cygdrive/c/Users/MYJ/Documents/GitHub/julia/test'
    JULIA test/SparseArrays
Test                    (Worker) | Time (s) | GC (s) | GC % | Alloc (MB) | RSS (MB)
SparseArrays/sparsevector    (4) |        started at 2022-01-17T01:38:57.142
SparseArrays/sparse          (3) |        started at 2022-01-17T01:38:57.262
SparseArrays/higherorderfns  (2) |        started at 2022-01-17T01:38:57.358
SparseArrays/higherorderfns  (2) |    94.64 |   2.66 |  2.8 |   11985.15 |   562.00
SparseArrays/sparsevector    (4) |   164.82 |   4.87 |  3.0 |   24421.82 |   876.65
SparseArrays/sparse          (3) |   263.28 |  23.08 |  8.8 |   27776.47 |   953.35

Test Summary: |  Pass  Broken  Total     Time
  Overall     | 21792       5  21797  4m26.2s
    SUCCESS

The above suggestion will be adopted together with the update of SparseArrays.

tkf

LGTM! Thanks for addressing a lot of fix requests!

* Define `extrema` using `mapreduce`; support `init`
* Fix tests for SparseArrays
* API clean and export `extrema!`
* Re-implement `reducedim_init` for extrema
* Merge `master` to pull in JuliaSparse/SparseArrays.jl#63
* Mark `BigInt` tests as broken

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>
Co-authored-by: Simeon Schaub <simeondavidschaub99@gmail.com>
Co-authored-by: Takafumi Arakaki <aka.tkf@gmail.com>
Co-authored-by: Tim Holy <tim.holy@gmail.com>

greimel · 2022-04-01T07:08:01Z

Sorry if it's not appropriate to comment on this long-merged PR.

In the docstring it says

The value returned for empty itr can be specified by init. It must be a 2-tuple whose
first and second elements are neutral elements for min and max respectively
(i.e. which are greater/less than or equal to any other element).

However, one very important use-case of this PR could be to compute extrema iteratively (e.g. update bounds as new data come in).

julia> x = 0:10
0:10

julia> y = -1:5
-1:5

julia> extrema(x)
(0, 10)

julia> extrema(y)
(-1, 5)

julia> extrema(y, init=extrema(x)) == extrema(x ∪ y) == (-1, 10)
true

The docstring makes me wonder if this is a valid use of the function, because it says (i.e. which are greater/less than or equal to any other element). That is, can I be sure that this condition will be checked in the future and give an error if it's not satisfied?

Would you be open to a PR that adds my example to the docstring and the tests?

N5N3 · 2022-04-01T08:12:31Z

I believe @tkf followed the docstrings of minimum/maximum. For maximum, it reads:

  The value returned for empty itr can be specified by init. It must be a neutral element for max (i.e. which is less than or equal to any other element) as it is
  unspecified whether init is used for non-empty collections.

And we have the following in sum's docstring

  The value returned for empty itr can be specified by init. It must be the additive identity (i.e. zero) as it is unspecified whether init is used for non-empty
  collections.

However, your use case is valid (and intuitive) under the current implementation.
If we want to clarify it, then we'd better also modify the docstrings of the other reductions to keep them consistent.
Anyway, a PR (with goodwill) is always welcome.
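
The iterative use case discussed above can be sketched in two ways. The first threads the running bounds through `init`, which works under the current implementation (Julia 1.8 or later) but relies on behaviour the docstring leaves unspecified. The second merges `(min, max)` pairs explicitly and does not depend on `init` at all. The helper `merge_bounds` and the `chunks` data are hypothetical.

```julia
# Hypothetical helper: merge two (min, max) pairs without relying on `init`.
merge_bounds(a, b) = (min(a[1], b[1]), max(a[2], b[2]))

# Illustrative data arriving in three batches.
chunks = [0:10, -1:5, 7:12]

# Variant 1: pass the bounds so far as `init` for the next batch.
via_init = foldl((b, c) -> extrema(c; init = b), chunks[2:end];
                 init = extrema(chunks[1]))

# Variant 2: compute per-batch extrema and merge the pairs explicitly.
via_merge = mapreduce(extrema, merge_bounds, chunks)

@show via_init via_merge
```

The second variant is the safer choice if the `init` contract is ever enforced, since it only uses documented behaviour of `min`, `max` and `mapreduce`.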

greimel · 2022-04-01T08:16:10Z

Great! I've prepared #44819 with updates for minimum, maximum and extrema.

tkf and others added 15 commits June 12, 2020 20:41

Define extrema using mapreduce; support init

7b0e72e

Add NEWS

c6ffc87

Fix a typo

194c329

Reword a comment slightly

793a020

Merge branch 'master' into extrema

de7b9bc

Apply suggestions from code review

5ebf8ad

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>

Fixup docstrings

2d70b7d

Add simple tests for Core.Compiler.extrema

317548a

Name the output tuple elements (mn, mx)

360b346

Merge remote-tracking branch 'origin/master' into extrema

841de8a

Merge branch 'master' into extrema

2533b14

Update base/multidimensional.jl

f27f874

Fix compat annotations

ce07cbf

Improve extrema docstring

3ceaaae

Co-authored-by: Tim Holy <tim.holy@gmail.com>

Fix tests for SparseArrays

5669d63

N5N3 force-pushed the extrama branch 5 times, most recently from 19ce317 to 66d2a81 Compare December 31, 2021 10:11

N5N3 mentioned this pull request Jan 1, 2022

Accelerate minimum(A; dims = 1) for cartesian indexed cases. #43618

Closed

N5N3 force-pushed the extrama branch 6 times, most recently from e09c63d to e015b3e Compare January 5, 2022 01:14

N5N3 and others added 5 commits January 16, 2022 15:51

remove short circuit for correctness

d9fce2a

This optimization gives wrong result, if `NaN` and `missing` both exist. add missing test

Remove _extrema_rf optimization for AbstractFloat (follow advise)

065a28f

Add TODO apply suggestions Update NEWS.md Co-Authored-By: Takafumi Arakaki <takafumi.a@gmail.com> Co-Authored-By: Takafumi Arakaki <29282+tkf@users.noreply.github.com>

Re-implement reducedim_init for extrema

b6a93bd

Test fix

a2a039d

Mark `BigInt` as broken

Merge branch 'master' into extrama

0486330

N5N3 force-pushed the extrama branch from a976a0a to 0486330 Compare January 16, 2022 07:57

tkf reviewed Jan 16, 2022

base/reducedim.jl (outdated, resolved)

N5N3 added 2 commits January 17, 2022 17:35

Inference fix

98d2743

Update reducedim.jl

Merge branch 'master' into extrama

2c57b1f

N5N3 force-pushed the extrama branch from e4dcb1f to 2c57b1f Compare January 18, 2022 01:00

tkf approved these changes Jan 18, 2022

tkf added the "merge me" label (PR is reviewed. Merge when all tests are passing) Jan 18, 2022

tkf merged commit aa1bd28 into JuliaLang:master Jan 18, 2022

tkf removed the "merge me" label (PR is reviewed. Merge when all tests are passing) Jan 18, 2022

N5N3 deleted the extrama branch January 18, 2022 23:26

N5N3 mentioned this pull request Jan 19, 2022

Code clean for extrema JuliaSparse/SparseArrays.jl#74

Merged

tkf mentioned this pull request Jan 26, 2022

Define extrema using mapreduce; support init #36265

Closed

greimel mentioned this pull request Apr 1, 2022

Clarify the possible uses of the init keyword in minimum, maximum and extrema #44819

Open

N5N3 mentioned this pull request Jul 5, 2022

Correctness issues with minimum and maximum #45932

Closed

aramirezreyes mentioned this pull request Jul 15, 2022

ERROR: MethodError: no method matching initarray! on Julia v1.8.0-rc1 CliMA/Oceananigans.jl#2663

Closed

nalimilan mentioned this pull request Oct 24, 2023

maximum([fill(NaN, 255); missing]) !== maximum([fill(NaN, 256); missing]) (Should we fix reduce(max, ...) etc.?) #36287

Closed

nalimilan mentioned this pull request Nov 15, 2023

Fix minimum/maximum over dimensions with missing values #35323

Merged

Fix `extrema(x; dims)` for inputs with `NaN/missing` #43604
tkf merged 25 commits into JuliaLang:master from N5N3:extrama


5 participants
