dropmissing! fails when one column in DataFrame is of type bitvector #3300

@KevDAll

DataFrames.jl Version 1.5.0.
Julia Version 1.6.1

When one or more columns in a DataFrame are of type BitVector, dropmissing! fails with a BoundsError.

The failure appears to originate at line 922 of dataframe.jl, in _deleteat!_helper:

deleteat!(col, drop)

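The problem here is that when col is a BitVector, dispatch selects bitarray.deleteat!(B::BitVector, inds) rather than array.deleteat!(a::Vector, inds::AbstractVector{Bool}). The BitVector method treats inds as a collection of integer indices, so the Boolean mask drop is read as indices instead of as a mask, which is why the error below reports an access at index [false].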

Version 1.7.0 of Base adds the signature bitarray.deleteat!(B::BitVector, inds::AbstractVector{Bool}), which will fix the issue.
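
For illustration, here is a minimal Base-only sketch of one way the call could be made independent of the Julia version: turn the Boolean mask into integer indices with findall before calling deleteat!. This is only an assumption about a possible workaround, not the fix adopted in DataFrames.jl; the variable names col and drop mirror the ones in _deleteat!_helper, but the values are made up for the example.

    # Base-only sketch; no DataFrames needed.
    col  = falses(3)                    # BitVector column, like :B in the report
    drop = ismissing.([1, 2, missing])  # Bool mask of rows to delete (a BitVector)

    # findall converts the mask into sorted integer indices ([3] here),
    # which deleteat!(::BitVector, inds) accepts on Julia 1.6 as well.
    deleteat!(col, findall(drop))

    println(length(col))

The trade-off is one extra allocation for the index vector; in exchange the same call path is used for Vector and BitVector columns.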

An alternative would be to update Project.toml so that it better reflects the compatibility with Base.

Minimal Example:

Expected Behavior (occurs if B::Vector{Bool}):

using DataFrames

x = DataFrame(:A=>[1,2,missing], :B=>[false, false, false])
dropmissing!(x)
print(x)

2×2 DataFrame
 Row │ A      B
     │ Int64  Bool
─────┼──────────────
   1 │     1  false
   2 │     2  false

Observed Behavior (occurs if B::BitArray):

x = DataFrame(:A=>[1,2,missing], :B=>falses(3))
dropmissing!(x)
print(x)

ERROR: BoundsError: attempt to access 3-element BitVector at index [false]
Stacktrace:
 [1] throw_boundserror(A::BitVector, I::Tuple{Bool})
   @ Base ./abstractarray.jl:651
 [2] checkbounds
   @ ./abstractarray.jl:616 [inlined]
 [3] deleteat!(B::BitVector, inds::BitVector)
   @ Base ./bitarray.jl:989
 [4] _deleteat!_helper(df::DataFrame, drop::BitVector)
   @ DataFrames ~/.julia/packages/DataFrames/LteEl/src/dataframe/dataframe.jl:922
 [5] deleteat!
   @ ~/.julia/packages/DataFrames/LteEl/src/dataframe/dataframe.jl:894 [inlined]
 [6] dropmissing!(df::DataFrame, cols::Colon; disallowmissing::Bool)
   @ DataFrames ~/.julia/packages/DataFrames/LteEl/src/abstractdataframe/abstractdataframe.jl:1081
 [7] dropmissing! (repeats 2 times)
   @ ~/.julia/packages/DataFrames/LteEl/src/abstractdataframe/abstractdataframe.jl:1079 [inlined]
 [8] top-level scope
   @ REPL[12]:1
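
Until the environment is on Julia 1.7 or later, a possible user-side workaround (a sketch, assuming it is acceptable to replace the column's storage) is to convert the BitVector column to a Vector{Bool} before calling dropmissing!, which puts the data frame in the same state as the Expected Behavior case above:

    using DataFrames

    x = DataFrame(:A=>[1,2,missing], :B=>falses(3))

    # Replace the BitVector column with an ordinary Vector{Bool}
    # so that the Vector method of deleteat! is the one that gets called.
    x.B = Vector{Bool}(x.B)

    dropmissing!(x)
    print(x)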
