Skip to content

array_has function returns null for an empty list ([]) instead of false #16474

@bert-beyondloops

Description

@bert-beyondloops

Describe the bug

When using the array_has function with an empty list, the result is

Currently :

array_has([], 1) => null.
array_has(null, 1) => null.

In my opinion, it should only return null when the list itself is null.

In duckDb:
array_has([], 1) => false.
array_has(null, 1) => null.

To Reproduce

CREATE TABLE foo ( bar INTEGER[] );

INSERT INTO foo VALUES 
( [1, 3, 5]),
(  []),
( null)
;

SELECT bar, IFNULL(CAST(list_contains(bar, 1) AS VARCHAR), 'null') FROM foo;

will result in :


+-----------+-----------------------------------------------+
| bar       | nvl(array_has(foo.bar,Int64(1)),Utf8("null")) |
+-----------+-----------------------------------------------+
| [1, 3, 5] | true                                          |
| []        | null                                          |
|           | null                                          |
+-----------+-----------------------------------------------+

whereas for example in duckDb you get :


┌───────────┬──────────────────────────────────────────────────────────┐
│    bar    │ COALESCE(CAST(list_contains(bar, 1) AS VARCHAR), 'null') │
│  int32[]  │                         varchar                          │
├───────────┼──────────────────────────────────────────────────────────┤
│ [1, 3, 5] │ true                                                     │
│ []        │ false                                                    │
│           │ null                                                     │
└───────────┴──────────────────────────────────────────────────────────┘

Expected behavior

array_has([], 1) => false.
array_has(null, 1) => null.

Additional context

According to me, the issue is situated in the array_has_dispatch_for_scalar method :

...
for (i, offset) in offsets.windows(2).enumerate() {
        let start = offset[0].to_usize().unwrap();
        let end = offset[1].to_usize().unwrap();
        let length = end - start;
        // For non-nested list, length is 0 for null
        if length == 0 {
            continue;
        }
        let sliced_array = eq_array.slice(start, length);
        final_contained[i] = Some(sliced_array.true_count() > 0);
    }
...

There is no distinction between null or length = 0.
You cannot only look at the offset difference. You have to look at the validity buffer as well.

Metadata

Metadata

Assignees

Labels

bugSomething isn't working

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions