array_has is 3200x slower than it "should be" #12062

@samuelcolvin

Describe the bug

The performance of array_has seems to be pretty poor due to RowConverter.

I compared array_has queries against json_contains (not a particularly good comparison, but that's not the point here). I'd expect array_has to be somewhat faster, but it's actually about 3200x slower:

+----------+
| count(*) |
+----------+
| 4828     |
+----------+
mode: SELECT count(*) FROM test where json_contains(json, 'service.name'), query took 31.696875ms
+----------+
| count(*) |
+----------+
| 4828     |
+----------+
mode: SELECT count(*) FROM test where array_has(list, 'service.name'), query took 102.430949125s

The code for this example is in the repository linked under "To Reproduce". A flame graph from samply shows that 99% of the time is spent in RowConverter:

[image: samply flame graph]

To Reproduce

Clone https://github.com/samuelcolvin/array-has-slow and run cargo run --release.

Expected behavior

array_has should be much faster.
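
As a rough illustration of why "much faster" seems plausible: for a single string needle, each row can be scanned by comparing the list's elements directly, with no intermediate row encoding. The sketch below uses a simplified, hypothetical `StringListColumn` type (flattened values plus row offsets). It is not Arrow's or DataFusion's API, only a model of the idea.

```rust
/// Simplified model of a list-of-strings column (hypothetical, not Arrow's API).
/// `values` holds every element of every row back to back, and row `i`
/// spans `values[offsets[i]..offsets[i + 1]]`.
struct StringListColumn {
    values: Vec<String>,
    offsets: Vec<usize>,
}

impl StringListColumn {
    /// Build the flattened representation from one Vec of strings per row.
    fn from_rows(rows: &[Vec<&str>]) -> Self {
        let mut values = Vec::new();
        let mut offsets = vec![0];
        for row in rows {
            values.extend(row.iter().map(|s| s.to_string()));
            offsets.push(values.len());
        }
        Self { values, offsets }
    }

    /// For each row, report whether any element equals `needle`.
    /// The elements are compared as strings directly; nothing is re-encoded.
    fn has(&self, needle: &str) -> Vec<bool> {
        self.offsets
            .windows(2)
            .map(|w| self.values[w[0]..w[1]].iter().any(|v| v.as_str() == needle))
            .collect()
    }
}
```

Calling `has("service.name")` on such a column yields one boolean per row, which a `count(*)` filter could then consume. A real implementation would have to handle nulls and non-string element types, which this sketch ignores.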

Most of the problematic behaviour is in RowConverter, but I also think array_has could be made much faster by special-casing general_array_has_dispatch, or by making it generic over ComparisonType, rather than branching on the comparison type in the hot loop.
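
A minimal sketch of the "generic rather than branching" idea, under stated assumptions: `Mode`, `has_branching`, `has_generic`, and `has_dispatch` are hypothetical names, and rows are modelled as plain `Vec<&str>` rather than Arrow arrays. The first function matches on the mode for every row; the second takes the mode as a const generic parameter, so the branch is resolved once, outside the loop, when the function is instantiated.

```rust
/// Hypothetical comparison modes, loosely modelled on the ComparisonType
/// named above; the real enum may have different variants.
#[derive(Clone, Copy)]
enum Mode {
    Any,
    All,
}

/// Baseline: the mode is matched on once per row, inside the loop.
fn has_branching(rows: &[Vec<&str>], needles: &[&str], mode: Mode) -> Vec<bool> {
    rows.iter()
        .map(|row| match mode {
            Mode::Any => needles.iter().any(|n| row.contains(n)),
            Mode::All => needles.iter().all(|n| row.contains(n)),
        })
        .collect()
}

/// Alternative: the mode is a compile-time constant, so each instantiation
/// contains only the code path it needs.
fn has_generic<const ALL: bool>(rows: &[Vec<&str>], needles: &[&str]) -> Vec<bool> {
    rows.iter()
        .map(|row| {
            if ALL {
                needles.iter().all(|n| row.contains(n))
            } else {
                needles.iter().any(|n| row.contains(n))
            }
        })
        .collect()
}

/// Branch once, then run the specialised loop.
fn has_dispatch(rows: &[Vec<&str>], needles: &[&str], mode: Mode) -> Vec<bool> {
    match mode {
        Mode::Any => has_generic::<false>(rows, needles),
        Mode::All => has_generic::<true>(rows, needles),
    }
}
```

Both versions return the same results. Whether the generic one is measurably faster depends on the compiler, which may already hoist a loop-invariant branch, so this would need benchmarking against the real code.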

Additional context

No response

Labels: bug