-
Notifications
You must be signed in to change notification settings - Fork 4.1k
sql: EXPLAIN ANALYZE (DEBUG) 4-12x slower for queries with large lookup joins #90739
Copy link
Copy link
Open
Labels
A-sql-explainIssues related to EXPLAIN and EXPLAIN ANALYZE improvementsIssues related to EXPLAIN and EXPLAIN ANALYZE improvementsC-bugCode not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.T-sql-queriesSQL Queries TeamSQL Queries TeamX-nostaleMarks an issue/pr that should be ignored by the stale botMarks an issue/pr that should be ignored by the stale bot
Description
Queries that perform large lookup joins (tens of thousands of rows) are roughly 4-12x slower to EXPLAIN ANALYZE than to execute normally. This seems to be true on 21.2, 22.1, 22.2, and probably earlier releases as well. Here's a demonstration on 22.2.0:
CREATE TABLE t (a PRIMARY KEY, b) AS SELECT i, i FROM generate_series(0, 99999) s(i);
CREATE TABLE u (x PRIMARY KEY, y) AS SELECT * FROM t;
ANALYZE t;
ANALYZE u;
SELECT count(*) FROM t JOIN u@{NO_FULL_SCAN} ON x = b WHERE a >= 50000;
EXPLAIN ANALYZE SELECT count(*) FROM t JOIN u@{NO_FULL_SCAN} ON x = b WHERE a >= 50000;On a one-node cockroach demo the normal execution takes 274 ms, and the EXPLAIN ANALYZE takes 3.29s:
demo@127.0.0.1:26257/defaultdb> SELECT count(*) FROM t JOIN u@{NO_FULL_SCAN} ON x = b WHERE a >= 50000;
count
---------
50000
(1 row)
Time: 274ms total (execution 274ms / network 0ms)
demo@127.0.0.1:26257/defaultdb> EXPLAIN ANALYZE SELECT count(*) FROM t JOIN u@{NO_FULL_SCAN} ON x = b WHERE a >= 50000;
info
--------------------------------------------------------------------------------------------
planning time: 1ms
execution time: 3.3s
distribution: full
vectorized: true
rows read from KV: 100,000 (2.4 MiB, 5 gRPC calls)
cumulative time spent in KV: 3.2s
maximum memory usage: 23 MiB
network usage: 0 B (0 messages)
regions: us-east1
• group (scalar)
│ nodes: n1
│ regions: us-east1
│ actual row count: 1
│ estimated row count: 1
│
└── • lookup join
│ nodes: n1
│ regions: us-east1
│ actual row count: 50,000
│ KV time: 3.2s
│ KV contention time: 0µs
│ KV rows read: 50,000
│ KV bytes read: 766 KiB
│ KV gRPC calls: 4
│ estimated max memory allocated: 22 MiB
│ estimated row count: 50,067
│ table: u@u_pkey
│ equality: (b) = (x)
│ equality cols are key
│
└── • scan
nodes: n1
regions: us-east1
actual row count: 50,000
KV time: 17ms
KV contention time: 0µs
KV rows read: 50,000
KV bytes read: 1.6 MiB
KV gRPC calls: 1
estimated max memory allocated: 1.6 MiB
estimated row count: 49,896 (50% of the table; stats collected 24 seconds ago)
table: t@t_pkey
spans: [/50000 - ]
(44 rows)
Time: 3.290s total (execution 3.289s / network 0.000s)
We record many events in the trace for each lookup of the lookup join, which happens 50k times for this query.
Jira issue: CRDB-20918
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
A-sql-explainIssues related to EXPLAIN and EXPLAIN ANALYZE improvementsIssues related to EXPLAIN and EXPLAIN ANALYZE improvementsC-bugCode not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.T-sql-queriesSQL Queries TeamSQL Queries TeamX-nostaleMarks an issue/pr that should be ignored by the stale botMarks an issue/pr that should be ignored by the stale bot
Type
Projects
Status
Backlog