Delayed reading of columns #45868
Description
Given this query:
```sql
SELECT id, view_count, like_count, dislike_count, title FROM youtube ORDER BY view_count DESC LIMIT 100
```
The `title` column is very large, so the query takes a long time:
```
100 rows in set. Elapsed: 26.280 sec. Processed 4.56 billion rows, 500.52 GB (173.38 million rows/s., 19.05 GB/s.)
```
It would be much faster to read only the `view_count` column first, keeping track of the data part and the row number within that part for each record, apply `ORDER BY` and `LIMIT` to select the required rows, and then, using the same table snapshot, read the remaining columns only for those rows.
For example, this query:
```sql
SELECT view_count FROM youtube ORDER BY view_count DESC LIMIT 100
```
takes only:
```
100 rows in set. Elapsed: 6.920 sec. Processed 4.56 billion rows, 41.01 GB (658.44 million rows/s., 5.93 GB/s.)
```
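The two-phase read can be approximated manually today with the `_part` and `_part_offset` virtual columns of MergeTree tables — a sketch only, since two separate reads do not share a snapshot and may see different data if parts are merged in between:

```sql
-- Phase 1 (inner query): read only view_count plus row locations,
-- apply ORDER BY ... LIMIT.
-- Phase 2 (outer query): fetch the remaining columns for just those rows.
SELECT id, view_count, like_count, dislike_count, title
FROM youtube
WHERE (_part, _part_offset) IN
(
    SELECT _part, _part_offset
    FROM youtube
    ORDER BY view_count DESC
    LIMIT 100
)
ORDER BY view_count DESC
LIMIT 100
```

The proposal would make this transformation automatic and snapshot-consistent inside the query pipeline.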
Implementation proposal
Introduce a new column type, `ColumnLazy`, that stores a reference to the table snapshot to use and the list of data parts; its main content is (part num, row num) pairs. Multiple `ColumnLazy` instances can share the same content while representing different delayed columns.
If reading is performed from a table of the MergeTree family, the columns are heavily filtered by `ORDER BY ... LIMIT` (or in some other way), and materialization is possible on the same server, then produce `ColumnLazy` instead of reading the columns, and add an additional materialization step to the pipeline.
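A minimal C++ sketch of the data layout this implies — all names here (`RowLocation`, `LazyContent`, the snapshot placeholder) are assumptions for illustration, not actual ClickHouse classes:

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// Hypothetical: one (part num, row num) pair identifying a row in the snapshot.
struct RowLocation
{
    uint64_t part_index;   // index into the snapshot's data-part list
    uint64_t row_number;   // row offset inside that part
};

// Hypothetical: the content shared by all lazy columns of one row set —
// the snapshot reference, the part list, and the row locations.
struct LazyContent
{
    // std::shared_ptr<const StorageSnapshot> snapshot;  // table snapshot to read from
    // DataPartsVector parts;                            // data parts of that snapshot
    std::vector<RowLocation> locations;
};

// Hypothetical ColumnLazy: shares LazyContent with other lazy columns,
// but names the specific delayed column it stands in for.
class ColumnLazy
{
public:
    ColumnLazy(std::shared_ptr<LazyContent> content_, std::string column_name_)
        : content(std::move(content_)), column_name(std::move(column_name_)) {}

    size_t size() const { return content->locations.size(); }
    const std::string & name() const { return column_name; }

    // A later materialization step in the pipeline would read `column_name`
    // at each stored location from the snapshot and replace this column
    // with a fully materialized one.

private:
    std::shared_ptr<LazyContent> content;  // shared across delayed columns
    std::string column_name;               // the delayed column this represents
};
```

Sharing one `LazyContent` keeps the per-column overhead at a pointer and a name, so delaying many wide columns costs almost nothing until materialization.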