Skip to content

Identical subqueries aren't concatenated and executed many times. #2301

@StarWix

Description

@StarWix

From documentation:

All expressions in a query that have the same AST (the same record or same result of syntactic parsing) are considered to have identical values. Such expressions are concatenated and executed once. Identical subqueries are also eliminated this way.

CREATE TABLE test
(
    date Date DEFAULT now(), 
    x Int64,
    y Int64
)
ENGINE = MergeTree(date, x, 4096);

INSERT INTO test(x, y) VALUES (1, 1), (2, 2), (3, 3);

SELECT x
FROM test 
WHERE (x IN 
(
    SELECT x
    FROM test 
    WHERE x = 2
)) OR (y IN 
(
    SELECT x
    FROM test 
    WHERE x = 2
))
UNION ALL
SELECT x
FROM test 
WHERE x = 2;

In logs I see row <Trace> default.test (SelectExecutor): Reading approx. 4096 rows a four times, but they must be two.

2018.04.28 15:18:31.666589 [ 708944 ] <Debug> executeQuery: (from [::ffff:172.17.0.1]:56494, query_id: 1807c813-3e60-4b7d-849a-82bcca0e9aac) SELECT x FROM test WHERE x IN (SELECT x FROM test WHERE x = 2) OR y IN (SELECT x FROM test WHERE x = 2) UNION ALL SELECT x FROM test WHERE x = 2
2018.04.28 15:18:31.667743 [ 708944 ] <Debug> default.test (SelectExecutor): Key condition: (column 0 in [2, 2])
2018.04.28 15:18:31.667809 [ 708944 ] <Debug> default.test (SelectExecutor): MinMax index condition: unknown
2018.04.28 15:18:31.667871 [ 708944 ] <Debug> default.test (SelectExecutor): Selected 1 parts by date, 1 parts by key, 1 marks to read from 1 ranges
2018.04.28 15:18:31.668023 [ 708944 ] <Trace> default.test (SelectExecutor): Reading approx. 4096 rows
2018.04.28 15:18:31.668106 [ 708944 ] <Trace> InterpreterSelectQuery: FetchColumns -> Complete
2018.04.28 15:18:31.670034 [ 708944 ] <Debug> default.test (SelectExecutor): Key condition: (column 0 in 1-element set), unknown, or
2018.04.28 15:18:31.670107 [ 708944 ] <Debug> default.test (SelectExecutor): MinMax index condition: unknown, unknown, or
2018.04.28 15:18:31.670162 [ 708944 ] <Debug> default.test (SelectExecutor): Selected 1 parts by date, 1 parts by key, 1 marks to read from 1 ranges
2018.04.28 15:18:31.670341 [ 708944 ] <Trace> default.test (SelectExecutor): Reading approx. 4096 rows
2018.04.28 15:18:31.670413 [ 708944 ] <Trace> InterpreterSelectQuery: FetchColumns -> Complete
2018.04.28 15:18:31.671072 [ 708944 ] <Debug> default.test (SelectExecutor): Key condition: (column 0 in [2, 2])
2018.04.28 15:18:31.671132 [ 708944 ] <Debug> default.test (SelectExecutor): MinMax index condition: unknown
2018.04.28 15:18:31.671194 [ 708944 ] <Debug> default.test (SelectExecutor): Selected 1 parts by date, 1 parts by key, 1 marks to read from 1 ranges
2018.04.28 15:18:31.671331 [ 708944 ] <Trace> default.test (SelectExecutor): Reading approx. 4096 rows
2018.04.28 15:18:31.671401 [ 708944 ] <Trace> InterpreterSelectQuery: FetchColumns -> Complete
2018.04.28 15:18:31.674544 [ 708944 ] <Debug> executeQuery: Query pipeline:
Union
 Converting
  CreatingSets
   Lazy
   Expression
    Expression
     Filter
      MergeTreeThread
 Converting
  Expression
   Expression
    Filter
     MergeTreeThread

2018.04.28 15:18:31.675128 [ 708955 ] <Trace> CreatingSetsBlockInputStream: Creating set. 
2018.04.28 15:18:31.675771 [ 708955 ] <Debug> default.test (SelectExecutor): Key condition: (column 0 in [2, 2])
2018.04.28 15:18:31.675836 [ 708955 ] <Debug> default.test (SelectExecutor): MinMax index condition: unknown
2018.04.28 15:18:31.675902 [ 708955 ] <Debug> default.test (SelectExecutor): Selected 1 parts by date, 1 parts by key, 1 marks to read from 1 ranges
2018.04.28 15:18:31.676072 [ 708955 ] <Trace> default.test (SelectExecutor): Reading approx. 4096 rows
2018.04.28 15:18:31.676145 [ 708955 ] <Trace> InterpreterSelectQuery: FetchColumns -> Complete
2018.04.28 15:18:31.676639 [ 708955 ] <Debug> CreatingSetsBlockInputStream: Created. Set with 1 entries from 1 rows. In 0.001 sec.
2018.04.28 15:18:31.677445 [ 708944 ] <Trace> UnionBlockInputStream: Waiting for threads to finish
2018.04.28 15:18:31.677508 [ 708944 ] <Trace> UnionBlockInputStream: Waited for threads to finish
2018.04.28 15:18:31.677767 [ 708944 ] <Information> executeQuery: Read 9 rows, 96.00 B in 0.011 sec., 817 rows/sec., 8.51 KiB/sec.
2018.04.28 15:18:31.677903 [ 708944 ] <Trace> UnionBlockInputStream: Waiting for threads to finish
2018.04.28 15:18:31.677945 [ 708944 ] <Trace> UnionBlockInputStream: Waited for threads to finish
2018.04.28 15:18:31.678197 [ 708944 ] <Debug> MemoryTracker: Peak memory usage (for query): 3.16 MiB.
2018.04.28 15:18:31.678259 [ 708944 ] <Debug> MemoryTracker: Peak memory usage (total): 3.16 MiB.
2018.04.28 15:18:31.678328 [ 708944 ] <Information> TCPHandler: Processed in 0.012 sec.

Additional questions: When can I read about Query pipeline? What do keywords as Expression, CreatingSets and other mean?

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions