Skip to content

fix(sql): improve order by handling in optimizer#2988

Merged
bluestreak01 merged 4 commits intomasterfrom
bz_fix_order_by_optimizations
Feb 22, 2023
Merged

fix(sql): improve order by handling in optimizer#2988
bluestreak01 merged 4 commits intomasterfrom
bz_fix_order_by_optimizations

Conversation

@bziobrowski
Copy link
Contributor

@bziobrowski bziobrowski commented Feb 14, 2023

Fixes #2330

All sql examples below use demo instance tables .

Changes :

  • improve execution of queries ordered by designated timestamp desc, e.g.
explain select * from trips where pickup_datetime > dateadd('m', -1, now()) order by pickup_datetime desc 
DataFrame
    Row backward scan
    Interval backward scan on: trips
      intervals: [static=[0,9223372036854775807,281481419161601,4294967296] dynamic=[dateadd('m',-1,now())]]
  • improve execution of queries with negative limit and no order by via backward scanning from end of table
explain select * from trips desc limit -10
Sort light
  keys: [ts]
    Limit lo: 5
        DataFrame
            Row backward scan
            Frame backward scan on: trips
  • improve execution of async filter with order by timestamp desc and negative limit by scanning from start and then reordering rows .
explain select * from trips where passenger_count > 0 order by pickup_datetime desc limit -10 ;

Async JIT Filter
  limit: 10
  filter: 0<passenger_count
  workers: 14
    DataFrame
        Row forward scan
        Frame forward scan on: trips
  • prevent copying of limit advice into distinct 's subquery. The following queries apply limit 10 to async jit filter executed before distinct :
select distinct passenger_count from trips where passenger_count < 5 limit 10
select distinct pickup_datetime from trips where passenger_count < 5 limit 10

Explain shows:

Limit lo: 10
    DistinctTimeSeries
      keys: pickup_datetime
        SelectedRecord
            Async JIT Filter
              limit: 10 <-- WRONG
...
  • reduce need for explicit timestamp(column) function use by maintaining timestamp metadata on order by ts, col1, col2, ... .
    For example - the following query doesn't need timestamp(pickup_datetime) anymore :
select * from 
(select * from trips order by pickup_datetime asc, passenger_count limit 10) 
lt join 
(select * from trips) order by pickup_datetime asc
  • avoid unnecessary sorting by designated timestamp, e.g.
explain select * from (select * from trips order by pickup_datetime desc) t1 cross join trips t2 order by t1.pickup_datetime
SelectedRecord
    Cross Join
        DataFrame
            Row forward scan
            Frame forward scan on: trips
        DataFrame
            Row forward scan
            Frame forward scan on: trips

@bluestreak01
Copy link
Member

This is awesome PR @bziobrowski

@ideoma
Copy link
Collaborator

ideoma commented Feb 22, 2023

[PR Coverage check]

😍 pass : 120 / 121 (99.17%)

file detail

path covered line new line coverage
🔵 io/questdb/cairo/IntervalBwdDataFrameCursorFactory.java 2 3 66.67%
🔵 io/questdb/cairo/FullBwdDataFrameCursorFactory.java 6 6 100.00%
🔵 io/questdb/griffin/engine/table/AsyncFilteredNegativeLimitRecordCursor.java 7 7 100.00%
🔵 io/questdb/griffin/engine/join/NestedLoopLeftJoinRecordCursorFactory.java 1 1 100.00%
🔵 io/questdb/griffin/engine/union/ExceptRecordCursorFactory.java 1 1 100.00%
🔵 io/questdb/griffin/engine/orderby/SortedLightRecordCursorFactory.java 1 1 100.00%
🔵 io/questdb/griffin/engine/orderby/LimitedSizeSortedLightRecordCursorFactory.java 1 1 100.00%
🔵 io/questdb/griffin/engine/join/CrossJoinRecordCursorFactory.java 1 1 100.00%
🔵 io/questdb/griffin/engine/union/IntersectRecordCursorFactory.java 1 1 100.00%
🔵 io/questdb/griffin/engine/table/AsyncJitFilteredRecordCursorFactory.java 2 2 100.00%
🔵 io/questdb/griffin/engine/orderby/SortedRecordCursorFactory.java 6 6 100.00%
🔵 io/questdb/griffin/engine/LimitRecordCursorFactory.java 1 1 100.00%
🔵 io/questdb/griffin/SqlCodeGenerator.java 12 12 100.00%
🔵 io/questdb/griffin/SqlOptimiser.java 73 73 100.00%
🔵 io/questdb/griffin/engine/table/AsyncFilteredRecordCursorFactory.java 2 2 100.00%
🔵 io/questdb/griffin/engine/table/BwdDataFrameRowCursorFactory.java 3 3 100.00%

@bluestreak01 bluestreak01 merged commit da2856c into master Feb 22, 2023
@bluestreak01 bluestreak01 deleted the bz_fix_order_by_optimizations branch February 22, 2023 13:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

slow limit -N queries without order by ts queries

3 participants