Skip to content

Support subquery in disjunction #2830

@EmmyMiao87

Description

@EmmyMiao87

Is your feature request related to a problem? Please describe.
Doris does not support the subquery in disjunction.
For example:

SELECT *
FROM customer c
WHERE   EXISTS
        (
              SELECT *
              FROM web_sales
              WHERE c.c_customer_sk = ws_bill_customer_sk
        )
        OR EXISTS
        (
              SELECT *
              FROM catalog_sales
              WHERE c.c_customer_sk = cs_ship_customer_sk

        );
SELECT *
FROM  customer_address, item
WHERE
    substr(ca_zip, 1, 5)

    IN

    (
         '85669'
    )
    OR i_item_id

    IN

    (
          SELECT i_item_id
          FROM item
          WHERE i_item_sk IN (2)
    );

Subqueries in OR predicates are not supported

TPC-DS 10,35,45 query:

query10:
SELECT cd_gender,
      cd_marital_status,
      cd_education_status,
      count(*) cnt1,
      cd_purchase_estimate,
      count(*) cnt2,
      cd_credit_rating,
      count(*) cnt3,
      cd_dep_count,
      count(*) cnt4,
      cd_dep_employed_count,
      count(*) cnt5,
      cd_dep_college_count,
      count(*) cnt6
FROM customer c,customer_address ca,customer_demographics
WHERE c.c_current_addr_sk = ca.ca_address_sk
      AND ca_county IN ('Walker County','Richland County','Gaines County','Douglas County','Dona Ana County')
      AND cd_demo_sk = c.c_current_cdemo_sk
      AND EXISTS
      (

            SELECT *
            FROM store_sales,date_dim
            WHERE c.c_customer_sk = ss_customer_sk
                  AND ss_sold_date_sk = d_date_sk
                  AND d_year = 2002
                 AND d_moy
                  BETWEEN 4
                  AND 4+3

      )
      AND

      (

            EXISTS
            (

                  SELECT *
                  FROM web_sales,date_dim
                  WHERE c.c_customer_sk = ws_bill_customer_sk
                        AND ws_sold_date_sk = d_date_sk
                        AND d_year = 2002
                        AND d_moy
                  BETWEEN 4
                  AND 4+3

            )
            OR EXISTS
            (

                  SELECT *
                  FROM catalog_sales,date_dim
                  WHERE c.c_customer_sk = cs_ship_customer_sk
                        AND cs_sold_date_sk = d_date_sk
                        AND d_year = 2002
                        AND d_moy
                        BETWEEN 4
                        AND 4+3

            )

      )
GROUP BY cd_gender, cd_marital_status, cd_education_status, cd_purchase_estimate, cd_credit_rating, cd_dep_count, cd_dep_employed_count, cd_dep_college_count
ORDER BY cd_gender, cd_marital_status, cd_education_status, cd_purchase_estimate, cd_credit_rating, cd_dep_count, cd_dep_employed_count, cd_dep_college_count

LIMIT 100;
query 35:
SELECT ca_state,
      cd_gender,
      cd_marital_status,
      cd_dep_count,
      count(*) cnt1,
      avg(cd_dep_count),
      max(cd_dep_count),
      sum(cd_dep_count),
      cd_dep_employed_count,
      count(*) cnt2,
      avg(cd_dep_employed_count),
      max(cd_dep_employed_count),
      sum(cd_dep_employed_count),
      cd_dep_college_count,
      count(*) cnt3,
      avg(cd_dep_college_count),
      max(cd_dep_college_count),
      sum(cd_dep_college_count)
FROM customer c,customer_address ca,customer_demographics
WHERE c.c_current_addr_sk = ca.ca_address_sk
      AND cd_demo_sk = c.c_current_cdemo_sk
      AND EXISTS
      (

            SELECT *
            FROM store_sales,date_dim
            WHERE c.c_customer_sk = ss_customer_sk
                  AND ss_sold_date_sk = d_date_sk
                  AND d_year = 1999
                  AND d_qoy < 4

      )
      AND

      (

            EXISTS
            (

                  SELECT *
                  FROM web_sales,date_dim
                  WHERE c.c_customer_sk = ws_bill_customer_sk
                        AND ws_sold_date_sk = d_date_sk
                        AND d_year = 1999
                        AND d_qoy < 4

            )
            OR EXISTS
            (

                  SELECT *
                  FROM catalog_sales,date_dim
                  WHERE c.c_customer_sk = cs_ship_customer_sk
                        AND cs_sold_date_sk = d_date_sk
                        AND d_year = 1999
                        AND d_qoy < 4

            )

      )
GROUP BY ca_state, cd_gender, cd_marital_status, cd_dep_count, cd_dep_employed_count, cd_dep_college_count
ORDER BY ca_state, cd_gender, cd_marital_status, cd_dep_count, cd_dep_employed_count, cd_dep_college_count

LIMIT 100;
Query 45:
SELECT ca_zip, ca_county, SUM(ws_sales_price)
FROM web_sales, customer, customer_address, date_dim, item
WHERE

      (

            ws_bill_customer_sk = c_customer_sk
           AND c_current_addr_sk = ca_address_sk
           AND ws_item_sk = i_item_sk
           AND

           (

                  substr(ca_zip, 1, 5)

                  IN

                  (
                       '85669',
                       '86197',
                       '88274',
                       '83405',
                        '86475',
                        '85392',
                        '85460',
                       '80348',
                        '81792'
                 )
                 OR i_item_id

                  IN

                  (
                        SELECT i_item_id
                        FROM item
                        WHERE i_item_sk IN (2, 3, 5, 7, 11, 13, 17, 19, 23, 29)
                 )

            )
            AND ws_sold_date_sk = d_date_sk
           AND d_qoy = 2
           AND d_year = 2000

      )
GROUP BY ca_zip, ca_county
ORDER BY ca_zip, ca_county
LIMIT 100;

Describe the solution you'd like
The subquery could be supported inn disjunction.
Select * from t1 where a in subquery1 or a b in subquery2.

Competitive analysis
Greenplum:
Impala:

Metadata

Metadata

Assignees

No one assigned

    Labels

    area/sql/compatibilityIssues or PRs related to the SQL compatibililty

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions