Skip to content

Conversation

@morrySnow
Copy link
Contributor

Proposed changes

Issue Number: close #xxx

Problem summary

consider the query like this:

SELECT
    k3, k4
FROM
    test
WHERE
    EXISTS( SELECT
            d.*
        FROM
            (SELECT
                k1 AS _1234, SUM(k2)
            FROM
                `test` d
            GROUP BY _1234) d
                LEFT JOIN
            (SELECT
                k1 AS _1234,
                    SUM(k2)
            FROM
                `test`
            GROUP BY _1234) temp ON d._1234 = temp._1234) 
ORDER BY k3, k4

when we analyze group by exprs in temp inline view. we bind the _1234 on d._1234 by mistake.
that because, when we do analyze in a SUB-QUERY, we will resolve SlotRef by itself AND parent's tuple. in the meanwhile, we register child's tuple to parent's analyzer. So, in a SUB-QUERY, the brother's tuple will affect the resolve result of current inlineview's slot.

This PR:

  1. add a flag on the function resolveColumnRef in Analyzer
private TupleDescriptor resolveColumnRef(String colName, boolean requestFromChild);
private TupleDescriptor resolveColumnRef(TableName tblName, String colName, boolean requestByChild);
  1. add a flag to specify whether the tuple is from child.
// alias name -> <from child, tupleDesc>
private final Multimap<String, Pair<Boolean, TupleDescriptor>> tupleByAlias;

when requestByChild == true, we SKIP the tuple from other child to avoid resolve error.

Checklist(Required)

  • Does it affect the original behavior
  • Has unit tests been added
  • Has document been added or modified
  • Does it need to update dependencies
  • Is this PR support rollback (If NO, please explain WHY)

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@github-actions github-actions bot added area/planner Issues or PRs related to the query planner kind/test labels Mar 15, 2023
@morrySnow morrySnow requested a review from starocean999 March 15, 2023 06:49
@morrySnow morrySnow added usercase Important user case type label dev/1.1.6 dev/1.2.3 labels Mar 15, 2023
@morrySnow
Copy link
Contributor Author

run buildall

starocean999
starocean999 previously approved these changes Mar 15, 2023
@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Mar 15, 2023
@github-actions
Copy link
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Copy link
Contributor

PR approved by anyone and no changes requested.

@hello-stephen
Copy link
Contributor

hello-stephen commented Mar 15, 2023

TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 34.01 seconds
stream load tsv: 452 seconds loaded 74807831229 Bytes, about 157 MB/s
stream load json: 25 seconds loaded 2358488459 Bytes, about 89 MB/s
stream load orc: 74 seconds loaded 1101869774 Bytes, about 14 MB/s
stream load parquet: 31 seconds loaded 861443392 Bytes, about 26 MB/s
https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230321122056_clickbench_pr_117969.html

@morrySnow
Copy link
Contributor Author

run buildall

@morrySnow
Copy link
Contributor Author

run buildall

@morrySnow
Copy link
Contributor Author

run buildall

@morrySnow
Copy link
Contributor Author

run buildall

@morrySnow morrySnow merged commit b91a3b5 into apache:master Mar 22, 2023
@morrySnow morrySnow deleted the fix_1822 branch March 22, 2023 03:01
morrySnow added a commit to morrySnow/incubator-doris that referenced this pull request Mar 22, 2023
in PR apache#17813 , we want to forbid bind slot on brother's column
howerver the fix is not in correct way.
the correct way to do that is forbid subquery register itself in parent's analyzer.

This reverts commit b91a3b5.
morrySnow added a commit that referenced this pull request Mar 23, 2023
…arent tuples (#18032)

in PR #17813 , we want to forbid bind slot on brother's column
howerver the fix is not in correct way.
the correct way to do that is forbid subquery register itself in parent's analyzer.

This reverts commit b91a3b5.
morningman pushed a commit that referenced this pull request Mar 28, 2023
…17813)

consider the query like this:
```sql
SELECT
    k3, k4
FROM
    test
WHERE
    EXISTS( SELECT
            d.*
        FROM
            (SELECT
                k1 AS _1234, SUM(k2)
            FROM
                `test` d
            GROUP BY _1234) d
                LEFT JOIN
            (SELECT
                k1 AS _1234,
                    SUM(k2)
            FROM
                `test`
            GROUP BY _1234) temp ON d._1234 = temp._1234)
ORDER BY k3, k4
```

when we analyze group by exprs in `temp` inline view. we bind the `_1234` on `d._1234` by mistake.
that because, when we do analyze in a **SUB-QUERY**, we will resolve SlotRef by itself **AND** parent's tuple. in the meanwhile, we register child's tuple to parent's analyzer. So, in a **SUB-QUERY**, the brother's tuple will affect the resolve result of current inlineview's slot.

This PR:

1. add a flag on the function `resolveColumnRef` in `Analyzer`
```java
private TupleDescriptor resolveColumnRef(String colName, boolean requestFromChild);
private TupleDescriptor resolveColumnRef(TableName tblName, String colName, boolean requestByChild);
```

2. add a flag to specify whether the tuple is from child.
```java
// alias name -> <from child, tupleDesc>
private final Multimap<String, Pair<Boolean, TupleDescriptor>> tupleByAlias;
```

when `requestByChild == true`, we **SKIP** the tuple from other child to avoid resolve error.
morningman pushed a commit that referenced this pull request Mar 28, 2023
…arent tuples (#18032)

in PR #17813 , we want to forbid bind slot on brother's column
howerver the fix is not in correct way.
the correct way to do that is forbid subquery register itself in parent's analyzer.

This reverts commit b91a3b5.
luwei16 pushed a commit to luwei16/incubator-doris that referenced this pull request Apr 7, 2023
…from parent tuples (apache#18032)"

in PR apache#17813 , we want to forbid bind slot on brother's column
howerver the fix is not in correct way.
the correct way to do that is forbid subquery register itself in parent's analyzer.

This reverts commit b91a3b5.
gnehil pushed a commit to gnehil/doris that referenced this pull request Apr 21, 2023
…pache#17813)

consider the query like this:
```sql
SELECT
    k3, k4
FROM
    test
WHERE
    EXISTS( SELECT
            d.*
        FROM
            (SELECT
                k1 AS _1234, SUM(k2)
            FROM
                `test` d
            GROUP BY _1234) d
                LEFT JOIN
            (SELECT
                k1 AS _1234,
                    SUM(k2)
            FROM
                `test`
            GROUP BY _1234) temp ON d._1234 = temp._1234) 
ORDER BY k3, k4
```

when we analyze group by exprs in `temp` inline view. we bind the `_1234` on `d._1234` by mistake.
that because, when we do analyze in a **SUB-QUERY**, we will resolve SlotRef by itself **AND** parent's tuple. in the meanwhile, we register child's tuple to parent's analyzer. So, in a **SUB-QUERY**, the brother's tuple will affect the resolve result of current inlineview's slot.

This PR:

1. add a flag on the function `resolveColumnRef` in `Analyzer`
```java
private TupleDescriptor resolveColumnRef(String colName, boolean requestFromChild);
private TupleDescriptor resolveColumnRef(TableName tblName, String colName, boolean requestByChild);
``` 

2. add a flag to specify whether the tuple is from child.
```java
// alias name -> <from child, tupleDesc>
private final Multimap<String, Pair<Boolean, TupleDescriptor>> tupleByAlias;
```

when `requestByChild == true`, we **SKIP** the tuple from other child to avoid resolve error.
gnehil pushed a commit to gnehil/doris that referenced this pull request Apr 21, 2023
…arent tuples (apache#18032)

in PR apache#17813 , we want to forbid bind slot on brother's column
howerver the fix is not in correct way.
the correct way to do that is forbid subquery register itself in parent's analyzer.

This reverts commit b91a3b5.
mongo360 pushed a commit to mongo360/doris that referenced this pull request Jul 12, 2023
…pache#17813)

consider the query like this:
```sql
SELECT
    k3, k4
FROM
    test
WHERE
    EXISTS( SELECT
            d.*
        FROM
            (SELECT
                k1 AS _1234, SUM(k2)
            FROM
                `test` d
            GROUP BY _1234) d
                LEFT JOIN
            (SELECT
                k1 AS _1234,
                    SUM(k2)
            FROM
                `test`
            GROUP BY _1234) temp ON d._1234 = temp._1234)
ORDER BY k3, k4
```

when we analyze group by exprs in `temp` inline view. we bind the `_1234` on `d._1234` by mistake.
that because, when we do analyze in a **SUB-QUERY**, we will resolve SlotRef by itself **AND** parent's tuple. in the meanwhile, we register child's tuple to parent's analyzer. So, in a **SUB-QUERY**, the brother's tuple will affect the resolve result of current inlineview's slot.

This PR:

1. add a flag on the function `resolveColumnRef` in `Analyzer`
```java
private TupleDescriptor resolveColumnRef(String colName, boolean requestFromChild);
private TupleDescriptor resolveColumnRef(TableName tblName, String colName, boolean requestByChild);
```

2. add a flag to specify whether the tuple is from child.
```java
// alias name -> <from child, tupleDesc>
private final Multimap<String, Pair<Boolean, TupleDescriptor>> tupleByAlias;
```

when `requestByChild == true`, we **SKIP** the tuple from other child to avoid resolve error.
mongo360 pushed a commit to mongo360/doris that referenced this pull request Jul 12, 2023
…arent tuples (apache#18032)

in PR apache#17813 , we want to forbid bind slot on brother's column
howerver the fix is not in correct way.
the correct way to do that is forbid subquery register itself in parent's analyzer.

This reverts commit b91a3b5.
swjtu-zhanglei pushed a commit to swjtu-zhanglei/incubator-doris that referenced this pull request Jul 25, 2023
…from parent tuples (apache#18032)"

in PR apache#17813 , we want to forbid bind slot on brother's column
howerver the fix is not in correct way.
the correct way to do that is forbid subquery register itself in parent's analyzer.

This reverts commit b91a3b5.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by one committer. area/planner Issues or PRs related to the query planner dev/1.1.6 dev/1.2.4-merged kind/test reviewed usercase Important user case type label

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants