opt: do not push LIMIT into the scan of a virtual table#79313
opt: do not push LIMIT into the scan of a virtual table#79313craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
|
Thanks for getting to this! I tagged this PR for backporting |
rharding6373
left a comment
There was a problem hiding this comment.
Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @mgartner)
|
I have to look into the GitHub CI failures. I may possibly change the fix to be limited to the GenerateLimitedScans rule, to minimize impact since this is tagged for backport. |
mgartner
left a comment
There was a problem hiding this comment.
Virutal tables are added to the test catalog, so you could add a unit test in pkg/sql/opt/ordering/scan_test.go like:
md := f.Metadata()
tn := tree.NewUnqualifiedTableName("t")
tab := md.AddTable(tc.Table(tn), tn)
+ vtn := tree.MakeTableNameWithSchema("", "pg_catalog", "pg_type")
+ vtab := md.AddTable(tc.Table(&vtn), &vtn)
+ vidx := 1 // The INDEX (oid) of pg_type.
// ...
},
+ { // group 6: scan of virtual table.
+ p: memo.ScanPrivate{
+ Table: vtab,
+ Index: vidx,
+ Cols: opt.MakeColSet(1, 2, 3, 4),
+ },
+ cases: []testCase{
+ {req: "", exp: "fwd", prov: ""}, // case 1
+ {req: "+1", exp: "no", prov: ""}, // case 2
+ {req: "-1", exp: "no", prov: ""}, // case 3
+ {req: "+1,+2", exp: "no", prov: ""}, // case 4
+ {req: "+2", exp: "no", prov: ""}, // case 4
+ },
+ },
}
for gIdx, g := range tests {
Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status:complete! 2 of 0 LGTMs obtained (waiting on @msirek)
pkg/sql/opt/ordering/scan.go, line 53 at r1 (raw file):
) (ok bool, reverse bool) { table := md.Table(s.Table) if table.IsVirtualTable() && !required.Any() {
nit: Can you leave a comment here like "A scan on a virtual table index does is not guaranteed to produce values in order, therefore it cannot provide a specific ordering."
mgartner
left a comment
There was a problem hiding this comment.
Oops, I messed up the column IDs above. A test that actually fails on master is:
md := f.Metadata()
tn := tree.NewUnqualifiedTableName("t")
tab := md.AddTable(tc.Table(tn), tn)
+ vtn := tree.MakeTableNameWithSchema("", "pg_catalog", "pg_type")
+ vtab := md.AddTable(tc.Table(&vtn), &vtn)
+ vidx := 1 // The INDEX (oid) of pg_type.
// ...
},
+ { // group 6: scan of virtual table.
+ p: memo.ScanPrivate{
+ Table: vtab,
+ Index: vidx,
+ Cols: opt.MakeColSet(oidCol, oidCol+1),
+ },
+ cases: []testCase{
+ {req: "", exp: "fwd", prov: ""}, // case 1
+ {req: fmt.Sprintf("+%d", oidCol), exp: "no", prov: ""}, // case 2
+ {req: fmt.Sprintf("-%d", oidCol), exp: "no", prov: ""}, // case 3
+ {req: fmt.Sprintf("+%d,+%d", oidCol, oidCol+1), exp: "no", prov: ""}, // case 4
+ {req: fmt.Sprintf("+%d", oidCol+1), exp: "no", prov: ""}, // case 4
+ },
+ },
}
for gIdx, g := range tests {
Reviewable status:
complete! 2 of 0 LGTMs obtained (waiting on @msirek)
mgartner
left a comment
There was a problem hiding this comment.
Where:
+ oidCol := vtab.ColumnID(1)Reviewable status:
complete! 2 of 0 LGTMs obtained (waiting on @msirek)
mgartner
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 2 of 0 LGTMs obtained (waiting on @msirek)
pkg/sql/opt/ordering/scan.go, line 53 at r1 (raw file):
) (ok bool, reverse bool) { table := md.Table(s.Table) if table.IsVirtualTable() && !required.Any() {
Also, the check for !required.Any() shouldn't be necessary, because we should never get here with an any ordering due to this line:
msirek
left a comment
There was a problem hiding this comment.
scan_test.go
Thanks for the tip. I changed the fix to only impact limited scans, so this unit test is no longer applicable. The build/test cycle took way too long with the old fix and lots of query plans were changing. Since this PR is marked for backport, I think it is safer to impact as few internal queries as possible. I saw many merge joins changing to hash joins. It makes me wonder what the impact of using lots of hash joins is. Does it really perform that much worse, or does it cause memory pressure requiring more garbage collection. In any case, please take a look at the latest fix at your convenience. Since virtual indexes can only have one key column, I did not have to handle the SplitLimitedSelectIntoUnionSelects transformation rule.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @mgartner and @rharding6373)
pkg/sql/opt/ordering/scan.go, line 53 at r1 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: Can you leave a comment here like "A scan on a virtual table index does is not guaranteed to produce values in order, therefore it cannot provide a specific ordering."
This code is removed.
pkg/sql/opt/ordering/scan.go, line 53 at r1 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
Also, the check for
!required.Any()shouldn't be necessary, because we should never get here with an any ordering due to this line:
This code is removed.
rytaft
left a comment
There was a problem hiding this comment.
nit: update PR description so that everything below the EXPLAIN plan doesn't also look like code
It would be good to include a logictest as a regression test similar to the one in #78578.
Reviewed 4 of 4 files at r2, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (and 2 stale) (waiting on @mgartner and @msirek)
pkg/sql/opt/exec/execbuilder/testdata/limit, line 555 at r2 (raw file):
limit: 10 # A limit cannot be pushed into the constraint scan of a virtual table with
nit: constraint -> constrained
mgartner
left a comment
There was a problem hiding this comment.
It would be good to include a logictest as a regression test similar to the one in #78578.
+1
The build/test cycle took way too long with the old fix and lots of query plans were changing.
Interesting. I'd expect merge joins to still be generated with an explicit sort in the optimizer plan - merge joins rely on there being an "interesting ordering" in the left and right side that match the equality columns. And your original change didn't affect interesting orderings. Perhaps the merge joins were still being generated, but were not selected as the optimal plan due to the explicit sort operator? And/or the plans did have merge joins, but there was double sorting during execution (implicitly, then explicitly) which caused things to slow down? 🤷
Reviewed 3 of 4 files at r2.
Reviewable status:complete! 1 of 0 LGTMs obtained (and 2 stale) (waiting on @msirek)
pkg/sql/opt/xform/general_funcs.go, line 72 at r2 (raw file):
// IsVirtualTable returns true if the table being scanned is a virtual table. func (c *CustomFuncs) IsVirtualTable(scanPrivate *memo.ScanPrivate) bool {
nit: This function might be better as a method on *memo.ScanPrivate, next to these similar functions:
cockroach/pkg/sql/opt/memo/expr.go
Lines 653 to 669 in 249a962
pkg/sql/opt/xform/limit_funcs.go, line 63 at r2 (raw file):
return false } if c.IsVirtualTable(scanPrivate) && !required.Any() {
nit: Add comment explaining this edge case
pkg/sql/opt/xform/limit_funcs.go, line 92 at r2 (raw file):
grp memo.RelExpr, scanPrivate *memo.ScanPrivate, limit tree.Datum, required props.OrderingChoice, ) { if c.IsVirtualTable(scanPrivate) && !required.Any() {
nit: add comment explaining this edge case
Fixes cockroachdb#78578 Previously, a LIMIT operation could be pushed into the scan of a virtual table with an ORDER BY clause. This was inadequate because in-order scans of virtual indexes aren't supported. When an index that should provide the order requested by a query is used, a sort is actually produced under the covers: ``` EXPLAIN(vec) SELECT oid, typname FROM pg_type ORDER BY OID; info ---------------------------------- │ └ Node 1 └ *colexec.sortOp └ *sql.planNodeToRowSource ``` Functions `CanLimitFilteredScan` and `GenerateLimitedScans` are modified to avoid pushing LIMIT operations into ordered scans of virtual indexes. Release justification: Low risk fix for incorrect results in queries involving virtual system tables. Release note (bug fix): LIMIT queries with an ORDER BY clause which scan the index of a virtual system tables, such as `pg_type`, could previously return incorrect results. This is corrected by teaching the optimizer that LIMIT operations cannot be pushed into ordered scans of virtual indexes.
msirek
left a comment
There was a problem hiding this comment.
Yeah, I had checked a merge join test case for double-sorting, and did not see it, so that can't account for the slow down. I didn't pinpoint the code which automatically adds the sort under the covers, but since virtual indexes appear to have a sort order to the optimizer, that makes merge join possible. So I think it's just that we couldn't pick merge join that caused the slow down. That begs the question, if sort plus merge join is better than hash join in some (or many) cases, maybe we shouldn't require inputs to already be sorted in order to consider merge join plans. Sort-merge join is an industry standard join algorithm. Is it true we currently don't consider sort-merge join? I know it can be forced with a query hint. I think I will take a closer look at this.
I've added a logictest and fixed the PR description.
TFTRs!
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 3 stale) (waiting on @mgartner, @rharding6373, and @rytaft)
pkg/sql/opt/xform/limit_funcs.go, line 63 at r2 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: Add comment explaining this edge case
Added:
// Virtual indexes are not sorted, but are presented to the optimizer as
// sorted, and have a sort automatically applied to them on the output of the
// scan. Since this implicit sort happens after the scan, we can't place a
// hard limit on the scan if the query semantics require a sort order on the
// entire relation.
pkg/sql/opt/xform/limit_funcs.go, line 92 at r2 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: add comment explaining this edge case
Added the same comment as above.
pkg/sql/opt/exec/execbuilder/testdata/limit, line 555 at r2 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
nit: constraint -> constrained
fixed
pkg/sql/opt/xform/general_funcs.go, line 72 at r2 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: This function might be better as a method on
*memo.ScanPrivate, next to these similar functions:cockroach/pkg/sql/opt/memo/expr.go
Lines 653 to 669 in 249a962
Thanks. Made the change
|
bors r+ |
|
pkg/sql/logictest/testdata/logic_test/limit, line 356 at r3 (raw file):
nit: I think the cast to |
|
Build succeeded: |
|
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from 1d82da6 to blathers/backport-release-21.1-79313: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 21.1.x failed. See errors above. error creating merge commit from 1d82da6 to blathers/backport-release-21.2-79313: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 21.2.x failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
Fixes #78578
Previously, a LIMIT operation could be pushed into the scan of a virtual
table with an ORDER BY clause.
This was inadequate because in-order scans of virtual indexes aren't
supported. When an index that should provide the order requested by a
query is used, a sort is actually produced under the covers:
Functions
CanLimitFilteredScanandGenerateLimitedScansare modifiedto avoid pushing LIMIT operations into ordered scans of virtual indexes.
Release justification: Low risk fix for incorrect results in queries
involving virtual system tables.
Release note (bug fix): LIMIT queries with an ORDER BY clause which scan
the index of a virtual system tables, such as
pg_type, couldpreviously return incorrect results. This is corrected by teaching the
optimizer that LIMIT operations cannot be pushed into ordered scans of
virtual indexes.