Skip to content

opt: do not push LIMIT into the scan of a virtual table#79313

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
msirek:78578
Apr 5, 2022
Merged

opt: do not push LIMIT into the scan of a virtual table#79313
craig[bot] merged 1 commit intocockroachdb:masterfrom
msirek:78578

Conversation

@msirek
Copy link
Copy Markdown
Contributor

@msirek msirek commented Apr 4, 2022

Fixes #78578

Previously, a LIMIT operation could be pushed into the scan of a virtual
table with an ORDER BY clause.

This was inadequate because in-order scans of virtual indexes aren't
supported. When an index that should provide the order requested by a
query is used, a sort is actually produced under the covers:

EXPLAIN(vec)
SELECT oid, typname FROM pg_type ORDER BY OID;
               info
----------------------------------
  │
  └ Node 1
    └ *colexec.sortOp
      └ *sql.planNodeToRowSource

Functions CanLimitFilteredScan and GenerateLimitedScans are modified
to avoid pushing LIMIT operations into ordered scans of virtual indexes.

Release justification: Low risk fix for incorrect results in queries
involving virtual system tables.

Release note (bug fix): LIMIT queries with an ORDER BY clause which scan
the index of a virtual system tables, such as pg_type, could
previously return incorrect results. This is corrected by teaching the
optimizer that LIMIT operations cannot be pushed into ordered scans of
virtual indexes.

@msirek msirek requested review from a team, mgartner and rharding6373 April 4, 2022 07:23
@msirek msirek requested a review from a team as a code owner April 4, 2022 07:23
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@rafiss
Copy link
Copy Markdown
Collaborator

rafiss commented Apr 4, 2022

Thanks for getting to this! I tagged this PR for backporting

Copy link
Copy Markdown
Collaborator

@rharding6373 rharding6373 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: nice!

Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @mgartner)

@msirek
Copy link
Copy Markdown
Contributor Author

msirek commented Apr 4, 2022

I have to look into the GitHub CI failures. I may possibly change the fix to be limited to the GenerateLimitedScans rule, to minimize impact since this is tagged for backport.

Copy link
Copy Markdown
Contributor

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! :lgtm:

Virutal tables are added to the test catalog, so you could add a unit test in pkg/sql/opt/ordering/scan_test.go like:

  md := f.Metadata()
  tn := tree.NewUnqualifiedTableName("t")
  tab := md.AddTable(tc.Table(tn), tn)
+ vtn := tree.MakeTableNameWithSchema("", "pg_catalog", "pg_type")
+ vtab := md.AddTable(tc.Table(&vtn), &vtn)
+ vidx := 1 // The INDEX (oid) of pg_type.

// ...
 
     },
+    { // group 6: scan of virtual table.
+      p: memo.ScanPrivate{
+        Table: vtab,
+        Index: vidx,
+        Cols:  opt.MakeColSet(1, 2, 3, 4),
+      },
+      cases: []testCase{
+        {req: "", exp: "fwd", prov: ""},     // case 1
+        {req: "+1", exp: "no", prov: ""},    // case 2
+        {req: "-1", exp: "no", prov: ""},    // case 3
+        {req: "+1,+2", exp: "no", prov: ""}, // case 4
+        {req: "+2", exp: "no", prov: ""},    // case 4
+      },
+    },
   }
 
   for gIdx, g := range tests {

Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @msirek)


pkg/sql/opt/ordering/scan.go, line 53 at r1 (raw file):

) (ok bool, reverse bool) {
	table := md.Table(s.Table)
	if table.IsVirtualTable() && !required.Any() {

nit: Can you leave a comment here like "A scan on a virtual table index does is not guaranteed to produce values in order, therefore it cannot provide a specific ordering."

Copy link
Copy Markdown
Contributor

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, I messed up the column IDs above. A test that actually fails on master is:

  md := f.Metadata()
  tn := tree.NewUnqualifiedTableName("t")
  tab := md.AddTable(tc.Table(tn), tn)
+ vtn := tree.MakeTableNameWithSchema("", "pg_catalog", "pg_type")
+ vtab := md.AddTable(tc.Table(&vtn), &vtn)
+ vidx := 1 // The INDEX (oid) of pg_type.

// ...

     },
+    { // group 6: scan of virtual table.
+      p: memo.ScanPrivate{
+        Table: vtab,
+        Index: vidx,
+        Cols:  opt.MakeColSet(oidCol, oidCol+1),
+      },
+      cases: []testCase{
+        {req: "", exp: "fwd", prov: ""},                                      // case 1
+        {req: fmt.Sprintf("+%d", oidCol), exp: "no", prov: ""},               // case 2
+        {req: fmt.Sprintf("-%d", oidCol), exp: "no", prov: ""},               // case 3
+        {req: fmt.Sprintf("+%d,+%d", oidCol, oidCol+1), exp: "no", prov: ""}, // case 4
+        {req: fmt.Sprintf("+%d", oidCol+1), exp: "no", prov: ""},             // case 4
+      },
+    },
   }
 
   for gIdx, g := range tests {

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @msirek)

Copy link
Copy Markdown
Contributor

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where:

+ 	oidCol := vtab.ColumnID(1)

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @msirek)

Copy link
Copy Markdown
Contributor

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @msirek)


pkg/sql/opt/ordering/scan.go, line 53 at r1 (raw file):

) (ok bool, reverse bool) {
	table := md.Table(s.Table)
	if table.IsVirtualTable() && !required.Any() {

Also, the check for !required.Any() shouldn't be necessary, because we should never get here with an any ordering due to this line:

if required.Any() {

Copy link
Copy Markdown
Contributor Author

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scan_test.go

Thanks for the tip. I changed the fix to only impact limited scans, so this unit test is no longer applicable. The build/test cycle took way too long with the old fix and lots of query plans were changing. Since this PR is marked for backport, I think it is safer to impact as few internal queries as possible. I saw many merge joins changing to hash joins. It makes me wonder what the impact of using lots of hash joins is. Does it really perform that much worse, or does it cause memory pressure requiring more garbage collection. In any case, please take a look at the latest fix at your convenience. Since virtual indexes can only have one key column, I did not have to handle the SplitLimitedSelectIntoUnionSelects transformation rule.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @mgartner and @rharding6373)


pkg/sql/opt/ordering/scan.go, line 53 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

nit: Can you leave a comment here like "A scan on a virtual table index does is not guaranteed to produce values in order, therefore it cannot provide a specific ordering."

This code is removed.


pkg/sql/opt/ordering/scan.go, line 53 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

Also, the check for !required.Any() shouldn't be necessary, because we should never get here with an any ordering due to this line:

if required.Any() {

This code is removed.

Copy link
Copy Markdown
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: update PR description so that everything below the EXPLAIN plan doesn't also look like code

It would be good to include a logictest as a regression test similar to the one in #78578.

Otherwise, :lgtm:

Reviewed 4 of 4 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 2 stale) (waiting on @mgartner and @msirek)


pkg/sql/opt/exec/execbuilder/testdata/limit, line 555 at r2 (raw file):

  limit: 10

# A limit cannot be pushed into the constraint scan of a virtual table with

nit: constraint -> constrained

Copy link
Copy Markdown
Contributor

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to include a logictest as a regression test similar to the one in #78578.

+1

The build/test cycle took way too long with the old fix and lots of query plans were changing.

Interesting. I'd expect merge joins to still be generated with an explicit sort in the optimizer plan - merge joins rely on there being an "interesting ordering" in the left and right side that match the equality columns. And your original change didn't affect interesting orderings. Perhaps the merge joins were still being generated, but were not selected as the optimal plan due to the explicit sort operator? And/or the plans did have merge joins, but there was double sorting during execution (implicitly, then explicitly) which caused things to slow down? 🤷

Reviewed 3 of 4 files at r2.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 2 stale) (waiting on @msirek)


pkg/sql/opt/xform/general_funcs.go, line 72 at r2 (raw file):

// IsVirtualTable returns true if the table being scanned is a virtual table.
func (c *CustomFuncs) IsVirtualTable(scanPrivate *memo.ScanPrivate) bool {

nit: This function might be better as a method on *memo.ScanPrivate, next to these similar functions:

// IsUnfiltered returns true if the ScanPrivate will produce all rows in the
// table.
func (s *ScanPrivate) IsUnfiltered(md *opt.Metadata) bool {
return (s.Constraint == nil || s.Constraint.IsUnconstrained()) &&
s.InvertedConstraint == nil &&
s.HardLimit == 0 &&
s.PartialIndexPredicate(md) == nil
}
// IsFullIndexScan returns true if the ScanPrivate will produce all rows in the
// index.
func (s *ScanPrivate) IsFullIndexScan(md *opt.Metadata) bool {
return (s.Constraint == nil || s.Constraint.IsUnconstrained()) &&
s.InvertedConstraint == nil &&
s.HardLimit == 0
}


pkg/sql/opt/xform/limit_funcs.go, line 63 at r2 (raw file):

		return false
	}
	if c.IsVirtualTable(scanPrivate) && !required.Any() {

nit: Add comment explaining this edge case


pkg/sql/opt/xform/limit_funcs.go, line 92 at r2 (raw file):

	grp memo.RelExpr, scanPrivate *memo.ScanPrivate, limit tree.Datum, required props.OrderingChoice,
) {
	if c.IsVirtualTable(scanPrivate) && !required.Any() {

nit: add comment explaining this edge case

Fixes cockroachdb#78578

Previously, a LIMIT operation could be pushed into the scan of a virtual
table with an ORDER BY clause.

This was inadequate because in-order scans of virtual indexes aren't
supported. When an index that should provide the order requested by a
query is used, a sort is actually produced under the covers:
```
EXPLAIN(vec)
SELECT oid, typname FROM pg_type ORDER BY OID;
               info
----------------------------------
  │
  └ Node 1
    └ *colexec.sortOp
      └ *sql.planNodeToRowSource

```
Functions `CanLimitFilteredScan` and `GenerateLimitedScans` are modified
to avoid pushing LIMIT operations into ordered scans of virtual indexes.

Release justification: Low risk fix for incorrect results in queries
involving virtual system tables.

Release note (bug fix): LIMIT queries with an ORDER BY clause which scan
the index of a virtual system tables, such as `pg_type`, could
previously return incorrect results. This is corrected by teaching the
optimizer that LIMIT operations cannot be pushed into ordered scans of
virtual indexes.
Copy link
Copy Markdown
Contributor Author

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I had checked a merge join test case for double-sorting, and did not see it, so that can't account for the slow down. I didn't pinpoint the code which automatically adds the sort under the covers, but since virtual indexes appear to have a sort order to the optimizer, that makes merge join possible. So I think it's just that we couldn't pick merge join that caused the slow down. That begs the question, if sort plus merge join is better than hash join in some (or many) cases, maybe we shouldn't require inputs to already be sorted in order to consider merge join plans. Sort-merge join is an industry standard join algorithm. Is it true we currently don't consider sort-merge join? I know it can be forced with a query hint. I think I will take a closer look at this.

I've added a logictest and fixed the PR description.
TFTRs!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 3 stale) (waiting on @mgartner, @rharding6373, and @rytaft)


pkg/sql/opt/xform/limit_funcs.go, line 63 at r2 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

nit: Add comment explaining this edge case

Added:

	// Virtual indexes are not sorted, but are presented to the optimizer as
	// sorted, and have a sort automatically applied to them on the output of the
	// scan. Since this implicit sort happens after the scan, we can't place a
	// hard limit on the scan if the query semantics require a sort order on the
	// entire relation.

pkg/sql/opt/xform/limit_funcs.go, line 92 at r2 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

nit: add comment explaining this edge case

Added the same comment as above.


pkg/sql/opt/exec/execbuilder/testdata/limit, line 555 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

nit: constraint -> constrained

fixed


pkg/sql/opt/xform/general_funcs.go, line 72 at r2 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

nit: This function might be better as a method on *memo.ScanPrivate, next to these similar functions:

// IsUnfiltered returns true if the ScanPrivate will produce all rows in the
// table.
func (s *ScanPrivate) IsUnfiltered(md *opt.Metadata) bool {
return (s.Constraint == nil || s.Constraint.IsUnconstrained()) &&
s.InvertedConstraint == nil &&
s.HardLimit == 0 &&
s.PartialIndexPredicate(md) == nil
}
// IsFullIndexScan returns true if the ScanPrivate will produce all rows in the
// index.
func (s *ScanPrivate) IsFullIndexScan(md *opt.Metadata) bool {
return (s.Constraint == nil || s.Constraint.IsUnconstrained()) &&
s.InvertedConstraint == nil &&
s.HardLimit == 0
}

Thanks. Made the change

@msirek
Copy link
Copy Markdown
Contributor Author

msirek commented Apr 5, 2022

bors r+

@mgartner
Copy link
Copy Markdown
Contributor

mgartner commented Apr 5, 2022

pkg/sql/logictest/testdata/logic_test/limit, line 356 at r3 (raw file):


query IT
SELECT oid::INT, typname FROM pg_type ORDER BY oid LIMIT 3

nit: I think the cast to oid::INT is preventing the limit from being pushed down (I'm not sure why exactly), so this regression test doesn't fail before the fix. I realize you already bors'd though, so up to you to fix it in a future PR if you want.

demo@127.0.0.1:26257/defaultdb> explain SELECT oid::INT, typname FROM pg_type ORDER BY oid LIMIT 3;
                  info
----------------------------------------
  distribution: local
  vectorized: true

  • limit
  │ count: 3
  │
  └── • sort
      │ order: +oid
      │
      └── • render
          │
          └── • virtual table
                table: pg_type@primary
(13 rows)


Time: 1ms total (execution 0ms / network 0ms)

demo@127.0.0.1:26257/defaultdb> explain SELECT oid, typname FROM pg_type ORDER BY oid LIMIT 3;
                info
------------------------------------
  distribution: local
  vectorized: true

  • virtual table
    table: pg_type@pg_type_oid_idx
    limit: 3
(6 rows)

@craig craig bot merged commit 985344a into cockroachdb:master Apr 5, 2022
@craig
Copy link
Copy Markdown
Contributor

craig bot commented Apr 5, 2022

Build succeeded:

@blathers-crl
Copy link
Copy Markdown

blathers-crl bot commented Apr 5, 2022

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from 1d82da6 to blathers/backport-release-21.1-79313: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 21.1.x failed. See errors above.


error creating merge commit from 1d82da6 to blathers/backport-release-21.2-79313: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 21.2.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

sql: ORDER BY does not work on pg_type table when sorting by virtual index key

6 participants