Skip to content

execbuilder: elide redundant locking lookup join in some simple cases#114401

Merged
craig[bot] merged 2 commits intocockroachdb:masterfrom
michae2:sfu_optimization
Dec 6, 2023
Merged

execbuilder: elide redundant locking lookup join in some simple cases#114401
craig[bot] merged 2 commits intocockroachdb:masterfrom
michae2:sfu_optimization

Conversation

@michae2
Copy link
Copy Markdown
Collaborator

@michae2 michae2 commented Nov 14, 2023

execbuilder: elide redundant locking lookup join in some simple cases

In some simple cases of SELECT FOR UPDATE using the new Lock operator, we can elide the locking lookup join and instead push the locking down into the input.

To optimize more complex cases we'll need to turn locking into a physical property which is required by the new Lock operator, similar to ordering.

Fixes: #114566

Release note: None


opt: add read committed variant of tpcc opttest

Add a copy of the tpcc opttest, but with read committed isolation. This shows the locking differences between plans built for serializable and plans built for read committed.

This doesn't show the optimization in the previous commit, because it applies during execbuilder.

Release note: None

@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@blathers-crl
Copy link
Copy Markdown

blathers-crl bot commented Nov 16, 2023

Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@michae2 michae2 marked this pull request as ready for review November 16, 2023 14:55
@michae2 michae2 requested a review from a team as a code owner November 16, 2023 14:55
@michae2 michae2 requested review from DrewKimball, nvb and yuzefovich and removed request for a team and yuzefovich November 16, 2023 14:55
@michae2 michae2 added the backport-23.2.x PAST MAINTENANCE SUPPORT: 23.2 patch releases via ER request only label Nov 16, 2023
@michae2 michae2 requested a review from mgartner November 16, 2023 14:56
@michae2
Copy link
Copy Markdown
Collaborator Author

michae2 commented Nov 16, 2023

I wanted to show a benchmark, but was failing to get the TPC-C roachtest running last night from my gceworker. I'll try to do that again sometime today.

@michae2
Copy link
Copy Markdown
Collaborator Author

michae2 commented Nov 16, 2023

@DrewKimball asked whether this should be an optimization rule, and also whether we would like more optimization rules for SFU, to allow the optimizer to choose plans that don't have the lookup join. That would allow us to avoid the lookup join in more cases.

I agree that an optimization rule would be more optimal. We can't use an optimization rule that removes the Lock operator, however, as we need that to act as an optimization barrier to prevent removing the locking side effect during exploration. Drew suggested an optimization rule that changed Lock's locking to ForNone but kept it in the plan. I am considering this, though it would make this PR a little bigger.

If it helps, here was my thinking for next steps before this suggestion:

  • Backport this PR to 23.2.
  • Revert this PR in 24.1 and instead make locking a physical property that is required by the Lock operator and provided by scan, index join, and lookup join, with lookup join as an enforcer. I think this will allow the optimizer to pick plans without the lookup join whenever optimal, while keeping Lock as an optimization barrier in the plan.

Copy link
Copy Markdown
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 3 files at r1, 3 of 3 files at r2, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @michae2, and @nvanbenschoten)


pkg/sql/opt/exec/execbuilder/mutation.go line 1173 at r1 (raw file):

	case *memo.LookupJoinExpr:
		if input.Table == lock.Table && input.Index == cat.PrimaryIndex &&
			input.JoinType == opt.InnerJoinOp && len(input.On) == 0 {

I think the tableID check ensures that the key columns are the same in practice, but it would be best to verify that.

We should probably check IsFirstJoinInPairedJoiner and IsSecondJoinInPairedJoiner as well.

Also, I'm assuming it's ok to lock the same row twice? There's no guarantee here and for the index-join case that the input doesn't join to the same row twice.


pkg/sql/opt/exec/execbuilder/mutation.go line 1173 at r1 (raw file):

	case *memo.LookupJoinExpr:
		if input.Table == lock.Table && input.Index == cat.PrimaryIndex &&
			input.JoinType == opt.InnerJoinOp && len(input.On) == 0 {

We could probably also allow semi-join and left-join here.


pkg/sql/opt/exec/execbuilder/testdata/select_for_update line 2752 at r1 (raw file):

# Push the locking down into the lookup join.
query T
EXPLAIN (VERBOSE) SELECT * FROM t JOIN u ON u.a = t.b WHERE t.a = 5 FOR UPDATE OF u

Another test idea: a lookup join where the input has duplicates, so the same row gets looked up more than once.


pkg/sql/opt/exec/execbuilder/testdata/select_for_update line 2800 at r1 (raw file):

          table: t@t_pkey
          spans: /5/0

Can we also test the following?

  1. Scan/lookup-join on a non-primary index.
  2. Scan/index-join/lookup-join on a different table.
  3. There is a lookup-join between a table and itself.
    a. The right side of the join is locked.
    b. The left side of the join is locked.

Copy link
Copy Markdown
Collaborator Author

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comments!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball, @mgartner, and @nvanbenschoten)


pkg/sql/opt/exec/execbuilder/mutation.go line 1173 at r1 (raw file):

I think the tableID check ensures that the key columns are the same in practice, but it would be best to verify that.

Maybe I'm misunderstanding, but I don't think we need the key columns to be the same. In fact, I wouldn't expect them to ever be the same due to the tableID check. For example, suppose we had a query like the following:

CREATE TABLE ab (a INT PRIMARY KEY, b INT);
CREATE TABLE cde (c INT, d INT, e INT, PRIMARY KEY (c, d));
SET optimizer_use_lock_op_for_serializable = true;
SELECT * FROM ab JOIN cde ON c = b WHERE a = 1 FOR UPDATE OF cde;

Here's what the explain looks like before the optimization:

demo@127.0.0.1:26257/demoapp/defaultdb> EXPLAIN (OPT, VERBOSE) SELECT * FROM ab JOIN cde ON c = b WHERE a = 1 FOR UPDATE OF cde;
                                              info
------------------------------------------------------------------------------------------------
  lock cde
   ├── columns: a:1 b:2 c:5 d:6 e:7
   ├── locking: for-update
   ├── volatile, mutations
   ├── stats: [rows=9.9]
   ├── cost: 54.217
   ├── key: (6)
   ├── fd: ()-->(1,2,5), (6)-->(7), (2)==(5), (5)==(2)
   ├── distribution: us-east1
   └── inner-join (lookup cde)
        ├── columns: a:1 b:2 c:5 d:6 e:7
        ├── key columns: [2] = [5]
        ├── stats: [rows=9.9, distinct(2)=1e-10, null(2)=0, distinct(5)=1e-10, null(5)=0]
        ├── cost: 54.197
        ├── key: (6)
        ├── fd: ()-->(1,2,5), (6)-->(7), (2)==(5), (5)==(2)
        ├── distribution: us-east1
        ├── scan ab
        │    ├── columns: a:1 b:2
        │    ├── constraint: /1: [/1 - /1]
        │    ├── cardinality: [0 - 1]
        │    ├── stats: [rows=1, distinct(1)=1, null(1)=0, distinct(2)=0.995512, null(2)=0.01]
        │    ├── cost: 9.09
        │    ├── key: ()
        │    ├── fd: ()-->(1,2)
        │    ├── distribution: us-east1
        │    └── prune: (2)
        └── filters (true)
(28 rows)

In this case the key columns of the inner lookup join are [2] and the key columns of the locking semi lookup join are [5, 6] (not printed) (which originate from the inner lookup join, ensured by the tableID check). We don't need the key columns to be the same to know that the locking lookup join will lock the same set of PKs as produced by the inner lookup join. We just need the inner lookup join to be on the same primary index, and not have ON conditions, and not be an anti-join.

We should probably check IsFirstJoinInPairedJoiner and IsSecondJoinInPairedJoiner as well.

Done.

Also, I'm assuming it's ok to lock the same row twice?

Yes (locking the same row twice was possible before the optimization, too). I've added a case to pkg/sql/logictest/testdata/logic_test/select_for_update_read_committed that does this.


pkg/sql/opt/exec/execbuilder/mutation.go line 1173 at r1 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

We could probably also allow semi-join and left-join here.

Good point. Added semi (although I'm having trouble producing it due to #114712).

Won't left joins be disallowed when I fix #97434?


pkg/sql/opt/exec/execbuilder/testdata/select_for_update line 2752 at r1 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Another test idea: a lookup join where the input has duplicates, so the same row gets looked up more than once.

Done (in pkg/sql/logictest/testdata/logic_test/select_for_update_read_committed).


pkg/sql/opt/exec/execbuilder/testdata/select_for_update line 2800 at r1 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Can we also test the following?

  1. Scan/lookup-join on a non-primary index.
  2. Scan/index-join/lookup-join on a different table.
  3. There is a lookup-join between a table and itself.
    a. The right side of the join is locked.
    b. The left side of the join is locked.

Did all except scan on a different table. (Is it possible to produce a plan with scans of two different tables without a join on top of the scans?)

@mgartner mgartner requested a review from DrewKimball November 22, 2023 16:09
Copy link
Copy Markdown
Contributor

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 3 files at r1, 6 of 6 files at r3, 1 of 3 files at r4, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball and @nvanbenschoten)


pkg/sql/opt/exec/execbuilder/mutation.go line 1173 at r1 (raw file):

Previously, michae2 (Michael Erickson) wrote…

I think the tableID check ensures that the key columns are the same in practice, but it would be best to verify that.

Maybe I'm misunderstanding, but I don't think we need the key columns to be the same. In fact, I wouldn't expect them to ever be the same due to the tableID check. For example, suppose we had a query like the following:

CREATE TABLE ab (a INT PRIMARY KEY, b INT);
CREATE TABLE cde (c INT, d INT, e INT, PRIMARY KEY (c, d));
SET optimizer_use_lock_op_for_serializable = true;
SELECT * FROM ab JOIN cde ON c = b WHERE a = 1 FOR UPDATE OF cde;

Here's what the explain looks like before the optimization:

demo@127.0.0.1:26257/demoapp/defaultdb> EXPLAIN (OPT, VERBOSE) SELECT * FROM ab JOIN cde ON c = b WHERE a = 1 FOR UPDATE OF cde;
                                              info
------------------------------------------------------------------------------------------------
  lock cde
   ├── columns: a:1 b:2 c:5 d:6 e:7
   ├── locking: for-update
   ├── volatile, mutations
   ├── stats: [rows=9.9]
   ├── cost: 54.217
   ├── key: (6)
   ├── fd: ()-->(1,2,5), (6)-->(7), (2)==(5), (5)==(2)
   ├── distribution: us-east1
   └── inner-join (lookup cde)
        ├── columns: a:1 b:2 c:5 d:6 e:7
        ├── key columns: [2] = [5]
        ├── stats: [rows=9.9, distinct(2)=1e-10, null(2)=0, distinct(5)=1e-10, null(5)=0]
        ├── cost: 54.197
        ├── key: (6)
        ├── fd: ()-->(1,2,5), (6)-->(7), (2)==(5), (5)==(2)
        ├── distribution: us-east1
        ├── scan ab
        │    ├── columns: a:1 b:2
        │    ├── constraint: /1: [/1 - /1]
        │    ├── cardinality: [0 - 1]
        │    ├── stats: [rows=1, distinct(1)=1, null(1)=0, distinct(2)=0.995512, null(2)=0.01]
        │    ├── cost: 9.09
        │    ├── key: ()
        │    ├── fd: ()-->(1,2)
        │    ├── distribution: us-east1
        │    └── prune: (2)
        └── filters (true)
(28 rows)

In this case the key columns of the inner lookup join are [2] and the key columns of the locking semi lookup join are [5, 6] (not printed) (which originate from the inner lookup join, ensured by the tableID check). We don't need the key columns to be the same to know that the locking lookup join will lock the same set of PKs as produced by the inner lookup join. We just need the inner lookup join to be on the same primary index, and not have ON conditions, and not be an anti-join.

We should probably check IsFirstJoinInPairedJoiner and IsSecondJoinInPairedJoiner as well.

Done.

Also, I'm assuming it's ok to lock the same row twice?

Yes (locking the same row twice was possible before the optimization, too). I've added a case to pkg/sql/logictest/testdata/logic_test/select_for_update_read_committed that does this.

Is it safe to push the locks into locality-optimized lookup joins (e.g., when RemoteLookupExpr FiltersExpr is non-nil, or when LocalityOptimized bool is true)? I believe we are free to not lock rows if we can answer the query without scanning them, so I think not locking remote rows in a LOS when we find all the rows locally should be ok, correct?

Copy link
Copy Markdown
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 6 of 6 files at r3, 3 of 3 files at r4, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @michae2 and @nvanbenschoten)


pkg/sql/opt/exec/execbuilder/mutation.go line 1173 at r1 (raw file):

Won't left joins be disallowed when I fix opt: don't allow locking null-extended rows #97434?

Good point, maybe we shouldn't worry about that then.


pkg/sql/opt/exec/execbuilder/testdata/select_for_update line 2800 at r1 (raw file):

Did all except scan on a different table. (Is it possible to produce a plan with scans of two different tables without a join on top of the scans?)

Thanks. Yeah, I don't think that's actually possible, good point.


pkg/sql/opt/exec/execbuilder/testdata/select_for_update line 2773 at r3 (raw file):

      spans: /5-/6

# Index join on a different table. In this case we can't push the locking down.

Looks like this one doesn't end up as an index-join.

Copy link
Copy Markdown
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @michae2 and @nvanbenschoten)


pkg/sql/opt/exec/execbuilder/testdata/select_for_update line 2773 at r3 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Looks like this one doesn't end up as an index-join.

Hinting the join should make this easier, I think.

Copy link
Copy Markdown
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This :lgtm: once the test is fixed up. If you can't get the optimizer to choose the right plan, I'm fine with skipping that particular case.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @michae2 and @nvanbenschoten)

In some simple cases of SELECT FOR UPDATE using the new Lock operator,
we can elide the locking lookup join and instead push the locking down
into the input.

To optimize more complex cases we'll need to turn locking into a
physical property which is required by the new Lock operator, similar to
ordering.

Fixes: cockroachdb#114566

Release note: None
Add a copy of the tpcc opttest, but with read committed isolation. This
shows the locking differences between plans built for serializable and
plans built for read committed.

This doesn't show the optimization in the previous commit, because it
applies during execbuilder.

Release note: None
Copy link
Copy Markdown
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @mgartner, @michae2, and @nvanbenschoten)

Copy link
Copy Markdown
Collaborator Author

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTRs!

If you can't get the optimizer to choose the right plan, I'm fine with skipping that particular case.

I wasn't able to produce a plan with a lock of table a on top of an index join of table b, so I skipped that for now.

bors r=DrewKimball

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball, @mgartner, and @nvanbenschoten)


pkg/sql/opt/exec/execbuilder/mutation.go line 1173 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

Is it safe to push the locks into locality-optimized lookup joins (e.g., when RemoteLookupExpr FiltersExpr is non-nil, or when LocalityOptimized bool is true)? I believe we are free to not lock rows if we can answer the query without scanning them, so I think not locking remote rows in a LOS when we find all the rows locally should be ok, correct?

Yes, I think not locking remote rows in a LOS is totally fine. It's similar to a SFU with a LIMIT: we don't need to (and don't want to) lock more rows after hitting the limit.


pkg/sql/opt/exec/execbuilder/testdata/select_for_update line 2773 at r3 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Hinting the join should make this easier, I think.

We don't support INNER INDEX JOIN as a hint (not sure if this is what you meant).

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Dec 6, 2023

Build succeeded:

@mgartner
Copy link
Copy Markdown
Contributor

blathers backport 23.2.0-rc

michae2 added a commit to michae2/cockroach that referenced this pull request Apr 16, 2025
In cockroachdb#114401 we added some simple optimizations to push locking down into
input operations for SELECT FOR UPDATE under Read Committed
isolation. In cockroachdb#116170 we disabled these optimizations for
multi-column-family tables because we weren't sure that all necessary
column families would be locked.

This commit enables the optimizations for multi-column-family tables,
after confirming that all families needing to be locked are fetched by
the input operation.

Informs: cockroachdb#114566, cockroachdb#116838

Release note: None
blathers-crl bot pushed a commit that referenced this pull request Apr 16, 2025
In #114401 we added some simple optimizations to push locking down into
input operations for SELECT FOR UPDATE under Read Committed
isolation. In #116170 we disabled these optimizations for
multi-column-family tables because we weren't sure that all necessary
column families would be locked.

This commit enables the optimizations for multi-column-family tables,
after confirming that all families needing to be locked are fetched by
the input operation.

Informs: #114566, #116838

Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport-23.2.x PAST MAINTENANCE SUPPORT: 23.2 patch releases via ER request only

Projects

None yet

Development

Successfully merging this pull request may close these issues.

sql: simple SFU queries slower under read committed than serializable

4 participants