Skip to content

sql/opt: split out row-level locking properties struct#79679

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
nvb:nvanbenschoten/lockingPhysicalProp
Apr 20, 2022
Merged

sql/opt: split out row-level locking properties struct#79679
craig[bot] merged 1 commit intocockroachdb:masterfrom
nvb:nvanbenschoten/lockingPhysicalProp

Conversation

@nvb
Copy link
Copy Markdown
Contributor

@nvb nvb commented Apr 8, 2022

This commit separates an unresolved row-level locking target from a new row-level physical property struct. The former is used to represent a row-level locking clause that is then "resolved" to specific relational operators. Once attached to a specific operator, the locking mode no longer needs to include a target, so it makes sense to represent it as a separate struct.

This addresses @RaduBerinde's comment from #60719.

I think it's correct to call row-level locking a "physical property", but I could be wrong about that.

@nvb nvb requested review from a team and rytaft April 8, 2022 18:16
@nvb nvb requested a review from a team as a code owner April 8, 2022 18:16
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

Copy link
Copy Markdown
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's correct to call row-level locking a "physical property", but I could be wrong about that.

Hmm I see where you're coming from, but as defined currently, I don't think Locking quite meets the definition we use for physical property. For reference:

"Physical properties" are interesting characteristics of an expression that
impact its layout, presentation, or location, but not its logical content.
Examples include row order, column naming, and data distribution (physical
location of data ranges). Physical properties exist outside of the relational
algebra, and arise from both the SQL query itself (e.g. the non-relational
ORDER BY operator) and by the selection of specific implementations during
optimization (e.g. a merge join requires the inputs to be sorted in a
particular order).
Properties can be "required" or "derived". A required property is one specified
by the SQL query text. For example, a DISTINCT clause is a required property on
the set of columns of the corresponding projection -- that the tuple of columns
forms a key (unique values) in the results. A derived property is one derived
by the optimizer for an expression based on the properties of the child
expressions. For example:
SELECT k+1 FROM kv
Once the ordering of "k" is known from kv's descriptor, the same ordering
property can be derived for k+1. During optimization, for each expression with
required properties, the optimizer will look at child expressions to check
whether their actual properties (which can be derived) match the requirement.
If they don't, the optimizer must introduce an "enforcer" operator in the plan
that provides the required property. For example, an ORDER BY clause creates a
required ordering property that can cause the optimizer to add a Sort operator
as an enforcer of that property.

// Required properties are interesting characteristics of an expression that
// impact its layout, presentation, or location, but not its logical content.
// Examples include row order, column naming, and data distribution (physical
// location of data ranges). Physical properties exist outside of the relational
// algebra, and arise from both the SQL query itself (e.g. the non-relational
// ORDER BY operator) and by the selection of specific implementations during
// optimization (e.g. a merge join requires the inputs to be sorted in a
// particular order).
//
// Required properties are derived top-to-bottom - there is a required physical
// property on the root, and each expression can require physical properties on
// one or more of its operands. When an expression is optimized, it is always
// with respect to a particular set of required physical properties. The goal
// is to find the lowest cost expression that provides those properties while
// still remaining logically equivalent.
type Required struct {
// Presentation specifies the naming, membership (including duplicates),
// and order of result columns. If Presentation is not defined, then no
// particular column presentation is required or provided.
Presentation Presentation
// Ordering specifies the sort order of result rows. Rows can be sorted by
// one or more columns, each of which can be sorted in either ascending or
// descending order. If Ordering is not defined, then no particular ordering
// is required or provided.
Ordering props.OrderingChoice
// LimitHint specifies a "soft limit" to the number of result rows that may
// be required of the expression. If requested, an expression will still need
// to return all result rows, but it can be optimized based on the assumption
// that only the hinted number of rows will be needed.
// A LimitHint of 0 indicates "no limit". The LimitHint is an intermediate
// float64 representation, and can be converted to an integer number of rows
// using math.Ceil.
LimitHint float64
// Distribution specifies the physical distribution of result rows. This is
// defined as the set of regions that may contain result rows. If
// Distribution is not defined, then no particular distribution is required.
// Currently, the only operator in a plan tree that has a required
// distribution is the root, since data must always be returned to the gateway
// region.
Distribution Distribution
}

// Provided physical properties of an operator. An operator might be able to
// satisfy a required property in multiple ways, and additional information is
// necessary for execution. For example, the required properties may allow
// multiple ordering choices; the provided properties would describe the
// specific ordering that has to be respected during execution.
//
// Provided properties are derived bottom-up (on the lowest cost tree).
type Provided struct {
// Ordering is an ordering that needs to be maintained on the rows produced by
// this operator in order to satisfy its required ordering. This is useful for
// configuring execution in a distributed setting, where results from multiple
// nodes may need to be merged. A best-effort attempt is made to have as few
// columns as possible.
//
// The ordering, in conjunction with the functional dependencies (in the
// logical properties), must intersect the required ordering.
//
// See the documentation for the opt/ordering package for some examples.
Ordering opt.Ordering
// Distribution is a distribution that needs to be maintained on the rows
// produced by this operator in order to satisfy its required distribution. If
// there is a required distribution, the provided distribution must match it
// exactly.
//
// The provided distribution is not yet used when building the DistSQL plan,
// but eventually it should inform the decision about whether to plan
// processors locally or remotely. Currently, it is used to determine whether
// a Distribute operator is needed between this operator and its parent, which
// can affect the cost of a plan.
Distribution Distribution
}

If we want Locking to be a physical property, it should be added to physical.Provided and/or physical.Required, and set similarly to the other physical properties. I'm not sure this is the right approach.

Also, unlike the other physical properties which may apply to any expression, Locking currently seems pretty specific to scans. If we aren't using it for other expressions, I think it might be better to define it in memo, the same place we define the ScanPrivate.

Out of curiosity, though, could Locking also apply to other expressions that touch tables, namely zigzag join, lookup join, and index join?

Reviewed 36 of 36 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)


pkg/sql/opt/props/physical/locking.go, line 16 at r1 (raw file):

// Locking represents the row-level locking properties of a relational operator.
// Each relational operator clauses consist of two different row-level locking

nit: Each ... clauses consist -> Each ... clause consists

@rytaft rytaft requested a review from a team April 13, 2022 00:41
@nvb nvb force-pushed the nvanbenschoten/lockingPhysicalProp branch from 5061051 to 7c0ce63 Compare April 13, 2022 22:54
@nvb nvb changed the title sql/opt: split out row-level locking physical property struct sql/opt: split out row-level locking properties struct Apr 13, 2022
@nvb
Copy link
Copy Markdown
Contributor Author

nvb commented Apr 13, 2022

Thanks for taking a look Becca!

If we want Locking to be a physical property, it should be added to physical.Provided and/or physical.Required, and set similarly to the other physical properties. I'm not sure this is the right approach.

Also, unlike the other physical properties which may apply to any expression, Locking currently seems pretty specific to scans. If we aren't using it for other expressions, I think it might be better to define it in memo, the same place we define the ScanPrivate.

I don't want Locking to be a physical property unless it meets the definition, but it doesn't sound like it does. I looked into moving the struct to memo, but that introduced a dependency from pkg/sql/opt/exec on pkg/sql/opt/memo, which felt wrong to me. Instead, I moved the struct to pkg/sql/opt, which introduced no new dependencies. Let me know if this feels like the wrong place to put it.

Out of curiosity, though, could Locking also apply to other expressions that touch tables, namely zigzag join, lookup join, and index join?

Yep! That's what we're adding in #60719, which is the PR that motivated this change.

Copy link
Copy Markdown
Contributor

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 10 of 36 files at r1, 27 of 27 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten)


pkg/sql/opt/ops/relational.opt, line 76 at r2 (raw file):

    # locking will be performed while scanning the table. Stronger locking modes
    # are used by SELECT .. FOR [KEY] UPDATE/SHARE statements and by the initial
    # row retrieval of DELETE and UPDATE statements.

nit: briefly mention locking wait policy in the comment

@mgartner
Copy link
Copy Markdown
Contributor

pkg/sql/opt/memo/interner_test.go, line 469 at r2 (raw file):

			{
				val1:  opt.Locking{Strength: tree.ForUpdate, WaitPolicy: tree.LockWaitError},
				val2:  opt.Locking{Strength: tree.ForUpdate, WaitPolicy: tree.LockWaitError},

Can you make these fields unnamed so that the test won't compile if a new field is added? For example: opt.Locking{tree.ForUpdate, tree.LockWairError}.

@nvb nvb force-pushed the nvanbenschoten/lockingPhysicalProp branch from 7c0ce63 to 6ea0a76 Compare April 19, 2022 18:26
Copy link
Copy Markdown
Contributor Author

@nvb nvb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner and @rytaft)


pkg/sql/opt/ops/relational.opt, line 76 at r2 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

nit: briefly mention locking wait policy in the comment

Done.


pkg/sql/opt/memo/interner_test.go, line 469 at r2 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

Can you make these fields unnamed so that the test won't compile if a new field is added? For example: opt.Locking{tree.ForUpdate, tree.LockWairError}.

Done.


pkg/sql/opt/props/physical/locking.go, line 16 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

nit: Each ... clauses consist -> Each ... clause consists

Done.

This commit separates an unresolved row-level locking target from a new
row-level locking properties struct. The former is used to represent a
row-level locking clause that is then "resolved" to specific relational
operators. Once attached to a specific operator, the locking mode no
longer needs to include a target, so it makes sense to represent as a
separate struct.

This addresses Radu's comment from cockroachdb#60719.
@nvb nvb force-pushed the nvanbenschoten/lockingPhysicalProp branch from 6ea0a76 to 2115a44 Compare April 19, 2022 20:03
@nvb
Copy link
Copy Markdown
Contributor Author

nvb commented Apr 19, 2022

Can you make these fields unnamed so that the test won't compile if a new field is added? For example: opt.Locking{tree.ForUpdate, tree.LockWairError}.

TestRoachVet did not like this, so I've reverted.

------- Stdout: -------
=== RUN   TestLint/TestRoachVet
    lint_test.go:91:
        pkg/sql/opt/memo/interner_test.go:468:12: github.com/cockroachdb/cockroach/pkg/sql/opt.Locking composite literal uses unkeyed fields
        pkg/sql/opt/memo/interner_test.go:469:12: github.com/cockroachdb/cockroach/pkg/sql/opt.Locking composite literal uses unkeyed fields
    --- FAIL: TestLint/TestRoachVet (205.49s)

@mgartner
Copy link
Copy Markdown
Contributor

TestRoachVet did not like this, so I've reverted.

I don't understand. We already use this pattern in the same file:

val1: WindowFrame{treewindow.RANGE, treewindow.UnboundedPreceding, treewindow.CurrentRow, treewindow.NoExclusion},

@nvb
Copy link
Copy Markdown
Contributor Author

nvb commented Apr 20, 2022

I don't understand. We already use this pattern in the same file:

This appears to be because the composites analysis pass allows unkeyed fields for locally defined (i.e. in the same package) struct types. See this condition in https://pkg.go.dev/golang.org/x/tools/go/analysis/passes/composite. If I move WindowFrame to a different package, I get the same lint failure.

It doesn't seem like there's much we can do to fight this, so I'll go ahead and merge, but I'm happy to revisit if you think we should try to do something else to avoid the lint error.

bors r=mgartner

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Apr 20, 2022

Build succeeded:

@craig craig bot merged commit 7ea67fd into cockroachdb:master Apr 20, 2022
@mgartner
Copy link
Copy Markdown
Contributor

Ahh, thanks for getting to the bottom of it. 👍

Maybe we can allow unkeyed fields in test files. I’ll try to see if that’s possible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants