Skip to content

opt: extend SplitScanIntoUnionScans to handle longer keys#58085

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
DrewKimball:split_scan
Dec 29, 2020
Merged

opt: extend SplitScanIntoUnionScans to handle longer keys#58085
craig[bot] merged 1 commit intocockroachdb:masterfrom
DrewKimball:split_scan

Conversation

@DrewKimball
Copy link
Copy Markdown
Collaborator

@DrewKimball DrewKimball commented Dec 18, 2020

Previously, SplitScanIntoUnionScans only handled Scan constraints
where all the start and end keys of the spans were of the exact same
length keyLength, and the first keyLength columns of the index
formed a prefix for the limit ordering columns. The following is an
example of a case that can be handled under these conditions:

/1/2/3: [/US_WEST/0 - /US_WEST/1]
=>
/1/2/3: [/US_WEST/0 - /US_WEST/0], [/US_WEST/1 - /US_WEST/1]
keyLength: 2
limit ordering: +3

However, a case like this one could not be handled, even though
the transformation shown is a valid (and desirable) one:

/1/2/3: [/US_WEST/0/postfix - /US_WEST/1]
=>
/1/2/3: [/US_WEST/0/postfix - /US_WEST/0], [/US_WEST/1 - /US_WEST/1]
keyLength: -
limit ordering: +3

This patch extends SplitScanIntoUnionScans to handle cases like the
above. This is accomplished by modifying the keyCount() and span.split()
code to require a prefix length as input. Examples:

[/ASIA/0/postfix - /ASIA/4].keyCount(prefixLength=1) = 1
[/ASIA/0/postfix - /ASIA/4].keyCount(prefixLength=2) = 5
[/ASIA/0/postfix - /ASIA/4].keyCount(prefixLength=3) = undefined

In this case, the prefix length used is the number of index columns that
precede the limit ordering columns (which must either be in order or in
reverse order):

index: /1/2/-3/4
limit ordering: /3/-4
prefix length: 2

In addition, SplitScanIntoUnionScans can now support cases where one span
from the Scan's constraint can be split and used to generate limited Scans,
but the other cannot.

Fixes #55156

Release note: None

@DrewKimball DrewKimball requested a review from a team as a code owner December 18, 2020 23:31
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

Copy link
Copy Markdown
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Welcome back, @DrewKimball!

This is great! :lgtm:

In the PR/commit message, you say that the following transformation is valid:

/1/2/3: [/US_WEST/0/postfix - /US_WEST/1]
=>
/1/2/3: [/US_WEST/0/postfix - /US_WEST/0], [/US_WEST/1/postfix - /US_WEST/1]

I'm not sure that's technically correct, since the first constraint contains more keys than the second one.

But looking at the code and comments, it looks like you would actually perform this transformation:

/1/2/3: [/US_WEST/0/postfix - /US_WEST/1]
=>
/1/2/3: [/US_WEST/0/postfix - /US_WEST/0], [/US_WEST/1 - /US_WEST/1]

So maybe you just need to update the PR/commit message?

Reviewed 7 of 7 files at r1.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball)


pkg/sql/opt/constraint/span.go, line 335 at r1 (raw file):

// removed:
//
//    ['ASIA'/1 - /'ASIA'/2]/KeyCount(keyCtx, length=2) => 2, true

[nit] /KeyCount -> .KeyCount


pkg/sql/opt/xform/limit_funcs.go, line 162 at r1 (raw file):

// SplitScanIntoUnionScans returns a Union of Scan operators with hard limits
// that each Scan over a single key from the original scan's constraints. This

[nit] each Scan -> each scan ... original scan's -> original Scan's

Copy link
Copy Markdown
Collaborator Author

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @rytaft, it's good to be (sort of) back. And you're right, I'll change the commit message.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @rytaft)


pkg/sql/opt/constraint/span.go, line 335 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

[nit] /KeyCount -> .KeyCount

Done.


pkg/sql/opt/xform/limit_funcs.go, line 162 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

[nit] each Scan -> each scan ... original scan's -> original Scan's

Done.

@DrewKimball DrewKimball force-pushed the split_scan branch 2 times, most recently from d3325ff to e28b372 Compare December 22, 2020 09:09
@DrewKimball
Copy link
Copy Markdown
Collaborator Author

TFTR!

Copy link
Copy Markdown
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

[super nit] in the commit/PR message you're using +3 for the ordering at the top, and /3 for the ordering further down

Also, don't forget to update the PR message to include the fixes you made to the commit message!

Reviewed 2 of 2 files at r2.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained

Copy link
Copy Markdown
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried the example in #55156 with this change, but I see no change. Here's the execbuilder test I used: https://gist.github.com/RaduBerinde/0a5286cc3352944726ee588ba7650fc5

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)

Copy link
Copy Markdown
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, we don't even need to split any of the spans.. We just need to separate them in their own scans, so that each can be limited.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)

@DrewKimball
Copy link
Copy Markdown
Collaborator Author

@RaduBerinde that's because the logic at line 233 in limit_funcs prevents the plan from being generated when the new plan isn't expected to decrease the number of rows scanned by a decent margin. Do you think we should be generating the plan in this case?

If you just want to see the plan being generated from those constraints, you can just increase the rowcount of the table.

@RaduBerinde
Copy link
Copy Markdown
Member

Ah, that's great! Could you add an execbuilder or xform test with the case from the issue? You can inject stats as needed.

@blathers-crl
Copy link
Copy Markdown

blathers-crl bot commented Dec 26, 2020

Thank you for updating your pull request.

My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl blathers-crl bot added the O-community Originated from the community label Dec 26, 2020
@DrewKimball
Copy link
Copy Markdown
Collaborator Author

@RaduBerinde any chance you could bors this? I'm just an external contributor right now:)

Copy link
Copy Markdown
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball)


pkg/sql/opt/exec/execbuilder/testdata/limit, line 393 at r3 (raw file):

# Expect a scan over each shard.
query T
EXPLAIN (OPT, VERBOSE)

[nit] Could you make this a regular EXPLAIN? (the tests are meant to cover what the execbuilder does as well) It doesn't have to be verbose, we're just looking for some unions.

Previously, `SplitScanIntoUnionScans` only handled Scan constraints
where all the start and end keys of the spans were of the exact same
length `keyLength`, and the first `keyLength` columns of the index
formed a prefix for the limit ordering columns. The following is an
example of a case that can be handled under these conditions:
```
/1/2/3: [/US_WEST/0 - /US_WEST/1]
=>
/1/2/3: [/US_WEST/0 - /US_WEST/0], [/US_WEST/1 - /US_WEST/1]
keyLength: 2
limit ordering: +3
```
However, a case like this one could not be handled, even though
the transformation shown is a valid (and desirable) one:
```
/1/2/3: [/US_WEST/0/postfix - /US_WEST/1]
=>
/1/2/3: [/US_WEST/0/postfix - /US_WEST/0], [/US_WEST/1 - /US_WEST/1]
keyLength: -
limit ordering: +3
```
This patch extends `SplitScanIntoUnionScans` to handle cases like the
above. This is accomplished by modifying the `keyCount()` and `span.split()`
code to require a prefix length as input. Examples:
```
[/ASIA/0/postfix - /ASIA/4].keyCount(prefixLength=1) = 1
[/ASIA/0/postfix - /ASIA/4].keyCount(prefixLength=2) = 5
[/ASIA/0/postfix - /ASIA/4].keyCount(prefixLength=3) = undefined
```
In this case, the prefix length used is the number of index columns that
precede the limit ordering columns (which must either be in order or in
reverse order):
```
index: /1/2/-3/4
limit ordering: /3/-4
prefix length: 2
```
In addition, `SplitScanIntoUnionScans` can now support cases where one span
from the Scan's constraint can be split and used to generate limited Scans,
but the other cannot.

Fixes cockroachdb#55156

Release note: None
@DrewKimball
Copy link
Copy Markdown
Collaborator Author

Done.

@RaduBerinde
Copy link
Copy Markdown
Member

Thanks!

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Dec 29, 2020

Build succeeded:

@craig craig bot merged commit 6f63d85 into cockroachdb:master Dec 29, 2020
@DrewKimball DrewKimball deleted the split_scan branch December 29, 2020 01:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

O-community Originated from the community

Projects

None yet

Development

Successfully merging this pull request may close these issues.

opt: extend SplitScanIntoUnionScans to support splitting a constraint with multi-key spans

4 participants