Skip to content

xform: avoid locality-optimized scans which must always read remote rows#87350

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
msirek:skip_suboptimal_locality_optimized_scan
Sep 12, 2022
Merged

xform: avoid locality-optimized scans which must always read remote rows#87350
craig[bot] merged 1 commit intocockroachdb:masterfrom
msirek:skip_suboptimal_locality_optimized_scan

Conversation

@msirek
Copy link
Copy Markdown
Contributor

@msirek msirek commented Sep 2, 2022

Previously, for tables with old-style partitioning, which don't use the
new multiregion abstractions, there were no guardrails in place to
prevent 2 cases where locality-optimized scan must always read ranges in
remote regions:

  1. When a scan with no hard limit has a non-unique index constraint
    (could return more than one row per matched index key, not including
    the partitioning key column)
  2. When the max cardinality of a constrained scan is less than the hard
    limit placed on the scan via a LIMIT clause

This was inadequate because locality-optimized scan is usually slower
than distributed scans when reading from remote regions is required. If
we can statically determine reading from remote regions is required,
locality-optimized scan should not even be costed and considered by the
optimizer. Multiregion tables, such as REGIONAL BY ROW tables, don't
encounter this issue because the internal crdb_region partitioning
column is not part of the UNIQUE constraint in that case, for example:

CREATE TABLE regional_by_row_table (
  col1 int,
  col2 bool NOT NULL,
  UNIQUE INDEX idx(col1) -- crdb_region is implicitly the 1st idx col
) LOCALITY REGIONAL BY ROW;

SELECT * FROM regional_by_row_table WHERE col1 = 1;

In the above, we could use LOS and split this into a local scan:
SELECT * FROM regional_by_row_table WHERE crdb_region = 'us' AND col1 = 1;
... and remote scans:

SELECT * FROM regional_by_row_table WHERE crdb_region IN ('ca', 'ap')
          AND col1 = 1;

The max cardinality of the local scan is 1, and the max cardinality of
the original scan is 1, so we know it's possible to fulfill the request
solely with the local scan.

To address this, this patch avoids planning locality-optimized scan for
the two cases listed at the top of the description. The first case is
detected by the local scan of the UNION ALL having a lower max
cardinality than the max cardinality including all constraint spans
(for example, given a pair of columns (part_col, col1), if col1 is a
unique key, then max_cardinality(col1) will equal
max_cardinality(part_col, col1). The second case is detected by a
direct comparison of the hard limit with the max cardinality of the
local scan.

Release note (bug fix): This patch fixes a misused query optimization
involving tables with one or more PARTITION BY clauses and partition
zone constraints which assign region locality to those partitions.
In some cases the optimizer picks a locality-optimized search query
plan which is not truly locality-optimized, and has higher latency than
competing query plans which use distributed scan. Locality-optimized
search is now avoided in cases which are known not to benefit from this
optimization.

Release justification: Low risk fix for suboptimal locality-optimized scan

@msirek msirek requested a review from a team as a code owner September 2, 2022 22:26
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

Copy link
Copy Markdown
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: other than the nit. I'll defer approval to @mgartner since we're at the end of the release cycle/probably want to backport this.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @msirek)


pkg/sql/opt/xform/scan_funcs.go line 204 at r1 (raw file):

		if !grp.Relational().Cardinality.IsZeroOrOne() {
			if scanPrivate.HardLimit == 0 {
				return

[nit] is it possible to do this check in CanMaybeGenerateLocalityOptimizedScan?

Copy link
Copy Markdown
Contributor Author

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball and @mgartner)


pkg/sql/opt/xform/scan_funcs.go line 204 at r1 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

[nit] is it possible to do this check in CanMaybeGenerateLocalityOptimizedScan?

I originally considered modifying CanMaybeGenerateLocalityOptimizedScan to pass in ScanExpr instead of ScanPrivate, but that seemed like it would make calling this from optgen rules more inconvenient since ScanPrivate is often accessible, but ScanExpr is not. Also, this is a "CanMaybe" function, with some of the checks currently deferred until GenerateLocalityOptimizedScan already (maybe for the same reason, that it's easier to deal with ScanPrivate in optgen. What do you think?

@msirek msirek force-pushed the skip_suboptimal_locality_optimized_scan branch from d32dd2d to 7b427f5 Compare September 3, 2022 18:23
Copy link
Copy Markdown
Contributor Author

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball and @mgartner)


pkg/sql/opt/xform/scan_funcs.go line 204 at r1 (raw file):

Previously, msirek (Mark Sirek) wrote…

I originally considered modifying CanMaybeGenerateLocalityOptimizedScan to pass in ScanExpr instead of ScanPrivate, but that seemed like it would make calling this from optgen rules more inconvenient since ScanPrivate is often accessible, but ScanExpr is not. Also, this is a "CanMaybe" function, with some of the checks currently deferred until GenerateLocalityOptimizedScan already (maybe for the same reason, that it's easier to deal with ScanPrivate in optgen. What do you think?

update: I had to update this check to handle IN list cases which have higher cardinality than 1, so must do the check after the local scan is constructed in GenerateLocalityOptimizedScan. So, I don't think we could move the check to CanMaybeGenerateLocalityOptimizedScan now, unless we're OK with constructing the local scan in two locations.

Copy link
Copy Markdown
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner and @msirek)


pkg/sql/opt/xform/scan_funcs.go line 204 at r1 (raw file):

Previously, msirek (Mark Sirek) wrote…

update: I had to update this check to handle IN list cases which have higher cardinality than 1, so must do the check after the local scan is constructed in GenerateLocalityOptimizedScan. So, I don't think we could move the check to CanMaybeGenerateLocalityOptimizedScan now, unless we're OK with constructing the local scan in two locations.

Sounds like it might not be applicable here, but for future reference you should be able to reference the matched expression (not just its group) with (Root) in optgen. The ReorderJoins rule is an example.

Copy link
Copy Markdown
Contributor Author

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball and @mgartner)


pkg/sql/opt/xform/scan_funcs.go line 204 at r1 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Sounds like it might not be applicable here, but for future reference you should be able to reference the matched expression (not just its group) with (Root) in optgen. The ReorderJoins rule is an example.

Thanks, yeah, I was using (Root) in my old version of the fix.

@msirek msirek force-pushed the skip_suboptimal_locality_optimized_scan branch 2 times, most recently from 8f8ae97 to 8ffcb17 Compare September 4, 2022 17:10
@msirek msirek requested a review from a team September 6, 2022 17:23
@msirek msirek force-pushed the skip_suboptimal_locality_optimized_scan branch from 8ffcb17 to 66c617d Compare September 8, 2022 06:34
Copy link
Copy Markdown
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r3, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball, @mgartner, and @msirek)


-- commits line 21 at r3:
Does not adjusting the max cardinality imply that the scan is kinda "unconstrained"?


pkg/ccl/logictestccl/testdata/logic_test/multi_region_query_behavior line 474 at r3 (raw file):

# Regression tests for support issue #1780 #
############################################
skipif config multiregion-9node-3region-3azs-no-los

nit: rather than skipping all new statements if we're in a no LOS config, maybe create a separate file?


pkg/ccl/logictestccl/testdata/logic_test/multi_region_query_behavior line 520 at r3 (raw file):

  constraints = '[+region=ap-southeast-2]';

# The zone config needs time to become available.

nit: rather than sleeping would the same retrying plus purging of the system config cache from the other PR work better? It seems like if a machine is overloaded, this sleep could be insufficient resulting flakes.


pkg/ccl/logictestccl/testdata/logic_test/multi_region_query_behavior line 591 at r3 (raw file):

ALTER TABLE users ADD UNIQUE WITHOUT INDEX (address)

# With a unique constraint on the non-partitioning index key columns, a

nit: should we also add a case when there is no such unique constraint to make sure that LOS is not used?


pkg/sql/opt/xform/scan_funcs.go line 232 at r3 (raw file):

			// unbounded, because we'd have already returned when maxRows >
			// ProductionKVBatchSize.
			if localScan.Relational().Cardinality.Max !=

nit: I'm not familiar with this code, but based on the comment I'd expect to see a "less" or "less or equal" comparison and not "not equal", can you elaborate a bit more on this.

@msirek msirek force-pushed the skip_suboptimal_locality_optimized_scan branch from 66c617d to 00cc033 Compare September 9, 2022 02:58
Copy link
Copy Markdown
Contributor Author

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball, @mgartner, and @yuzefovich)


-- commits line 21 at r3:

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Does not adjusting the max cardinality imply that the scan is kinda "unconstrained"?

What I mean is, there are cases where max cardinality is not adjusted when it could be. There is an actual limit on the number of rows that could be returned which is not detected.


pkg/ccl/logictestccl/testdata/logic_test/multi_region_query_behavior line 474 at r3 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: rather than skipping all new statements if we're in a no LOS config, maybe create a separate file?

Done

Code quote:

skipif config multiregion-9node-3region-3azs-no-los

pkg/ccl/logictestccl/testdata/logic_test/multi_region_query_behavior line 520 at r3 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: rather than sleeping would the same retrying plus purging of the system config cache from the other PR work better? It seems like if a machine is overloaded, this sleep could be insufficient resulting flakes.

Added the retries. It might flake until #87391 merges.


pkg/ccl/logictestccl/testdata/logic_test/multi_region_query_behavior line 591 at r3 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: should we also add a case when there is no such unique constraint to make sure that LOS is not used?

That is the first case tested, where the partitioning column is part of the unique constraint, but there is no separate constraint on just address by itself, making this column non-unique.


pkg/sql/opt/xform/scan_funcs.go line 232 at r3 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: I'm not familiar with this code, but based on the comment I'd expect to see a "less" or "less or equal" comparison and not "not equal", can you elaborate a bit more on this.

My thought was the max cardinality of the local scan should never be greater than the scan of combined spans, so using the != check would avoid using LOS for this unexpected case. But maybe we don't want to avoid it. If it is another suboptimal LOS, we'd want to find out why max cardinality is incorrectly set. Changed it to <.

Copy link
Copy Markdown
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also a bit confused why the customer wants this fix - IIUC we only encountered these cases when reproducing manually, but in the production bundles there were no LIMITs?

Reviewed 4 of 4 files at r4.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball, @mgartner, and @msirek)


-- commits line 21 at r3:

Previously, msirek (Mark Sirek) wrote…

What I mean is, there are cases where max cardinality is not adjusted when it could be. There is an actual limit on the number of rows that could be returned which is not detected.

I'm still confused why the new multiregion abstractions don't suffer from these issues - could you please try rephrasing the explanation a bit?

@msirek msirek force-pushed the skip_suboptimal_locality_optimized_scan branch from 00cc033 to 73f1cdf Compare September 9, 2022 07:29
Copy link
Copy Markdown
Contributor Author

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Locality-optimized search was seen in the statement bundles in https://github.com/cockroachlabs/support/issues/1780. The limit could be implied, by a unique constraint (customer case), or manually imposed via a LIMIT clause.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball, @mgartner, and @yuzefovich)


-- commits line 21 at r3:

Previously, yuzefovich (Yahor Yuzefovich) wrote…

I'm still confused why the new multiregion abstractions don't suffer from these issues - could you please try rephrasing the explanation a bit?

OK, I made it more wordy with concrete examples and description of code behavior.

@msirek msirek force-pushed the skip_suboptimal_locality_optimized_scan branch from 73f1cdf to c2433f6 Compare September 9, 2022 13:56
Copy link
Copy Markdown
Contributor Author

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball, @mgartner, and @yuzefovich)


-- commits line 21 at r3:

Previously, msirek (Mark Sirek) wrote…

OK, I made it more wordy with concrete examples and description of code behavior.

Removed the part about RBR table edge cases. It doesn't look like those tables could have a similar issue.

Copy link
Copy Markdown
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will need to be backported to 22.2 branch too. LGTM, but I'll defer to @mgartner for approval.

Reviewed all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball and @mgartner)


-- commits line 21 at r3:

Previously, msirek (Mark Sirek) wrote…

Removed the part about RBR table edge cases. It doesn't look like those tables could have a similar issue.

Thanks, this makes sense.

Copy link
Copy Markdown
Contributor

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 4 files at r4.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball and @msirek)


-- commits line 41 at r6:

There is no way to specify the crdb_region column as part of the unique constraint as in the PARTITION BY case.

I don't understand what you mean by this.


pkg/ccl/logictestccl/testdata/logic_test/multi_region_locality_optimized_search line 2 at r6 (raw file):

# tenant-cluster-setting-override-opt: allow-multi-region-abstractions-for-secondary-tenants
# LogicTest: multiregion-9node-3region-3azs

nit: The existing multi_region_query_behavior test file might be a good place for these tests. If not, you might want to follow the convention of suffixing the test file with _query_behavior or _query_plan.


pkg/ccl/logictestccl/testdata/logic_test/multi_region_locality_optimized_search line 23 at r6 (raw file):


statement ok
SET experimental_enable_unique_without_index_constraints = true

nit: I don't think this is necessary - none of the tables below have unique without index constraints.


pkg/ccl/logictestccl/testdata/logic_test/multi_region_locality_optimized_search line 55 at r6 (raw file):

  constraints = '[+region=ap-southeast-2]';

query T retry

nit: description of this test might be useful


pkg/ccl/logictestccl/testdata/logic_test/multi_region_locality_optimized_search line 72 at r6 (raw file):

# With a hard limit <= the max cardinality of the local scan, we should choose
# locality-optimized scan.

nit: remove blank line


pkg/sql/opt/xform/scan_funcs.go line 215 at r6 (raw file):

	localScanPrivate.HardLimit = scanPrivate.HardLimit
	localScan := c.e.f.ConstructScan(localScanPrivate)
	if idxConstraint != nil {

What's the purpose of this idxConstraint != nil check?


pkg/sql/opt/xform/scan_funcs.go line 225 at r6 (raw file):

		} else {
			// If there's no LIMIT clause, and the limit of the UNION ALL created for
			// locality-optimized scan based on max cardinality is greater than the

This sentence is confusing. Maybe simpler to just describe this case as "when the max cardinality of the original scan is greater than the max cardinality of the local scan".


pkg/sql/opt/xform/scan_funcs.go line 233 at r6 (raw file):

			// ProductionKVBatchSize.
			if localScan.Relational().Cardinality.Max <
				grp.Relational().Cardinality.Max && !tabMeta.IgnoreUniqueWithoutIndexKeys {

Can you explain why && !tabMeta.IgnoreUniqueWithoutIndexKeys is necessary?

nit: putting the inequality on one line might be easier to read

@msirek msirek force-pushed the skip_suboptimal_locality_optimized_scan branch from c2433f6 to 608cd00 Compare September 9, 2022 17:16
Copy link
Copy Markdown
Contributor Author

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball, @mgartner, and @yuzefovich)


-- commits line 41 at r6:
I removed the confusing sentence. It's not needed anyway, because there's already the line:

Multiregion tables, such as REGIONAL BY ROW tables, don't
encounter this issue because the internal crdb_region partitioning
column is not part of the UNIQUE constraint in that case,


pkg/ccl/logictestccl/testdata/logic_test/multi_region_locality_optimized_search line 2 at r6 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

nit: The existing multi_region_query_behavior test file might be a good place for these tests. If not, you might want to follow the convention of suffixing the test file with _query_behavior or _query_plan.

@yuzefovich already requested this be moved out of multi_region_query_behavior due to the necessity to use skipif config directives. I have added the _query_behavior suffix to the current file name.


pkg/ccl/logictestccl/testdata/logic_test/multi_region_locality_optimized_search line 23 at r6 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

nit: I don't think this is necessary - none of the tables below have unique without index constraints.

There is one defined on line 116:

statement ok
ALTER TABLE users ADD UNIQUE WITHOUT INDEX (address)

pkg/ccl/logictestccl/testdata/logic_test/multi_region_locality_optimized_search line 55 at r6 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

nit: description of this test might be useful

Added


pkg/ccl/logictestccl/testdata/logic_test/multi_region_locality_optimized_search line 72 at r6 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

nit: remove blank line

Done


pkg/sql/opt/xform/scan_funcs.go line 215 at r6 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

What's the purpose of this idxConstraint != nil check?

When the scan is unconstrained, then Cardinality.Max should not have been decreased, except in cases where we calculate cardinality from the data type. But that case is not so interesting. Maybe the first column could have 3 regions, and the 2nd column a boolean column with 2 values for a total of 6 values. If the 2nd column is INT2, we'd have 64K values times the number of regions, and exceed the 100K row limit:

	// We can only generate a locality optimized scan if we know there is a hard
	// upper bound on the number of rows produced by the local spans. We use the
	// kv batch size as the limit, since it's probably better to use DistSQL once
	// we're scanning multiple batches.
	// TODO(rytaft): Revisit this when we have a more accurate cost model for data
	// distribution.
	maxRows := rowinfra.KeyLimit(grp.Relational().Cardinality.Max)
	if maxRows > rowinfra.ProductionKVBatchSize {
		return
	}

pkg/sql/opt/xform/scan_funcs.go line 225 at r6 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

This sentence is confusing. Maybe simpler to just describe this case as "when the max cardinality of the original scan is greater than the max cardinality of the local scan".

Modified


pkg/sql/opt/xform/scan_funcs.go line 233 at r6 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

Can you explain why && !tabMeta.IgnoreUniqueWithoutIndexKeys is necessary?

nit: putting the inequality on one line might be easier to read

Added the following comment:

			// IgnoreUniqueWithoutIndexKeys is true when we're performing a scan
			// during an insert to verify there are no duplicates violating the
			// uniqueness constraint. This could cause the check below to return, but
			// by design we want to use locality-optimized search for these duplicate
			// checks. So avoid returning if that flag is set.

Copy link
Copy Markdown
Contributor Author

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball, @mgartner, and @yuzefovich)


pkg/sql/opt/xform/scan_funcs.go line 215 at r6 (raw file):

Previously, msirek (Mark Sirek) wrote…

When the scan is unconstrained, then Cardinality.Max should not have been decreased, except in cases where we calculate cardinality from the data type. But that case is not so interesting. Maybe the first column could have 3 regions, and the 2nd column a boolean column with 2 values for a total of 6 values. If the 2nd column is INT2, we'd have 64K values times the number of regions, and exceed the 100K row limit:

	// We can only generate a locality optimized scan if we know there is a hard
	// upper bound on the number of rows produced by the local spans. We use the
	// kv batch size as the limit, since it's probably better to use DistSQL once
	// we're scanning multiple batches.
	// TODO(rytaft): Revisit this when we have a more accurate cost model for data
	// distribution.
	maxRows := rowinfra.KeyLimit(grp.Relational().Cardinality.Max)
	if maxRows > rowinfra.ProductionKVBatchSize {
		return
	}

I also tried a test case limiting the number of values in a column, but the unconstrained scan isn't getting max cardinality adjusted. Instead it's picking up the max cardinality from the LIMIT:

statement OK
ALTER TABLE users ADD CONSTRAINT check_account_id CHECK (account_id IN (1, 2, 3, 4, 5));

query T
EXPLAIN SELECT 1 FROM users@users_pkey LIMIT 33
----

... so the added checks would have no effect in the idxConstraint == nil case.

Copy link
Copy Markdown
Contributor

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 3 of 3 files at r7, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball and @msirek)


pkg/sql/opt/xform/scan_funcs.go line 215 at r6 (raw file):

Previously, msirek (Mark Sirek) wrote…

I also tried a test case limiting the number of values in a column, but the unconstrained scan isn't getting max cardinality adjusted. Instead it's picking up the max cardinality from the LIMIT:

statement OK
ALTER TABLE users ADD CONSTRAINT check_account_id CHECK (account_id IN (1, 2, 3, 4, 5));

query T
EXPLAIN SELECT 1 FROM users@users_pkey LIMIT 33
----

... so the added checks would have no effect in the idxConstraint == nil case.

At this point in the code, idxConstraint must be non-nil AFAICT, so I don't think the check is necessary.


pkg/sql/opt/xform/scan_funcs.go line 233 at r6 (raw file):

Previously, msirek (Mark Sirek) wrote…

Added the following comment:

			// IgnoreUniqueWithoutIndexKeys is true when we're performing a scan
			// during an insert to verify there are no duplicates violating the
			// uniqueness constraint. This could cause the check below to return, but
			// by design we want to use locality-optimized search for these duplicate
			// checks. So avoid returning if that flag is set.

Thanks for the explanation!


pkg/ccl/logictestccl/testdata/logic_test/multi_region_locality_optimized_search line 2 at r6 (raw file):

Previously, msirek (Mark Sirek) wrote…

@yuzefovich already requested this be moved out of multi_region_query_behavior due to the necessity to use skipif config directives. I have added the _query_behavior suffix to the current file name.

👍


pkg/ccl/logictestccl/testdata/logic_test/multi_region_locality_optimized_search line 23 at r6 (raw file):

Previously, msirek (Mark Sirek) wrote…

There is one defined on line 116:

statement ok
ALTER TABLE users ADD UNIQUE WITHOUT INDEX (address)

Oops, sorry I missed that.

Previously, for tables with old-style partitioning, which don't use the
new multiregion abstractions, there were no guardrails in place to
prevent 2 cases where locality-optimized scan must always read ranges in
remote regions:

1. When a scan with no hard limit has a non-unique index constraint
   (could return more than one row per matched index key, not including
   the partitioning key column)
2. When the max cardinality of a constrained scan is less than the hard
   limit placed on the scan via a LIMIT clause

This was inadequate because locality-optimized scan is usually slower
than distributed scans when reading from remote regions is required. If
we can statically determine reading from remote regions is required,
locality-optimized scan should not even be costed and considered by the
optimizer. Multiregion tables, such as REGIONAL BY ROW tables, don't
encounter this issue because the internal `crdb_region` partitioning
column is not part of the UNIQUE constraint in that case, for example:
```
CREATE TABLE regional_by_row_table (
  col1 int,
  col2 bool NOT NULL,
  UNIQUE INDEX idx(col1) -- crdb_region is implicitly the 1st idx col
) LOCALITY REGIONAL BY ROW;

SELECT * FROM regional_by_row_table WHERE col1 = 1;
```
In the above, we could use LOS and split this into a local scan:
`SELECT * FROM regional_by_row_table WHERE crdb_region = 'us' AND col1 = 1;`
... and remote scans:
```
SELECT * FROM regional_by_row_table WHERE crdb_region IN ('ca', 'ap')
          AND col1 = 1;
```
The max cardinality of the local scan is 1, and the max cardinality of
the original scan is 1, so we know it's possible to fulfill the request
solely with the local scan.

To address this, this patch avoids planning locality-optimized scan for
the two cases listed at the top of the description. The first case is
detected by the local scan of the UNION ALL having a lower max
cardinality than the max cardinality including all constraint spans
(for example, given a pair of columns (part_col, col1), if col1 is a
unique key, then max_cardinality(col1) will equal
max_cardinality(part_col, col1). The second case is detected by a
direct comparison of the hard limit with the max cardinality of the
local scan.

Release note (bug fix): This patch fixes a misused query optimization
involving tables with one or more PARTITION BY clauses and partition
zone constraints which assign region locality to those partitions.
In some cases the optimizer picks a `locality-optimized search` query
plan which is not truly locality-optimized, and has higher latency than
competing query plans which use distributed scan. Locality-optimized
search is now avoided in cases which are known not to benefit from this
optimization.

Release justification: Low risk fix for suboptimal locality-optimized scan
@msirek msirek force-pushed the skip_suboptimal_locality_optimized_scan branch from 608cd00 to 704cf05 Compare September 12, 2022 16:15
Copy link
Copy Markdown
Contributor Author

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTRs!
bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @DrewKimball and @mgartner)


pkg/sql/opt/xform/scan_funcs.go line 215 at r6 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

At this point in the code, idxConstraint must be non-nil AFAICT, so I don't think the check is necessary.

Missed that. Removed the unnecessary check.

Code quote:

If the local scan could never reach the hard limit, we will al

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Sep 12, 2022

Build succeeded:

@rytaft
Copy link
Copy Markdown
Collaborator

rytaft commented Sep 27, 2022

pkg/sql/opt/xform/scan_funcs.go line 229 at r8 (raw file):

  // by design we want to use locality-optimized search for these duplicate
  // checks. So avoid returning if that flag is set.

Sorry for the super late drive by, but I don't think this is true. We should never be using locality optimized search for these duplicate checks....

Also, I'm having a bit of trouble seeing when we'll ever need this else block. Do any of your test cases cover this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants