ccl/server/sql: Fix failing reset index usage unit test#72358
ccl/server/sql: Fix failing reset index usage unit test#72358craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
fea3193 to
c1518f1
Compare
Azhng
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng and @lindseyjin)
pkg/ccl/serverccl/tenant_status_test.go, line 380 at r1 (raw file):
crdb_internal.index_usage_statistics WHERE table_id = ` + testTableIDStr
note: this is fine-isssh in test, and CheckQuerResults doesn't really allow you to do this. But in general we really try to avoid trivially concatenating SQL query string, this is a recipe for SQL injections which is really really bad. We usually use placeholder here for them.
pkg/ccl/serverccl/tenant_status_test.go, line 422 at r1 (raw file):
require.NotNil(t, rows) for _, row := range rows { require.Equal(t, row[1], "0", "expected total reads for table %s to be reset, but got %s", row[0], row[1])
nit: wrap long lines at 80 characters
pkg/sql/sem/builtins/builtins_test.go, line 472 at r1 (raw file):
} func TestResetIndexUsageStats(t *testing.T) {
nit: let's be a bit more specific on the test name, this tests is really testing execution of the builtin when it's being set up on a remote node via DistSQL. So i'm thinking something along the line of TestResetIndexUsageStatsOnRemoteSQLNode
pkg/sql/sem/builtins/builtins_test.go, line 476 at r1 (raw file):
ctx := context.Background() testCluster := serverutils.StartNewTestCluster(t, 9 /* numNodes */, base.TestClusterArgs{
Let's scale this down to 3 nodes? 9 nodes feels quite wasteful for a test like this
pkg/sql/sem/builtins/builtins_test.go, line 485 at r1 (raw file):
query := ` CREATE TABLE rides ( id INT PRIMARY KEY,
nit: let's unify the indentation here for better readability
pkg/sql/sem/builtins/builtins_test.go, line 493 at r1 (raw file):
name VARCHAR NOT NULL ); INSERT INTO rides (id, city, vehicle_city) VALUES (1, 'Toronto', 'Calgary');
also this tests case is slightly overkill 😛 , all you need is a query that causes a filter to be pushed down into remote node, so let's see if we can simplify this query a bit ?
d1cb5b2 to
4a6e5fd
Compare
lindseyjin
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng)
pkg/ccl/serverccl/tenant_status_test.go, line 380 at r1 (raw file):
Previously, Azhng (Archer Zhang) wrote…
note: this is fine-isssh in test, and
CheckQuerResultsdoesn't really allow you to do this. But in general we really try to avoid trivially concatenating SQL query string, this is a recipe for SQL injections which is really really bad. We usually use placeholder here for them.
Noted, thanks! 👍
pkg/sql/sem/builtins/builtins_test.go, line 472 at r1 (raw file):
Previously, Azhng (Archer Zhang) wrote…
nit: let's be a bit more specific on the test name, this tests is really testing execution of the builtin when it's being set up on a remote node via DistSQL. So i'm thinking something along the line of
TestResetIndexUsageStatsOnRemoteSQLNode
Done!
pkg/sql/sem/builtins/builtins_test.go, line 476 at r1 (raw file):
Previously, Azhng (Archer Zhang) wrote…
Let's scale this down to 3 nodes? 9 nodes feels quite wasteful for a test like this
Done.
pkg/sql/sem/builtins/builtins_test.go, line 493 at r1 (raw file):
Previously, Azhng (Archer Zhang) wrote…
also this tests case is slightly overkill 😛 , all you need is a query that causes a filter to be pushed down into remote node, so let's see if we can simplify this query a bit ?
Hmm I've removed some columns, although this test case won't fail if I don't have data inserted into the tables. Or are there any existing test or internal tables I can use that are already populated?
00ef106 to
22248b5
Compare
Azhng
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng and @lindseyjin)
pkg/sql/sem/builtins/builtins_test.go, line 493 at r1 (raw file):
Previously, lindseyjin (Lindsey Jin) wrote…
Hmm I've removed some columns, although this test case won't fail if I don't have data inserted into the tables. Or are there any existing test or internal tables I can use that are already populated?
Interesting, turned out it was a bit hard to do this if there's very little data 😅
Here's a reliable reprod
CREATE TABLE t (k INT);
INSERT INTO t SELECT generate_series(1, 30);
--- The following ALTER TABLE forces the range to locate on different nodes, which forces the optmizer to plan
--- the DistSQL operation that executes builtins on the remote node
ALTER TABLE t SPLIT AT VALUES (10), (20);
ALTER TABLE t EXPERIMENTAL_RELOCATE LEASE SELECT 1, 1;
ALTER TABLE t EXPERIMENTAL_RELOCATE LEASE SELECT 2, 15;
ALTER TABLE t EXPERIMENTAL_RELOCATE LEASE SELECT 3, 25;
--- the query that will crash because of the bug
SELECT count(*) FROM t WHERE crdb_internal.reset_index_usage_stats();
This results in a query plan looking like this
and reliably crashes:
root@127.0.0.1:26257/movr> SELECT count(*) FROM t WHERE crdb_internal.reset_index_usage_stats();
ERROR: internal error: crdb_internal.reset_index_usage_stats(): index usage stats controller not set
SQLSTATE: XX000
DETAIL: stack trace:
github.com/cockroachdb/cockroach/pkg/sql/sem/builtins/builtins.go:6070: func306()
github.com/cockroachdb/cockroach/pkg/sql/sem/tree/eval.go:4221: Eval()
github.com/cockroachdb/cockroach/pkg/sql/execinfrapb/expr.go:202: RunFilter()
github.com/cockroachdb/cockroach/pkg/sql/execinfrapb/expr.go:191: EvalFilter()
github.com/cockroachdb/cockroach/pkg/sql/rowexec/filterer.go:95: Next()
github.com/cockroachdb/cockroach/pkg/sql/colexec/columnarizer.go:222: Next()
github.com/cockroachdb/cockroach/pkg/sql/colexec/count.go:51: Next()
github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/deselector.go:63: Next()
github.com/cockroachdb/cockroach/pkg/sql/colflow/colrpc/outbox.go:285: func1()
github.com/cockroachdb/cockroach/pkg/sql/colexecerror/error.go:91: CatchVectorizedRuntimeError()
github.com/cockroachdb/cockroach/pkg/sql/colflow/colrpc/outbox.go:276: sendBatches()
github.com/cockroachdb/cockroach/pkg/sql/colflow/colrpc/outbox.go:410: runWithStream()
github.com/cockroachdb/cockroach/pkg/sql/colflow/colrpc/outbox.go:216: Run()
github.com/cockroachdb/cockroach/pkg/sql/colflow/vectorized_flow.go:737: func1()
github.com/cockroachdb/cockroach/pkg/sql/colflow/vectorized_flow.go:1331: 1()
runtime/asm_amd64.s:1371: goexit()
Azhng
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng and @lindseyjin)
pkg/sql/sem/builtins/builtins_test.go, line 493 at r1 (raw file):
Previously, Azhng (Archer Zhang) wrote…
Interesting, turned out it was a bit hard to do this if there's very little data 😅
Here's a reliable reprod
CREATE TABLE t (k INT); INSERT INTO t SELECT generate_series(1, 30); --- The following ALTER TABLE forces the range to locate on different nodes, which forces the optmizer to plan --- the DistSQL operation that executes builtins on the remote node ALTER TABLE t SPLIT AT VALUES (10), (20); ALTER TABLE t EXPERIMENTAL_RELOCATE LEASE SELECT 1, 1; ALTER TABLE t EXPERIMENTAL_RELOCATE LEASE SELECT 2, 15; ALTER TABLE t EXPERIMENTAL_RELOCATE LEASE SELECT 3, 25; --- the query that will crash because of the bug SELECT count(*) FROM t WHERE crdb_internal.reset_index_usage_stats();This results in a query plan looking like this
and reliably crashes:
root@127.0.0.1:26257/movr> SELECT count(*) FROM t WHERE crdb_internal.reset_index_usage_stats(); ERROR: internal error: crdb_internal.reset_index_usage_stats(): index usage stats controller not set SQLSTATE: XX000 DETAIL: stack trace: github.com/cockroachdb/cockroach/pkg/sql/sem/builtins/builtins.go:6070: func306() github.com/cockroachdb/cockroach/pkg/sql/sem/tree/eval.go:4221: Eval() github.com/cockroachdb/cockroach/pkg/sql/execinfrapb/expr.go:202: RunFilter() github.com/cockroachdb/cockroach/pkg/sql/execinfrapb/expr.go:191: EvalFilter() github.com/cockroachdb/cockroach/pkg/sql/rowexec/filterer.go:95: Next() github.com/cockroachdb/cockroach/pkg/sql/colexec/columnarizer.go:222: Next() github.com/cockroachdb/cockroach/pkg/sql/colexec/count.go:51: Next() github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecutils/deselector.go:63: Next() github.com/cockroachdb/cockroach/pkg/sql/colflow/colrpc/outbox.go:285: func1() github.com/cockroachdb/cockroach/pkg/sql/colexecerror/error.go:91: CatchVectorizedRuntimeError() github.com/cockroachdb/cockroach/pkg/sql/colflow/colrpc/outbox.go:276: sendBatches() github.com/cockroachdb/cockroach/pkg/sql/colflow/colrpc/outbox.go:410: runWithStream() github.com/cockroachdb/cockroach/pkg/sql/colflow/colrpc/outbox.go:216: Run() github.com/cockroachdb/cockroach/pkg/sql/colflow/vectorized_flow.go:737: func1() github.com/cockroachdb/cockroach/pkg/sql/colflow/vectorized_flow.go:1331: 1() runtime/asm_amd64.s:1371: goexit()
This assumes we have a 3-node cluster
lindseyjin
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng)
pkg/sql/sem/builtins/builtins_test.go, line 493 at r1 (raw file):
Previously, Azhng (Archer Zhang) wrote…
This assumes we have a 3-node cluster
Ah I see, thanks!
Dd you mean that I replace my query with this one, or was this just an answer to my question? I'm not sure if it's actually any simpler 🤔
Azhng
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng and @lindseyjin)
pkg/sql/sem/builtins/builtins_test.go, line 493 at r1 (raw file):
Previously, lindseyjin (Lindsey Jin) wrote…
Ah I see, thanks!
Dd you mean that I replace my query with this one, or was this just an answer to my question? I'm not sure if it's actually any simpler 🤔
Let's replace it with the new query, because I realized the one we have right now is not a reliable reprod, it depends on the amount of data in the table and table statistics used by the optimizer. (i.e. sometimes when I run this query it won't fail)
lindseyjin
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng)
pkg/sql/sem/builtins/builtins_test.go, line 493 at r1 (raw file):
Previously, Azhng (Archer Zhang) wrote…
Let's replace it with the new query, because I realized the one we have right now is not a reliable reprod, it depends on the amount of data in the table and table statistics used by the optimizer. (i.e. sometimes when I run this query it won't fail)
Hmm, when I replace the query, the test case doesn't fail for me even when the index usage stats controller isn't being set.
Also, when I try running this on the cli, I get the following error:
ERROR: unimplemented: operation is unsupported in multi-tenancy mode SQLSTATE: 0A000 HINT: You have attempted to use a feature that is not yet implemented. See: https://go.crdb.dev/issue-v/54254/v22.1
Azhng
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng and @lindseyjin)
pkg/sql/sem/builtins/builtins_test.go, line 493 at r1 (raw file):
Previously, lindseyjin (Lindsey Jin) wrote…
Hmm, when I replace the query, the test case doesn't fail for me even when the index usage stats controller isn't being set.
Also, when I try running this on the cli, I get the following error:
ERROR: unimplemented: operation is unsupported in multi-tenancy mode SQLSTATE: 0A000 HINT: You have attempted to use a feature that is not yet implemented. See: https://go.crdb.dev/issue-v/54254/v22.1
Hmm interesting, seems like the new master changed a few things ... let me check
22248b5 to
2fd988c
Compare
Azhng
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng and @lindseyjin)
pkg/sql/sem/builtins/builtins_test.go, line 493 at r1 (raw file):
Previously, Azhng (Archer Zhang) wrote…
Hmm interesting, seems like the new master changed a few things ... let me check
Found the problem:
CREATE TABLE t (k INT PRIMARY KEY); -- <- `k` here should be the primary key
INSERT INTO t SELECT generate_series(1, 30);
ALTER TABLE t SPLIT AT VALUES (10), (20);
ALTER TABLE t EXPERIMENTAL_RELOCATE LEASE SELECT 1, 1;
ALTER TABLE t EXPERIMENTAL_RELOCATE LEASE SELECT 2, 15;
ALTER TABLE t EXPERIMENTAL_RELOCATE LEASE SELECT 3, 25;
SELECT count(*) FROM t WHERE crdb_internal.reset_index_usage_stats();
Actually this was caused by the fact that I didn't make the k a primary key, so the range split was effectively no-op.
Resolves cockroachdb#72254 Previously, the unit test for resetting index usage statistics was failing. This was happening because of two reasons: 1. The IndexUsageStatsController was not being set properly for DistSQL server configs. 2. Noise from other tests was sometimes populating the index usage statistics table right after it was reset, causing the check for a newly reset table to sometimes randomly fail. To fix these issues, we have implemented the following: 1. Created a test case to reproduce the first error, and correctly populated IndexUsageStatsController for DistSQL server configs. 2. Updated the unit test to query for the specific test table id, to avoid other tables polluting the test space. The test has also been updated to check for a correct LastReset time. Release note: none
2fd988c to
40bb260
Compare
lindseyjin
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng)
pkg/sql/sem/builtins/builtins_test.go, line 493 at r1 (raw file):
Previously, Azhng (Archer Zhang) wrote…
Found the problem:
CREATE TABLE t (k INT PRIMARY KEY); -- <- `k` here should be the primary key INSERT INTO t SELECT generate_series(1, 30); ALTER TABLE t SPLIT AT VALUES (10), (20); ALTER TABLE t EXPERIMENTAL_RELOCATE LEASE SELECT 1, 1; ALTER TABLE t EXPERIMENTAL_RELOCATE LEASE SELECT 2, 15; ALTER TABLE t EXPERIMENTAL_RELOCATE LEASE SELECT 3, 25; SELECT count(*) FROM t WHERE crdb_internal.reset_index_usage_stats();Actually this was caused by the fact that I didn't make the
ka primary key, so the range split was effectively no-op.
Done, thanks!
Azhng
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @Azhng)
|
bors r+ |
|
Build succeeded: |
Resolves #72254
Previously, the unit test for resetting index usage statistics was
failing. This was happening because of two reasons:
server configs.
statistics table right after it was reset, causing the check for a newly
reset table to sometimes randomly fail.
To fix these issues, we have implemented the following:
populated IndexUsageStatsController for DistSQL server configs.
avoid other tables polluting the test space. The test has also been
updated to check for a correct LastReset time.
Release note: none