sql/server: add function to reset index usage stats#71896
sql/server: add function to reset index usage stats#71896craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
1175760 to
14bf20d
Compare
maryliag
left a comment
There was a problem hiding this comment.
Your reset seems to be deleting some of the columns not the row itself. The Reset should delete everything from the table, meaning your second select should return an empty table and the index value would only show again once they're used.
Reviewed 8 of 13 files at r1.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @lindseyjin and @maryliag)
pkg/sql/idxusage/local_idx_usage_stats.go, line 278 at r1 (raw file):
} func (s *LocalIndexUsageStats) Reset() {
nit: exported methods need comments
pkg/sql/idxusage/local_idx_usage_stats.go, line 364 at r1 (raw file):
// GetLastReset returns the last time the table was reset. func (s *LocalIndexUsageStats) GetLastReset() time.Time {
I don't see this functions being used anywhere
Azhng
left a comment
There was a problem hiding this comment.
I think this actually should be expected behaivour. This was a conscious design choice we made when index usage stats was initially introduced. We would be showing the index usage statistics for all indexes even if they indexes has never been used at all.
The idea behind that is so that we can easily tell which index is under-utilized (ununsed index use case), since otherwise it would be difficult.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @lindseyjin and @maryliag)
Azhng
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @lindseyjin, and @maryliag)
pkg/server/index_usage_stats.go, line 120 at r1 (raw file):
ctx = s.AnnotateCtx(ctx) if _, err := s.privilegeChecker.requireViewActivityPermission(ctx); err != nil {
cc: @kevin-v-ngo , what do you think the permission here should be? (forResetSQLStats, we required admin user privilege )
pkg/server/index_usage_stats.go, line 169 at r1 (raw file):
} // It's unfortunate that we cannot use paginatedIterateNodes here because we
I don't think this comment applies here.
pkg/server/tenant_status.go, line 781 at r1 (raw file):
} // We issue a localReq instead of the incoming req to other nodes. This is
ditto
pkg/sql/idxusage/local_idx_usage_stats.go, line 364 at r1 (raw file):
Previously, maryliag (Marylia Gutierrez) wrote…
I don't see this functions being used anywhere
@lindseyjin I think what you can do here is to introduce another field in the IndexUsageStatisticsResponse proto to include this information
pkg/sql/sem/builtins/builtins.go, line 6078 at r1 (raw file):
return tree.MakeDBool(true), nil }, Info: `This function is used to clear the collected index usage metrics.`,
s/metrics/statistics, metrics and statistics are quite different things.
pkg/sql/sem/tree/eval.go, line 3443 at r1 (raw file):
type SQLStatsController interface { ResetClusterSQLStats(ctx context.Context) error ResetIndexUsageStats(ctx context.Context) error
Hmm my feeling is that this probably doesn't belong in this interface, since Index Usage Stats isn't really part of the sql stats. I suppose one thing we can do here is to introduce another interface to avoid circular dependency.
When I originally introduced this, I intended that this would be a 1-to-1 mapping to persistedsqlstats.Controller struct, but it seems like we are on the trajectory to introduce more and more function here. So I think an alternative you can do here would be:
type SQLStatsController interface {
ResetClusterSQLStats(ctx context.Context) error
CreateSQLStatsCompactionSchedule(ctx context.Context) error
}
type IndexUsageStatsController interface {
ResetIndexUsageStats(ctx context.Context) error
}
type ObservabilityController interface {
SQLStatsController
IndexUsageStatsController
}I think overall this would a more intuitive interface hierarchy, though it might be a little bit annoying to implement on the struct side :P
83f8c66 to
dcfe10f
Compare
lindseyjin
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @lindseyjin, and @maryliag)
pkg/server/index_usage_stats.go, line 169 at r1 (raw file):
Previously, Azhng (Archer Zhang) wrote…
I don't think this comment applies here.
Done, thanks for pointing that out!
pkg/server/tenant_status.go, line 781 at r1 (raw file):
Previously, Azhng (Archer Zhang) wrote…
ditto
Done.
pkg/sql/idxusage/local_idx_usage_stats.go, line 364 at r1 (raw file):
Previously, Azhng (Archer Zhang) wrote…
@lindseyjin I think what you can do here is to introduce another field in the
IndexUsageStatisticsResponseproto to include this information
Done!
pkg/sql/sem/builtins/builtins.go, line 6078 at r1 (raw file):
Previously, Azhng (Archer Zhang) wrote…
s/metrics/statistics, metrics and statistics are quite different things.
Done.
lindseyjin
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @lindseyjin, and @maryliag)
pkg/sql/sem/tree/eval.go, line 3443 at r1 (raw file):
Previously, Azhng (Archer Zhang) wrote…
Hmm my feeling is that this probably doesn't belong in this interface, since Index Usage Stats isn't really part of the sql stats. I suppose one thing we can do here is to introduce another interface to avoid circular dependency.
When I originally introduced this, I intended that this would be a 1-to-1 mapping to
persistedsqlstats.Controllerstruct, but it seems like we are on the trajectory to introduce more and more function here. So I think an alternative you can do here would be:type SQLStatsController interface { ResetClusterSQLStats(ctx context.Context) error CreateSQLStatsCompactionSchedule(ctx context.Context) error } type IndexUsageStatsController interface { ResetIndexUsageStats(ctx context.Context) error } type ObservabilityController interface { SQLStatsController IndexUsageStatsController }I think overall this would a more intuitive interface hierarchy, though it might be a little bit annoying to implement on the struct side :P
By this, are you suggesting that we replace all direct references to SQLStatsController with the ObservabilityController interface instead?
It seems like SQLStatsController is being used in a lot of parts of the codebase right now; should I be modifying all those references?
maryliag
left a comment
There was a problem hiding this comment.
My thinking here is that if you don't clean the table on reset there is not really a way to remove non longer existing indexes, unless the reset is actually removing non-longer existing indexes, is that the case?
If an index was never used, it won't be on the table, so there is your way of knowing it is under-utilized
Reviewed 5 of 8 files at r2, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @lindseyjin, and @maryliag)
dcfe10f to
ae651f1
Compare
Azhng
left a comment
There was a problem hiding this comment.
Index usage stats subsystem only store index access on-demand. That means, if index A is never touched by any statements, index usage stats doesn't have any data on index A.
For us to render that table, we actually loop through the descriptor catalog and look up into the index usage stats subsystem, if no stats is found within index usage subsystem, then we render the index read count as 0 and last read timestamp as NULL.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @lindseyjin, and @maryliag)
maryliag
left a comment
There was a problem hiding this comment.
If I have index A, B and C, I collect data for all of them, later I delete index A, do a reset on the table and then use index B.
Since only B has data I imagined that I would have only a row for B with its data and nothing else, but you're saying I would also be seeing C with value null.
What I don't want it's to see A on the table. So, would I be seeing A on the table after the reset?
Reviewed 2 of 8 files at r2, 2 of 2 files at r3, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @lindseyjin, and @maryliag)
Azhng
left a comment
There was a problem hiding this comment.
What I don't want it's to see A on the table. So, would I be seeing A on the table after the reset?
Since we fetch list of indexes from SQL Schema's catalog system, if A is deleted, then we will not be able to display index A in our table.
So that means if we have a scenario where we have index A B C, we accessed all 3 of them. Then later we deleted index A (without performing index usage stats reset), index A would be removed from the index usage stats table too as well.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @lindseyjin, and @maryliag)
ae651f1 to
c4d1cbe
Compare
lindseyjin
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @lindseyjin, and @maryliag)
pkg/sql/sem/tree/eval.go, line 3443 at r1 (raw file):
Previously, lindseyjin (Lindsey Jin) wrote…
By this, are you suggesting that we replace all direct references to SQLStatsController with the ObservabilityController interface instead?
It seems like SQLStatsController is being used in a lot of parts of the codebase right now; should I be modifying all those references?
Added in new IndexUsageStatsController interface, although no ObservabilityController interface, as discussed offline
Azhng
left a comment
There was a problem hiding this comment.
Nice. Seems like you also need to update the bazel build rules to pass CI.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @lindseyjin, and @maryliag)
pkg/sql/idxusage/index_usage_stats_controller.go, line 27 at r4 (raw file):
} // NewController returns a new instance of sqlstats.Controller.
nit: s/sqlstats.Controller/idxusage.Controller/
af03426 to
58f4b6a
Compare
lindseyjin
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @kevin-v-ngo, and @maryliag)
pkg/server/index_usage_stats.go, line 120 at r1 (raw file):
Previously, Azhng (Archer Zhang) wrote…
cc: @kevin-v-ngo , what do you think the permission here should be? (for
ResetSQLStats, we required admin user privilege )
Followed up with Kevin and he said to follow the same rules (admin) to be consistent
lindseyjin
left a comment
There was a problem hiding this comment.
Update: added some unit tests in tenant_status_test.go. Let me know if that looks fine!
Not sure if I also should be adding tests in index_usage_stats_test.go as well? I was having trouble accessing the ResetIndexUsageStats function from there, and it seemed like I would just be testing the same thing again.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @kevin-v-ngo, and @maryliag)
58f4b6a to
c5d8bed
Compare
Azhng
left a comment
There was a problem hiding this comment.
Which index_usage_stats_test.go are u referring to? Is it the one in the server pkg or the
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @kevin-v-ngo, @lindseyjin, and @maryliag)
pkg/ccl/serverccl/tenant_status_test.go, line 393 at r7 (raw file):
require.Equal(t, uint64(0), stats.Stats.TotalReadCount) require.Equal(t, time.Time{}, stats.Stats.LastRead) }
Nice!
Let's also test that this statistics is properly isolated across tenant-tenant boundary and the tenant-host boundary? the test helper should give you access to 2 separate tenant clusters (testing cluster vs controlled cluster) as well as a host cluster.
pkg/server/index_usage_stats.go, line 120 at r1 (raw file):
Previously, lindseyjin (Lindsey Jin) wrote…
Followed up with Kevin and he said to follow the same rules (admin) to be consistent
Sounds good 👍, Let's then change the view activity permission into admin permission here then
lindseyjin
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @kevin-v-ngo, @lindseyjin, and @maryliag)
pkg/server/index_usage_stats.go, line 120 at r1 (raw file):
Previously, Azhng (Archer Zhang) wrote…
Sounds good 👍, Let's then change the view activity permission into admin permission here then
Oh, I didn't realize that there was a different function used for checking admin user privilege. Both ResetSQLStats and IndexUsageStatistics currently use this line of code. Should I be updating all of them to require admin user permission? Or should I leave it the way it is for consistency?
Azhng
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @kevin-v-ngo, @lindseyjin, and @maryliag)
pkg/server/index_usage_stats.go, line 120 at r1 (raw file):
Previously, lindseyjin (Lindsey Jin) wrote…
Oh, I didn't realize that there was a different function used for checking admin user privilege. Both ResetSQLStats and IndexUsageStatistics currently use this line of code. Should I be updating all of them to require admin user permission? Or should I leave it the way it is for consistency?
Good call. It seems like the like RPC handler on the regular status server and the RPC handler on the tenant status server are requiring different permission. Seems like an oversight 🤦, let's change them all to requiring admin user.
c5d8bed to
584bbbc
Compare
Azhng
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @kevin-v-ngo, @lindseyjin, and @maryliag)
pkg/ccl/serverccl/tenant_status_test.go, line 404 at r8 (raw file):
Previously, lindseyjin (Lindsey Jin) wrote…
Hi, sorry do you mind clarifying that? I insert data in the control cluster earlier, and I'm asserting here that we'll see non zero stats after resetting the test cluster. Should I be doing something different?
Oh oops, I missed the non-zero part. My bad 😅
lindseyjin
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @kevin-v-ngo, @lindseyjin, and @maryliag)
pkg/sql/conn_executor.go, line 678 at r8 (raw file):
Previously, Azhng (Archer Zhang) wrote…
Sorry that I just noticed this right now, since we are already exporting local index stats above via
GetLocalIndexStatistics(), there's no need to forward the call here and below. Just directly use the struct returned fromGetLocalIndexStatistics()and call theReset()andGetLastReset()method on that struct .
Done!
Azhng
left a comment
There was a problem hiding this comment.
🔥 🔥 🔥
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @kevin-v-ngo, @lindseyjin, and @maryliag)
|
bors r+ |
|
bors r- |
|
Canceled. |
maryliag
left a comment
There was a problem hiding this comment.
Reviewed 1 of 8 files at r2, 3 of 8 files at r4, 2 of 2 files at r5, 1 of 3 files at r8, 7 of 9 files at r9, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @lindseyjin, and @maryliag)
pkg/server/index_usage_stats.go, line 120 at r1 (raw file):
Previously, Azhng (Archer Zhang) wrote…
Hmm good question, I don't know if there was speicfic reasoning here. When we took over the codebase, the endpoints that were behind the Statement/Transaction Page already has the
VIEW ACTIVITYpermission. Do you know if there's specific requirements for permissions on those pages from Product perspective?Also, I'm not sure if we restrict read access to
crdb_internal.*tables at all 🤔
We have two cases here: read only users and write users.
To view our pages on Console, the user requires just VIEW ACTIVITY, this is because admins give that permission to other people on their team so they can investigate the issues with the database etc, so to view the stats we should keep the permission as view activity
for the reset (index and stats), I think we should think more, do we want non-admin to be able to clear stats? If yes, then we use view activity for all cases, if no, this means we need to add a better handling on the clear stats on the UI, since a user can have the view activity permission, so they will see our pages, but the clear stats won't do anything and we need a way for them to know if that is the case.
d7e4394 to
212a10c
Compare
lindseyjin
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @kevin-v-ngo, @lindseyjin, and @maryliag)
pkg/server/sql_stats.go, line 30 at r10 (raw file):
ctx = s.AnnotateCtx(ctx) if _, err := s.privilegeChecker.requireViewActivityPermission(ctx); err != nil {
Noticed that the two different functions for ResetSQLStats held different permissions 🤔
Should I be changing this one to View Activity to be consistent, or was there a reason for that?
maryliag
left a comment
There was a problem hiding this comment.
Reviewed all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @kevin-v-ngo, @lindseyjin, and @maryliag)
pkg/server/sql_stats.go, line 30 at r10 (raw file):
Previously, lindseyjin (Lindsey Jin) wrote…
Noticed that the two different functions for ResetSQLStats held different permissions 🤔
Should I be changing this one to View Activity to be consistent, or was there a reason for that?
They should both be Admin
212a10c to
3cf5535
Compare
lindseyjin
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @kevin-v-ngo, @lindseyjin, and @maryliag)
pkg/server/sql_stats.go, line 30 at r10 (raw file):
Previously, maryliag (Marylia Gutierrez) wrote…
They should both be Admin
Ah ok got it, thanks!
maryliag
left a comment
There was a problem hiding this comment.
Reviewed 1 of 3 files at r10, 1 of 2 files at r11, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @Azhng, @kevin-v-ngo, @lindseyjin, and @maryliag)
|
bors r+ |
|
Build failed: |
Previously, there was no existing way to clear index usage statistics from the crdb_internal.index_usage_statistics table. This commit adds a function that enables developers to clear index usage metrics using RPC fanout to reach all nodes in a cluster. We have also added a new metadata field that tracks the last reset time. Currently, this functionality can be accessed via the SQL shell, and is not yet accessible from the frontend console. Release note (sql change): Add function crdb_internal.reset_index_usage_stats() to clear index usage stats. This can be invoked from the SQL shell.
|
Note: not sure why that failed since this test passes locally for me. Going to try re-triggering the pipeline |
3cf5535 to
f85b8ff
Compare
|
bors r+ |
|
Build succeeded: |
|
blathers backport 21.2.x |
|
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from f85b8ff to blathers/backport-release-21.2-71896: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 21.2.x failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
Previously, there was no existing way to clear index usage statistics
from the crdb_internal.index_usage_statistics table. This commit adds a
function that enables developers to clear index usage metrics using RPC
fanout to reach all nodes in a cluster. We have also added a new
metadata field that tracks the last reset time. Currently, this
functionality can be accessed via the SQL shell, and is not yet
accessible from the frontend console.
Release note (sql change): Add function
crdb_internal.reset_index_usage_stats() to clear index usage stats. This
can be invoked from the SQL shell.