sql: change system.tenant_usage to use a single consumption column#70778
sql: change system.tenant_usage to use a single consumption column#70778craig[bot] merged 2 commits intocockroachdb:masterfrom
Conversation
ajwerner
left a comment
There was a problem hiding this comment.
I think this is fine. I do get the drive to avoid the schema changes. The counter-argument to the pain of adding such columns is the relative ease of this change. When we go and do things like this, it does tend to leave me wondering about the experience that our customers have writing sql.
This is going to break the clusters which @darinpp has been prototyping with, no? That's probably fine, just want to call it out.
Reviewed 3 of 7 files at r1, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @andy-kimball, @celiala, and @RaduBerinde)
pkg/migration/migrations/tenant_usage.go, line 61 at r1 (raw file):
DROP COLUMN total_sql_pod_cpu_seconds, ADD COLUMN total_consumption BYTES FAMILY "primary"`, schemaExistsFn: hasColumn,
A paranoid part of me might make me feel better if there were some way to compose this hasColumn check with some form of notHasColumn that ensures the other columns are gone. I'm not too worried about it. This framework is a bit brittle for my liking but alas, it's better than nothing.
pkg/sql/catalog/systemschema/system.go, line 591 at r1 (raw file):
-- Current shares value for this instance. instance_shares FLOAT,
nit while you're here: align the indentation of this column and generally within this statement
pkg/sql/catalog/systemschema/system.go, line 591 at r1 (raw file):
-- Current shares value for this instance. instance_shares FLOAT,
would we ever be in the game of protobuf-ing instance state too?
5e130e2 to
20033b9
Compare
RaduBerinde
left a comment
There was a problem hiding this comment.
TFTR!
The migration will reset existing tenant consumption on existing clusters, but shouldn't lead to crashes.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner, @andy-kimball, and @celiala)
pkg/migration/migrations/tenant_usage.go, line 61 at r1 (raw file):
Previously, ajwerner wrote…
A paranoid part of me might make me feel better if there were some way to compose this
hasColumncheck with some form ofnotHasColumnthat ensures the other columns are gone. I'm not too worried about it. This framework is a bit brittle for my liking but alas, it's better than nothing.
Added a TODO
pkg/sql/catalog/systemschema/system.go, line 591 at r1 (raw file):
Previously, ajwerner wrote…
nit while you're here: align the indentation of this column and generally within this statement
Fixed. There was a mix of tabs and spaces, changed everything to spaces.
pkg/sql/catalog/systemschema/system.go, line 591 at r1 (raw file):
Previously, ajwerner wrote…
would we ever be in the game of protobuf-ing instance state too?
I don't think so..
RaduBerinde
left a comment
There was a problem hiding this comment.
The counter-argument to the pain of adding such columns is the relative ease of this change
The bigger problem is not the change itself but the versioning - we can't add new versions in patch releases (right?) and that would make it impossible to add an RU component to 21.2.x.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner, @andy-kimball, and @celiala)
|
Should we consider having a strongly-typed "total_ru_usage" column, but a weakly typed protobuf column for the component metrics? Having to crack a protobuf to see how many RUs have been used seems very painful. |
|
Fortunately the |
20033b9 to
cf777f0
Compare
|
Great idea, added a commit: sql: add crdb_internal.tenant_usage_detailsThis commit adds a virtual view which decodes the tenant consumption Release note: None Release justification: Necessary change for supporting changes to the |
cf777f0 to
3230c2f
Compare
045d3b5 to
76202f7
Compare
ajwerner
left a comment
There was a problem hiding this comment.
Reviewed 1 of 7 files at r1, 1 of 2 files at r3, 2 of 5 files at r6, 9 of 11 files at r7, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @ajwerner, @andy-kimball, @celiala, @dt, and @RaduBerinde)
pkg/ccl/backupccl/backup_planning_tenant.go, line 37 at r6 (raw file):
tenants.id = tenant_usage.tenant_id AND tenant_usage.instance_id = 0` func tenantMetadataFromRow(row tree.Datums) (descpb.TenantInfoWithUsage, error) {
defensive programming thought: want to assert the number of columns here? I'm not really worried about it but an assertion failure would be better than a panic if it were to happen.
pkg/sql/crdb_internal.go, line 5326 at r7 (raw file):
Quoted 18 lines of code…
CREATE VIEW crdb_internal.tenant_usage_details AS SELECT tenant_id, (j->>'rU')::FLOAT AS total_ru, (j->>'readBytes')::INT AS total_read_bytes, (j->>'readRequests')::INT AS total_read_requests, (j->>'writeBytes')::INT AS total_write_bytes, (j->>'writeRequests')::INT AS total_write_requests, (j->>'sqlPodsCpuSeconds')::FLOAT AS total_sql_pod_seconds FROM ( SELECT tenant_id, crdb_internal.pb_to_json('cockroach.roachpb.TenantConsumption', total_consumption) AS j FROM system.tenant_usage WHERE instance_id = 0 )`,
nit: sqlfmt this again?
CREATE VIEW crdb_internal.tenant_usage_details AS
SELECT
tenant_id,
(j->>'rU')::FLOAT8 AS total_ru,
(j->>'readBytes')::INT8 AS total_read_bytes,
(j->>'readRequests')::INT8 AS total_read_requests,
(j->>'writeBytes')::INT8 AS total_write_bytes,
(j->>'writeRequests')::INT8 AS total_write_requests,
(j->>'sqlPodsCpuSeconds')::FLOAT8 AS total_sql_pod_seconds
FROM
(
SELECT
tenant_id,
crdb_internal.pb_to_json(
'cockroach.roachpb.TenantConsumption',
total_consumption
)
AS j
FROM
system.tenant_usage
WHERE
instance_id = 0
);76202f7 to
4f2fe07
Compare
RaduBerinde
left a comment
There was a problem hiding this comment.
TFTR!
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner, @andy-kimball, @celiala, and @dt)
pkg/ccl/backupccl/backup_planning_tenant.go, line 37 at r6 (raw file):
Previously, ajwerner wrote…
defensive programming thought: want to assert the number of columns here? I'm not really worried about it but an assertion failure would be better than a panic if it were to happen.
Done.
pkg/sql/crdb_internal.go, line 5326 at r7 (raw file):
Previously, ajwerner wrote…
CREATE VIEW crdb_internal.tenant_usage_details AS SELECT tenant_id, (j->>'rU')::FLOAT AS total_ru, (j->>'readBytes')::INT AS total_read_bytes, (j->>'readRequests')::INT AS total_read_requests, (j->>'writeBytes')::INT AS total_write_bytes, (j->>'writeRequests')::INT AS total_write_requests, (j->>'sqlPodsCpuSeconds')::FLOAT AS total_sql_pod_seconds FROM ( SELECT tenant_id, crdb_internal.pb_to_json('cockroach.roachpb.TenantConsumption', total_consumption) AS j FROM system.tenant_usage WHERE instance_id = 0 )`,nit: sqlfmt this again?
CREATE VIEW crdb_internal.tenant_usage_details AS SELECT tenant_id, (j->>'rU')::FLOAT8 AS total_ru, (j->>'readBytes')::INT8 AS total_read_bytes, (j->>'readRequests')::INT8 AS total_read_requests, (j->>'writeBytes')::INT8 AS total_write_bytes, (j->>'writeRequests')::INT8 AS total_write_requests, (j->>'sqlPodsCpuSeconds')::FLOAT8 AS total_sql_pod_seconds FROM ( SELECT tenant_id, crdb_internal.pb_to_json( 'cockroach.roachpb.TenantConsumption', total_consumption ) AS j FROM system.tenant_usage WHERE instance_id = 0 );
Done. Again mixing tabs with spaces was the problem.
|
Unrelated flake (filed #70839). bors r+ |
|
Build failed (retrying...): |
|
Looks like the generated code is not up to date. bors r- |
|
Canceled. |
How come? The failure looks like a flake in |
|
I'm looking at the CI build on this PR in isolation: https://teamcity.cockroachdb.com/viewLog.html?buildId=3510938&buildTypeId=Cockroach_UnitTests |
And? There is a flake (TestMultiRegionDataDriven/regional_by_table, tracked by #70846) and a timeout.. |
|
Ah, thanks, I missed that! |
There are ongoing discussions about adding more components to the RU
cost. The prospect of having to change the columns of
`system.tenant_usage` each time is daunting.
This commit modifies the table to use a single `total_consumption`
column which encodes the TenantConsumption proto. We can add fields to
the proto in the future without changing the schema.
The values are still readable via SQL using `pb_to_json`:
```
crdb_internal.pb_to_json('cockroach.roachpb.TenantConsumption', total_consumption)
```
Informs cockroachdb#68479.
Release note: None
Release justification: Necessary change for supporting changes to the
cost model for the upcoming Serverless MVP release. This functionality
is only enabled in multi-tenant scenarios and should have no impact on
our dedicated customers.
This commit adds a virtual view which decodes the tenant consumption
details. Example:
```
root@:26257/defaultdb> select * from crdb_internal.tenant_usage_details;
tenant_id | total_ru | total_read_bytes | total_read_requests | total_write_bytes | total_write_requests | total_sql_pod_seconds
------------+-------------------+------------------+---------------------+-------------------+----------------------+------------------------
123 | 9618.311247130436 | 751504 | 4660 | 47485 | 1877 | 4.454030256529996
```
Release note: None
Release justification: Necessary change for supporting changes to the
cost model for the upcoming Serverless MVP release. This functionality
is only enabled in multi-tenant scenarios and should have no impact on
our dedicated customers.
4f2fe07 to
2ad36e6
Compare
|
bors r+ |
|
Build succeeded: |
|
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from dd7cbf2 to blathers/backport-release-21.2-70778: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 21.2.x failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
jaylim-crl
left a comment
There was a problem hiding this comment.
If I understand correctly, tenant_usage_details is going to be available only to the KV cluster, right? I'd imagine we won't leak data if a regular tenant tries to query that internal table?
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale)
|
Correct. The tenant has their own tenant_usage table which should just be empty (and modifying it would be inconsequential). Ideally, these wouldn't exist at all on the tenant but I think we're missing a bit of infrastructure to make that happen. |
The part of the backup code that retrieves the tenant metadata was updated in cockroachdb#70778. The new code depends on the `system.tenant_usage` table which is not present in 21.1. Consequently, this query fails when running 21.2 code on a cluster that was not finalized to 21.2 version. This change fixes this by adding an alternate variant of the query when the tenant usage table is not present (according to the cluster version). Fixes cockroachdb#72839. Release note (bug fix): Fixes "backup-lookup-tenants: descriptor not found" when doing full cluster backup while upgrading from 21.1 to 21.2.
The part of the backup code that retrieves the tenant metadata was updated in cockroachdb#70778. The new code depends on the `system.tenant_usage` table which is not present in 21.1. Consequently, this query fails when running 21.2 code on a cluster that was not finalized to 21.2 version. This change fixes this by adding an alternate variant of the query when the tenant usage table is not present (according to the cluster version). Fixes cockroachdb#72839. Release note (bug fix): Fixes "backup-lookup-tenants: descriptor not found" when doing full cluster backup while upgrading from 21.1 to 21.2.
The part of the backup code that retrieves the tenant metadata was updated in cockroachdb#70778. The new code depends on the `system.tenant_usage` table which is not present in 21.1. Consequently, this query fails when running 21.2 code on a cluster that was not finalized to 21.2 version. This change fixes this by adding an alternate variant of the query when the tenant usage table is not present (according to the cluster version). Fixes cockroachdb#72839. Release note (bug fix): Fixes "backup-lookup-tenants: descriptor not found" when doing full cluster backup while upgrading from 21.1 to 21.2.
The part of the backup code that retrieves the tenant metadata was updated in #70778. The new code depends on the `system.tenant_usage` table which is not present in 21.1. Consequently, this query fails when running 21.2 code on a cluster that was not finalized to 21.2 version. This change fixes this by adding an alternate variant of the query when the tenant usage table is not present (according to the cluster version). Fixes #72839. Release note (bug fix): Fixes "backup-lookup-tenants: descriptor not found" when doing full cluster backup while upgrading from 21.1 to 21.2.

There are ongoing discussions about adding more components to the RU
cost. The prospect of having to change the columns of
system.tenant_usageeach time is daunting.This commit modifies the table to use a single
total_consumptioncolumn which encodes the TenantConsumption proto. We can add fields to
the proto in the future without changing the schema.
The values are still readable via SQL using
pb_to_json:Informs #68479.
Release note: None
Release justification: Necessary change for supporting changes to the
cost model for the upcoming Serverless MVP release. This functionality
is only enabled in multi-tenant scenarios and should have no impact on
our dedicated customers.