ts: add support for multitenant timeseries metrics #98077
craig[bot] merged 1 commit into cockroachdb:master
Conversation
8280a9f to abc87f6
abc87f6 to 7c9378d
aadityasondhi
left a comment
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @andreimatei)
pkg/ts/server_test.go line 480 at r1 (raw file):
tenant, _ := serverutils.StartTenant(t, s, base.TestTenantArgs{TenantID: roachpb.MustMakeTenantID(10)})
I believe this is causing the CI failures. Could use some guidance on how to structure this part of the test.
@herkolategan could you assist Aditya here?
abarganier
left a comment
Nice work - I have some nits but overall I'm happy with how these changes are minimally invasive to the existing code 🙌
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @aadityasondhi and @andreimatei)
pkg/ts/query.go line 446 at r1 (raw file):
timespan QueryTimespan, mem QueryMemoryContext, secondarySource string,
nit: can we explain what secondarySource is in the comments, or maybe better, rename the variable? It's not immediately obvious what this variable does. It seems like secondarySource will only ever be a tenant ID, though, so I think it'd make sense to make the name tenant-specific.
I also kind of wonder if this secondary source should be a part of the tspb.Query 🤔. We already have precedent for modifying the request object at the authorizer step by setting the tenant ID at the top level, so maybe we could instead do something similar for each query included in the request. I don't want this idea to turn into a bikeshed, so if that would throw a wrench into things then disregard.
Code quote:
secondarySource string,
pkg/ts/query.go line 863 at r1 (raw file):
b.Get(key)
} else {
// In the case where a secondary source is not specified, we should scan
nit: consider elaborating a bit more in this comment about why we do a Scan instead of a Get.
Code quote:
// In the case where a secondary source is not specified, we should scan
// all keys that prefix match the primary source.
herkolategan
left a comment
@knz @aadityasondhi I took a stab at solving the failure on CI, but I lack knowledge in a few areas as to why the test is failing. Nonetheless I did get it to pass locally by changing some things:
I noted the failure on CI was due to authentication issues:
E230309 22:34:33.498933 2517 2@rpc/context.go:2335 [T1,n1,rnode=1,raddr=127.0.0.1:34803,class=default,rpc] 609 unable to connect (is the peer up and reachable?): initial connection heartbeat failed: grpc: connection error: desc = "transport: authentication handshake failed: x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"Cockroach CA\")" [code 14/Unavailable]
To fix that I swapped out to the tenant's RPC for connecting:
tenantConn, err := tenant.(*server.TestTenant).RPCContext().GRPCDialNode(tenant.(*server.TestTenant).Cfg.AdvertiseAddr, tsrv.NodeID(), rpc.DefaultClass).Connect(context.Background())
Next I ran into:
server_test.go:514: rpc error: code = Unauthenticated desc = client tenant does not have capability to query timeseries data
Capabilities can be altered by connecting to the test server:
db.Exec("ALTER TENANT [10] GRANT CAPABILITY can_view_tsdb_metrics=true;\n")
Full very crude solution / guidance:
// StartServer code at start of test
s, db, _ := serverutils.StartServer(t, base.TestServerArgs{
DisableDefaultTestTenant: true,
...
// Later down the line where tenant is started
tenant, _ := serverutils.StartTenant(t, s, base.TestTenantArgs{TenantID: roachpb.MustMakeTenantID(10)})
_, err := db.Exec("ALTER TENANT [10] GRANT CAPABILITY can_view_tsdb_metrics=true")
if err != nil {
t.Fatal(err)
}
time.Sleep(5 * time.Second) // I don't know how to flush capabilities, yet, so just wait?
tenantConn, err := tenant.(*server.TestTenant).RPCContext().GRPCDialNode(tenant.(*server.TestTenant).Cfg.AdvertiseAddr, tsrv.NodeID(), rpc.DefaultClass).Connect(context.Background())
if err != nil {
t.Fatal(err)
}
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @aadityasondhi and @andreimatei)
herkolategan
left a comment
Just a follow-up, the time.Sleep is obviously just a placeholder while I browsed around for something better. A search for the function WaitForTenantCapabilities yields a possible better way after altering the capability.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @aadityasondhi and @andreimatei)
herkolategan
left a comment
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @aadityasondhi and @andreimatei)
pkg/ts/server_test.go line 480 at r1 (raw file):
Previously, aadityasondhi (Aaditya Sondhi) wrote…
I believe this is causing the CI failures. Could use some guidance on how to structure this part of the test.
Left a comment above with a possible solution.
andreimatei
left a comment
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @aadityasondhi, @abarganier, and @herkolategan)
pkg/server/status/recorder.go line 332 at r1 (raw file):
registry: mr.mu.nodeRegistry,
format: nodeTimeSeriesPrefix,
source: ts.MakeSource(mr.mu.desc.NodeID.String(), ""),
nit: "" /* secondarySource */
pkg/ts/keys.go line 122 at r1 (raw file):
// If a secondary source is specified, the returned source will be in the format
// `[primary]-[secondary]`.
func MakeSource(primarySource string, secondarySource string) string {
I would call them nodeID and tenantID. I think their interpretation is pretty ingrained in the different layers, I doubt that it's useful to consider it generic.
pkg/ts/query.go line 446 at r1 (raw file):
Previously, abarganier (Alex Barganier) wrote…
nit: can we explain what secondarySource is in the comments, or maybe better, rename the variable? It's not immediately obvious what this variable does. It seems like secondarySource will only ever be a tenant ID, though, so I think it'd make sense to make the name tenant-specific.
I also kind of wonder if this secondary source should be a part of the tspb.Query 🤔. We already have precedent for modifying the request object at the authorizer step by setting the tenant ID at the top level, so maybe we could instead do something similar for each query included in the request. I don't want this idea to turn into a bikeshed, so if that would throw a wrench into things then disregard.
+1
pkg/ts/query.go line 544 at r1 (raw file):
mem QueryMemoryContext, dest *[]tspb.TimeSeriesDatapoint, sourceSet map[string]struct{},
Please add a comment about sourceSet and specify that it's an output param.
And then move secondarySource before the output ones.
pkg/ts/query.go line 580 at r1 (raw file):
for source := range sourceSpans {
_, source2 := DecodeSource(source)
if secondarySource != source2 {
I think we should push the filter as far down as possible - inside readAllSourcesFromDatabase.
pkg/ts/query.go line 857 at r1 (raw file):
for currentTimestamp := startTimestamp; currentTimestamp <= timespan.EndNanos; currentTimestamp += kd {
for _, source := range sources {
// If a secondary source is specified, only query data for that tenant source.
see, even you can't stay away from talking about tenants :P
pkg/ts/query.go line 863 at r1 (raw file):
b.Get(key)
} else {
// In the case where a secondary source is not specified, we should scan
s/should//
pkg/ts/query.go line 877 at r1 (raw file):
for _, result := range b.Results {
for _, row := range result.Rows {
if row.Value == nil {
This is here for the case where we do a Get that doesn't find the key, right? I'd lift this guard up, for clarity (and a bit of efficiency).
for _, result := range b.Results {
if len(result.Rows) == 1 && result.Rows[0].Value == nil {
// This came from a Get...
continue
}
rows = append(rows, result.Rows...)
}
pkg/ts/server.go line 105 at r1 (raw file):
type TenantServer struct {
tspb.UnimplementedTimeSeriesServer
please put a comment on this embedding. What part of the service is not implemented?
pkg/ts/tspb/timeseries.proto line 126 at r1 (raw file):
optional int64 sample_nanos = 4 [(gogoproto.nullable) = false];
// tenantID of the tenant requesting the query.
optional roachpb.TenantID tenant_id = 5 [(gogoproto.nullable) = false, (gogoproto.customname) = "TenantID"];
I was imagining that we'd put the tenant ID in the Query, as source_tenant_id (and rename the existing source to source_node unless we need to access this RPC through grpcGateway and that makes renaming fields not be backwards compatible). This way the client more explicitly expresses its intent, and the system tenant would be able to query any other tenant it wants.
Did you consider that?
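For illustration, the per-query shape being suggested might look something like the following sketch. The field name `source_tenant_id` comes from the comment above, but the field number and options are hypothetical, not the actual proto:

```
message Query {
  // Existing node-scoped source; renaming it to source_node could break
  // grpc-gateway clients, so it is shown unchanged here.
  optional string source = 2 [(gogoproto.nullable) = false];
  // Hypothetical per-query tenant scope: set by the authorizer for
  // secondary tenants, and freely settable by the system tenant to
  // query any tenant it wants.
  optional roachpb.TenantID source_tenant_id = 7 [(gogoproto.nullable) = false];
}
```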
f0454b4 to f948fb6
Build succeeded.
In cockroachdb#98077, we introduced tenant scoped metrics into tsdb. The TenantIDs were part of the source field used for NodeIDs previously. Since the intended purpose for the system tenant was to be able to aggregate across all tenant metrics, we introduced a new scan over all the sources for the given Node source.
This introduced a bug: when we prefix match the source to scan over its keyspace, it includes all sources that match that prefix. For a 10 node cluster with 3 tenants, this meant fetching timeseries for Node 1 would include:
```
1
1-2
1-3
10
10-2
10-3
```
This would then aggregate data across Node 1 and Node 10. If there were no secondary tenants, it would include the sources:
```
1
10
```
This fix separates the get of the initial system tenant metric and the scan across all tenants (with a tenant prefix) so that only the desired source is scanned. Now in the same 10 node cluster with 3 tenants, it will include:
```
1
1-2
1-3
```
The scan now runs from the beginning of `1-` to its end, ensuring only Node 1's tenants are included and not Node 10 or 100 or 1000.
Fixes: cockroachdb#99486
Release note: None
99511: ts: fix tenant metrics scan in tsdb r=aadityasondhi a=aadityasondhi
In #98077, we introduced tenant scoped metrics into tsdb. These metrics were part of the source field used for NodeIDs previously. Since the intended purpose for the system tenant was to be able to aggregate across all tenant metrics, we introduced a new scan over the entire keyspace for the given Node source.
This introduced a bug: when we prefix match the source to scan over its keyspace, it includes all sources that match that prefix. For a 10 node cluster with 3 tenants, this meant fetching timeseries for Node 1 would include:
```
1
1-2
1-3
10
10-2
10-3
```
This would then aggregate data across Node 1 and Node 10. If there were no secondary tenants, it would include the sources:
```
1
10
```
This fix separates the get of the initial system tenant metric and the scan across all tenants (with a tenant prefix) so that only the desired source is scanned. Now in the same 10 node cluster with 3 tenants, it will include:
```
1
1-2
1-3
```
The scan now runs from the beginning of `1-` to its end, ensuring only Node 1's tenants are included and not Node 10 or 100 or 1000.
Fixes: #99486
Release note: None
99521: logictest: ensure generated ccl build files have the right tags r=andyyang890 a=healthy-pod
Release note: None
Epic: none
Co-authored-by: Aaditya Sondhi <20070511+aadityasondhi@users.noreply.github.com>
Co-authored-by: healthy-pod <ahmad@cockroachlabs.com>
Previously, while we had the ability to show tenant-level store metrics in the `/_status/vars` page, these metrics were never written to tsdb. This is despite the changes in cockroachdb#98077, which did a great job at writing all the metrics in the tenant-specific metric registries, but didn't pull the tenant-specific store metrics out of the store registries. This is because these metrics exist as *child* metrics on the store registry metrics, and we did not previously have logic to individually pick these metrics out of their parent AggGauge/Counter metrics.
This patch adds the logic to do so. Now, for each tenant ID that exists in the recorder's `tenantRegistries` map, we will attempt to pick that tenant's individual child metric & values from all the metrics that exist in TenantsStorageMetrics. This will limit the writing of these tenant-level metrics to only happen in deployments where multiple tenants are running in-process, as environments such as serverless clusters are expected to have an empty `tenantRegistries` map in this recorder.
Release note: none
99860: tsdb: add tenant-level store metrics to tsdb r=aadityasondhi a=abarganier
Previously, while we had the ability to show tenant-level store metrics in the `/_status/vars` page, these metrics were never written to tsdb. This is despite the changes in #98077, which did a great job at writing all the metrics in the tenant-specific metric registries, but didn't pull the tenant-specific store metrics out of the store registries. This is because these metrics exist as *child* metrics on the store registry metrics, and we did not previously have logic to individually pick these metrics out of their parent AggGauge/Counter metrics.
This patch adds the logic to do so. Now, for each tenant ID that exists in the recorder's `tenantRegistries` map, we will attempt to pick that tenant's individual child metric & values from all the metrics that exist in TenantsStorageMetrics. This will limit the writing of these tenant-level metrics to only happen in deployments where multiple tenants are running in-process, as environments such as serverless clusters are expected to have an empty `tenantRegistries` map in this recorder.
This is necessary because if we're going to support multi-tenant tsdb, app tenants should be able to see core storage information about their logical cluster, such as `livebytes`, which indicates how much live active data exists for the cluster.
Release note: none
Fixes: #99228
100550: sql: adjust a couple of memory monitoring tests r=yuzefovich a=yuzefovich
This commit adjusts a couple of memory monitoring related tests. `TestAggregatesMonitorMemory` has been rewritten to observe the correct memory monitor via the `crdb_internal.node_memory_monitors` virtual table. `TestEvaluatedMemoryIsChecked` is just straight up removed. Initially, this test was expected to verify that builtin functions like `repeat` perform memory accounting of the intermediate result via our memory accounting system. However, that changed a long time ago in 2b00f15, and now such builtins estimate their result size and return an `errStringTooLarge` error, so the test was no longer verifying what it intended. This commit removes this test since we do verify the behavior introduced in 2b00f15 elsewhere (in the logic tests).
Fixes: #79014.
Fixes: #100119.
Release note: None
Co-authored-by: Alex Barganier <abarganier@crlMBP-NR362FWT1GODM0.local>
Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
This patch allows tsdb to support tenant-separated timeseries metrics.
The changes can be summarized as:
(1) Changes in the write path:
In pkg/status we now store metrics corresponding to non-system tenants with their tenant IDs. This makes the source component of the ts key include both node IDs and tenant IDs, in the format `[primary]-[secondary]`.
Note: the system tenant does not include the tenant ID to allow for backwards compatibility.
(2) Changes in the query path:
(2.1) as system tenant:
If the request to ts server comes annotated as system tenant, we
aggregate timeseries across all tenants in that metric name's keyspace.
(2.2) as secondary tenant:
When a secondary tenant queries the server, it is routed through tenant connect and checked for the tenant capability to view tsdb data. During this stage the request is annotated with the Tenant ID. When the system tenant ts server receives this query with a Tenant ID set, it scans the keyspace for the NodeID and then filters out the results that do not match the TenantID, so only sources matching that tenant ID are returned.
These changes ensure that the system tenant is able to have a view of
metrics across all tenants and application tenants have access to their
own metrics. These changes are all done entirely server-side so no
client changes are needed. The client will be automatically served the
timeseries data it has access to based on its tenant capability.
Fixes: #96438.
Release note (ui change): Timeseries metrics in db console will show tenant-specific metrics. For the system tenant, these timeseries will be aggregated across all tenants. For a secondary tenant, only metrics belonging to that particular tenant will be shown.