server: hash tenantID into ClusterID used in SQL exec #74005
craig[bot] merged 1 commit into cockroachdb:master
Conversation
I looked at who all uses ClusterID in pkg/sql and it looks like there are really just two classes of use: case (a) is indifferent to what the value is -- it just shows it to you. Looking at pkg/ccl, it has a bunch more cases of (b), and then there's backup's case, where we really do need the semantics described here, i.e. no two tenants ever have the same value.
knz
left a comment
Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt and @RaduBerinde)
pkg/server/server_sql.go, line 528 at r1 (raw file):
// hash of tenant ID achieves this.
if !codec.ForSystemTenant() {
	clusterIDForSQL = &base.ClusterIDContainer{}
Can you change this mechanism to switch to writing to cfg.ClusterIDContainer, thanks
knz
left a comment
It also might be worth adding a unit test that checks that for a single host cluster with two tenants, all 3 things (system tenant plus each of the two non-system tenants) see different cluster IDs.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt and @RaduBerinde)
knz
left a comment
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt and @RaduBerinde)
pkg/server/server_sql.go, line 528 at r1 (raw file):
Previously, knz (kena) wrote…
Can you change this mechanism to switch to writing to
cfg.ClusterIDContainer, thanks
Hm actually, can we talk about this, I don't exactly understand what is going on here.
dt
left a comment
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz and @RaduBerinde)
pkg/server/server_sql.go, line 528 at r1 (raw file):
Previously, knz (kena) wrote…
Hm actually, can we talk about this, I don't exactly understand what is going on here.
I expanded the comment a bit as a starting point, since if it isn't clear to someone who talked it through previously from the diff, then I clearly needed to do more for future readers too. Does the expanded comment help at all?
knz
left a comment
Reviewed all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt, @knz, and @RaduBerinde)
pkg/server/server_sql.go, line 528 at r1 (raw file):
Previously, dt (David Taylor) wrote…
I expanded the comment a bit as a starting point, since if it isn't clear to someone who talked it through previously from the diff, then I clearly needed to do more for future readers too. Does the expanded comment help at all?
yep that is good, thanks
pkg/server/server_sql.go, line 525 at r2 (raw file):
// clusterIDForSQL is the cluster ID that SQL processes use.
//
// Not that this is separate from, but sometimes pointed at, the ClusterID in
nit: Note
pkg/server/server_sql.go, line 539 at r2 (raw file):
// interacting with processes on the same cluster or not. For example, when a
// BACKUP job resumes writing to some external files, it is vital that be able
// to verify that the data in those files what wriitten by the same cluster,
nit: written
pkg/server/server_sql.go, line 545 at r2 (raw file):
// Thus for the non-system tenant, this ClusterID is derived from the unique
// host cluster's ID, but with the TenantID hased into its lower bits to make
// an ID that is unique to this tenant on this host cluster.
Can you also add:
Note: we are careful not to populate this per-tenant cluster ID into the
ClusterIDContainer referenced to by cfg.rpcContext, which is also the same
one referenced to by cfg.ClusterIDContainer, cfg.idProvider and several other places.
There are two reasons for this. The main reason is that these references to the
host cluster's cluster ID are used in logging and telemetry to ensure that
log/telemetry entries from tenants connected to the same host cluster
are identified as such, using a common cluster ID.
And I'd like to see these various assertions exercised in unit tests:
- check that the SQL idea of cluster ID indeed ends up different
- check that cluster IDs included in the log entries (esp with output format JSON) remain the same
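The separation being requested here -- keeping the per-tenant SQL cluster ID in its own container, out of the shared container that logging and the rpcContext read -- can be sketched roughly as follows. The container type and the ID strings are illustrative stand-ins for `base.ClusterIDContainer` and real UUIDs, not CockroachDB's actual code:

```go
package main

import "fmt"

// clusterIDContainer is a stand-in for base.ClusterIDContainer.
type clusterIDContainer struct{ id string }

func (c *clusterIDContainer) Set(id string) { c.id = id }
func (c *clusterIDContainer) Get() string   { return c.id }

func main() {
	// Shared container: referenced by cfg.rpcContext, logging, telemetry.
	hostContainer := &clusterIDContainer{}
	hostContainer.Set("host-cluster-uuid")

	// Non-system tenant: a *separate* container holds SQL's cluster ID,
	// so writing the tenant-derived value does not leak into log entries.
	sqlContainer := &clusterIDContainer{}
	sqlContainer.Set("tenant-derived-uuid")

	fmt.Println(sqlContainer.Get())  // SQL sees the tenant-unique ID
	fmt.Println(hostContainer.Get()) // log entries still carry the host cluster ID
}
```

The unit tests suggested above would assert both properties: the SQL-visible IDs differ per tenant, while the ID stamped on log entries stays the host cluster's.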
@dt: can you file an issue and link it from the PR, then ping @mwang1026 on it? Michael and Jon need it for accounting purposes.
dt
left a comment
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt, @knz, and @RaduBerinde)
pkg/server/server_sql.go, line 545 at r2 (raw file):
main reason is that these references to the host cluster's cluster ID are used in logging and telemetry
er, I think the main reason is that that rpcContext is what we use to make RPCs to the host cluster, so it needs to have a matching cluster ID in it. any logging / telemetry usage is secondary.
dt
left a comment
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt, @knz, and @RaduBerinde)
pkg/server/server_sql.go, line 528 at r1 (raw file):
Previously, knz (kena) wrote…
yep that is good, thanks
Done.
pkg/server/server_sql.go, line 545 at r2 (raw file):
check that cluster IDs included in the log entries (esp with output format JSON) remain the same
Hm, I think we should log them both, separately. The tenant cluster -- that is, the sql pods -- have their cluster ID, and then they're hosted by/connected to some kv host cluster to provide their storage. We should log both of those, not just the latter.
knz
left a comment
Reviewed 1 of 2 files at r2, 3 of 3 files at r3, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt and @RaduBerinde)
pkg/server/server_sql.go, line 545 at r2 (raw file):
I think the main reason is that that rpcContext is what we use to make RPCs to the host cluster
Yes you're right. Note I wrote "there are two reasons" but I only wrote one. You can insert the other first :)
I think we should log them both, separately
What you did for the messages at server startup is a-ok 💯
However this comment here (and the reason for me to insist this be documented) is that separately we also log the host cluster ID on every log entry, all the time for the purpose of routing log data to the right collection point. That value needs to be correct and not tweaked. And that's also what deserves being validated in a unit test.
Maybe you can phrase this better than I did in the comment?
Do you need help with moving this forward?
Yeah, I had to push this to the back burner for some more urgent bug fixes and backports over the last couple weeks, so I haven't gotten around to writing the relevant tests.

One thing to note: this currently does change the cluster ID reported by a pod when it sends its usage telemetry, to the new tenant-unique cluster ID. I'm of two minds on this. On the one hand, I think this is Good/Ideal/Correct from an ideological point. Tenant A and Tenant B are two distinct clusters from the point of view of using sql features, making tables, running types of queries, etc., and being on the same host cluster or not is irrelevant to that, so when reporting and collecting telemetry on that sql usage, reporting and tracking it under their separate cluster IDs, the same as if they were two separate dedicated clusters, seems most correct to me. If you want to know how many clusters have more than 1000 tables, counting up the distinct cluster IDs that reported >1k table count should tell you the right answer regardless of how many were dedicated, serverless on separate hosts, or serverless but on the same host. Now of course we might also want to look for trends correlated to host clusters -- does a certain host cluster skew some reported metric or something -- so we'd also want to include a new, separate hostClusterID field from serverless clusters, but from a principled view on what a unique cluster is, I think I like each tenant sending its telemetry using its individual cluster ID as its "cluster".

On the other hand, the existing reporting has already been extended to handle distinct serverless clusters not reporting under distinct cluster IDs, and instead has started treating the pair of
let's ask @thtruo what a good design would be here.
Discussed with @thtruo. If that is correct, then we're in the clear with making the cluster ID work the way you've defined here.
Thanks for resolving that! I feel like there's a similar argument to the one I made above to be made about the cluster ID in structured logging messages as well -- that each serverless tenant cluster should emit log messages tagged with its own unique cluster ID as the cluster, rather than the cluster ID of the host cluster it is attached to (which could go in a separate field, or be determined by looking at the high bits), but I see you implied otherwise in earlier comments here. I don't think I feel strongly enough to argue the point though, so I think I'd like to also leave that determination to server folks.
Quick question here for my clarity: Is the proposed solution then to have clusterId be the same as the unique id used across the control plane and in crdb intrusion tables? Or would this hash of clusterId and tenantId result in a new id? If it's the latter, then telemetry logging will not be able to link events to specific clusters.
Yes, this flows from the discussion with Tommy, and invalidates my earlier arguments. This makes me wonder how much we need to preserve the
no, the opposite, however...
This is mistaken, if my understanding of the logging pipeline is right. @thtruo or @logston should be able to confirm this: I believe that the cluster ID of the underlying storage cluster is added to telemetry messages alongside whichever value crdb has chosen. (This addition is a feature I requested a few months ago, due to a bug in crdb: it's possible for some log messages to fail to include a cluster ID when they originate in crdb, so we really appreciate that the log collector re-adds it regardless of what crdb does.)
@dt in light of these findings, how much remains on this PR?
knz
left a comment
Ok so as per my tweaks to the code:
- the fields "storage cluster ID" and "logical cluster ID" are now properly named throughout the code, especially in the locations where both are important side-by-side.
- the diagnostic upload via HTTP (not telemetry logging) now uses the storage-level cluster ID in the uuid query parameter, and the per-tenant cluster ID in the (new) logical_uuid query parameter. I am not sure the PM team actually cares about this (I don't think we do HTTP diagnostics reports for multitenant) so I won't make more noise on this topic.
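As a rough illustration of that second point, a diagnostics upload URL carrying both query parameters could be built as below. Only the uuid and logical_uuid parameter names come from the comment above; the base URL, helper name, and example IDs are hypothetical:

```go
package main

import (
	"fmt"
	"net/url"
)

// buildReportURL sketches how a diagnostics report URL might carry both IDs:
// the storage-level cluster ID in `uuid` and the per-tenant (logical) cluster
// ID in `logical_uuid`. This is not CockroachDB's actual reporting code.
func buildReportURL(base, storageClusterID, logicalClusterID string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	q := u.Query()
	q.Set("uuid", storageClusterID)         // storage cluster ID
	q.Set("logical_uuid", logicalClusterID) // tenant-specific logical ID
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	s, err := buildReportURL("https://register.example.com/api/report",
		"storage-cluster-uuid", "tenant-logical-uuid")
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```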
Reviewed 1 of 3 files at r4, 40 of 48 files at r5, 17 of 17 files at r6, 5 of 5 files at r7, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @RaduBerinde and @stevendanna)
Prior to this patch, in a multi-tenant deployment the ClusterID value visible to tenants was the same as the ClusterID for the storage cluster.

This original design was defective because it prevented the backup/restore subsystem from distinguishing backups made for different tenants.

In this commit, we define a new *logical* ClusterID for tenants: a new UUID that is the storage cluster's ClusterID with a hash of the tenant's ID added to the lower 64 bits. This makes it so that two tenants on the same storage cluster will observe different values when calling ClusterID().

Release note (bug fix): Separate tenants now use unique values for the internal field ClusterID. Previously, separate tenants that were hosted on the same storage cluster would be assigned the same ClusterID.
all right, this seems good to go now. bors r+
@dt I assume we want this backported to 22.1?
Build succeeded:
Yep, that'd be my vote (the linked issue is marked GA blocker)
Also, thank you for taking this over!!!
blathers backport 22.1
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool.
error creating merge commit from 0dc2929 to blathers/backport-release-22.1-74005: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool.
Backport to branch 22.1 failed. See errors above.
🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.
79622: roachtest: update libpq test r=otan a=rafiss
fixes #79570 This allows the query cancel tests to pass. Release note: None

79632: cli: cosmetic change in the output of `start` r=dt a=knz
This fixes a cosmetic bug that made its way in the previous change to `reportServerInfo`. (#74005) Release note: None

Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com>
Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
Fixes #74480.
Prior to this patch, in a multi-tenant deployment the ClusterID value
visible to tenants was the same as the ClusterID for the storage
cluster.
This original design was defective because it prevented the
backup/restore subsystem from distinguishing backups made for
different tenants.
In this commit, we change the ClusterID for tenants to become a new
UUID that is the storage cluster's ClusterID with a hash of the
tenant's ID added to the lower 64 bits. This makes it so that two
tenants on the same storage cluster will observe different values when
calling ClusterID().
Release note (bug fix): Separate tenants now use unique values for the
internal field ClusterID. Previously separate tenants that
were hosted on the same storage cluster would be assigned the same
ClusterID.
Jira issue: CRDB-14750
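The derivation in the commit message above can be sketched as follows. The helper name, the choice of FNV-1a, and mixing via XOR are illustrative assumptions, since the message only specifies "a hash of the tenant's ID added to the lower 64 bits":

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

// logicalClusterID sketches deriving a tenant's logical cluster ID: keep the
// storage cluster ID's high 64 bits and mix a hash of the tenant ID into the
// low 64 bits. Not CockroachDB's actual implementation.
func logicalClusterID(storageID [16]byte, tenantID uint64) [16]byte {
	h := fnv.New64a()
	var buf [8]byte
	binary.BigEndian.PutUint64(buf[:], tenantID)
	h.Write(buf[:])

	out := storageID
	low := binary.BigEndian.Uint64(storageID[8:])
	binary.BigEndian.PutUint64(out[8:], low^h.Sum64()) // hash into lower 64 bits
	return out
}

func main() {
	var storage [16]byte
	storage[0] = 0x8d // pretend storage cluster UUID
	a := logicalClusterID(storage, 2)
	b := logicalClusterID(storage, 3)
	fmt.Println(a != b) // prints true: distinct tenants observe distinct ClusterIDs
}
```

Because only the low 64 bits are perturbed, the high bits still identify the host cluster, which is what lets a reader recover the host from a logical ID "by looking at the high bits" as suggested earlier in the thread.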