server: hash tenantID into ClusterID used in SQL exec #74005
craig[bot] merged 1 commit into cockroachdb:master
Conversation
I looked at who all uses ClusterID in pkg/sql and it looks like there are really just two classes of use: case (a) is indifferent to what the value is -- it just shows it to you. Looking at pkg/ccl, it has a bunch more cases of (b), and then there's backup's case, where we really do need the semantics described here, i.e. no two tenants ever have the same value.
knz
left a comment
Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt and @RaduBerinde)
pkg/server/server_sql.go, line 528 at r1 (raw file):
// hash of tenant ID achieves this.
if !codec.ForSystemTenant() {
	clusterIDForSQL = &base.ClusterIDContainer{}
Can you change this mechanism to switch to writing to cfg.ClusterIDContainer, thanks
knz
left a comment
It also might be worth adding a unit test that checks that for a single host cluster with two tenants, all 3 things (system tenant plus each of the two non-system tenants) see different cluster IDs.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt and @RaduBerinde)
knz
left a comment
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt and @RaduBerinde)
pkg/server/server_sql.go, line 528 at r1 (raw file):
Previously, knz (kena) wrote…
Can you change this mechanism to switch to writing to
cfg.ClusterIDContainer, thanks
Hm actually, can we talk about this, I don't exactly understand what is going on here.
dt
left a comment
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz and @RaduBerinde)
pkg/server/server_sql.go, line 528 at r1 (raw file):
Previously, knz (kena) wrote…
Hm actually, can we talk about this, I don't exactly understand what is going on here.
I expanded the comment a bit as a starting point, since if it isn't clear to someone who talked it through previously from the diff, then I clearly needed to do more for future readers too. Does the expanded comment help at all?
knz
left a comment
Reviewed all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt, @knz, and @RaduBerinde)
pkg/server/server_sql.go, line 528 at r1 (raw file):
Previously, dt (David Taylor) wrote…
I expanded the comment a bit as a starting point, since if it isn't clear to someone who talked it through previously from the diff, then I clearly needed to do more for future readers too. Does the expanded comment help at all?
yep that is good, thanks
pkg/server/server_sql.go, line 525 at r2 (raw file):
// clusterIDForSQL is the cluster ID that SQL processes use.
//
// Not that this is separate from, but sometimes pointed at, the ClusterID in
nit: Note
pkg/server/server_sql.go, line 539 at r2 (raw file):
// interacting with processes on the same cluster or not. For example, when a
// BACKUP job resumes writing to some external files, it is vital that be able
// to verify that the data in those files what wriitten by the same cluster,
nit: written
pkg/server/server_sql.go, line 545 at r2 (raw file):
// Thus for the non-system tenant, this ClusterID is derived from the unique
// host cluster's ID, but with the TenantID hased into its lower bits to make
// an ID that is unique to this tenant on this host cluster.
Can you also add:
Note: we are careful not to populate this per-tenant cluster ID into the
ClusterIDContainer referenced to by cfg.rpcContext, which is also the same
one referenced to by cfg.ClusterIDContainer, cfg.idProvider and several other places.
There are two reasons for this. The main reason is that these references to the
host cluster's cluster ID are used in logging and telemetry to ensure that
log/telemetry entries from tenants connected to the same host cluster
are identified as such, using a common cluster ID.
And I'd like to see these various assertions exercised in unit tests:
- check that the SQL idea of cluster ID indeed ends up different
- check that cluster IDs included in the log entries (esp with output format JSON) remain the same
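The separation being requested here -- keeping the per-tenant SQL cluster ID in its own container, out of the shared container that logging and the rpcContext read -- can be sketched roughly as follows. The container type and the ID strings are illustrative stand-ins for `base.ClusterIDContainer` and real UUIDs, not CockroachDB's actual code:

```go
package main

import "fmt"

// clusterIDContainer is a stand-in for base.ClusterIDContainer.
type clusterIDContainer struct{ id string }

func (c *clusterIDContainer) Set(id string) { c.id = id }
func (c *clusterIDContainer) Get() string   { return c.id }

func main() {
	// Shared container: referenced by cfg.rpcContext, logging, telemetry.
	hostContainer := &clusterIDContainer{}
	hostContainer.Set("host-cluster-uuid")

	// Non-system tenant: a *separate* container holds SQL's cluster ID,
	// so writing the tenant-derived value does not leak into log entries.
	sqlContainer := &clusterIDContainer{}
	sqlContainer.Set("tenant-derived-uuid")

	fmt.Println(sqlContainer.Get())  // SQL sees the tenant-unique ID
	fmt.Println(hostContainer.Get()) // log entries still carry the host cluster ID
}
```

The unit tests suggested above would assert both properties: the SQL-visible IDs differ per tenant, while the ID stamped on log entries stays the host cluster's.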
@dt: can you file an issue and link it from the PR, then ping @mwang1026 on it? Michael and Jon need it for accounting purposes.
dt
left a comment
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt, @knz, and @RaduBerinde)
pkg/server/server_sql.go, line 545 at r2 (raw file):
main reason is that these references to the host cluster's cluster ID are used in logging and telemetry
er, I think the main reason is that that rpcContext is what we use to make RPCs to the host cluster, so it needs to have a matching cluster ID in it. any logging / telemetry usage is secondary.
dt
left a comment
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt, @knz, and @RaduBerinde)
pkg/server/server_sql.go, line 528 at r1 (raw file):
Previously, knz (kena) wrote…
yep that is good, thanks
Done.
pkg/server/server_sql.go, line 545 at r2 (raw file):
check that cluster IDs included in the log entries (esp with output format JSON) remain the same
Hm, I think we should log them both, separately. The tenant cluster -- that is, the sql pods -- have their cluster ID, and then they're hosted by/connected to some kv host cluster to provide their storage. We should log both of those, not just the latter.
knz
left a comment
Reviewed 1 of 2 files at r2, 3 of 3 files at r3, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt and @RaduBerinde)
pkg/server/server_sql.go, line 545 at r2 (raw file):
I think the main reason is that that rpcContext is what we use to make RPCs to the host cluster
Yes you're right. Note I wrote "there are two reasons" but I only wrote one. You can insert the other first :)
I think we should log them both, separately
What you did for the messages at server startup is a-ok 💯
However this comment here (and the reason for me to insist this be documented) is that separately we also log the host cluster ID on every log entry, all the time for the purpose of routing log data to the right collection point. That value needs to be correct and not tweaked. And that's also what deserves being validated in a unit test.
Maybe you can phrase this better than I did in the comment?
Do you need help with moving this forward?
Yeah, I had to push this to the back burner for some more urgent bug fixes and backports over the last couple weeks, so I haven't gotten around to writing the relevant tests.

One thing to note: this currently does change the cluster ID reported by a pod when it sends its usage telemetry, to the new tenant-unique cluster ID. I'm of two minds on this. On the one hand, I think this is Good/Ideal/Correct from an ideological point. Tenant A and Tenant B are two distinct clusters from the point of view of using sql features, making tables, running types of queries, etc., and being on the same host cluster or not is irrelevant to that, so when reporting and collecting telemetry on that sql usage, reporting and tracking it under their separate cluster IDs, the same as if they were two separate dedicated clusters, seems most correct to me. If you want to know how many clusters have more than 1000 tables, counting up the distinct cluster IDs that reported >1k table count should tell you the right answer regardless of how many were dedicated, serverless on separate hosts, or serverless but on the same host. Now of course we might also want to look for trends correlated to host clusters -- does a certain host cluster skew some reported metric or something -- so we'd also want to include a new, separate hostClusterID field from serverless clusters, but from a principled view on what a unique cluster is, I think I like each tenant sending its telemetry using its individual cluster ID as its "cluster".

On the other hand, the existing reporting has already been extended to handle distinct serverless clusters not reporting under distinct cluster IDs, and instead has started treating the pair of
let's ask @thtruo what a good design would be here.
Discussed with @thtruo. If that is correct, then we're in the clear with making the cluster ID work the way you've defined here.
Thanks for resolving that! I feel like there's a similar argument to the one I made above to be made about the cluster ID in structured logging messages as well -- that each serverless tenant cluster should emit log messages tagged with its own unique cluster ID as the cluster, rather than the cluster ID of the host cluster it is attached to (which could go in a separate field, or be determined by looking at the high bits), but I see you implied otherwise in earlier comments here. I don't think I feel strongly enough to argue the point though, so I think I'd like to also leave that determination to server folks.
Quick question here for my clarity: Is the proposed solution then to have clusterId be the same as the unique id used across the control plane and in crdb intrusion tables? Or would this hash of clusterId and tenantId result in a new id? If it's the latter, then telemetry logging will not be able to link events to specific clusters.
Yes, this flows from the discussion with Tommy, and invalidates my earlier arguments. This makes me wonder how much we need to preserve the
no, the opposite, however...
This is mistaken, if my understanding of the logging pipeline is right. @thtruo or @logston should be able to confirm this: I believe that the cluster ID of the underlying storage cluster is added to telemetry messages alongside whichever value crdb has chosen. (This addition is a feature I requested a few months ago, due to a bug in crdb: it's possible for some log messages to fail to include a cluster ID when they originate in crdb, so we really appreciate that the log collector re-adds it regardless of what crdb does.)
@dt in light of these findings, how much remains on this PR?
knz
left a comment
Ok so as per my tweaks to the code:
- the fields "storage cluster ID" and "logical cluster ID" are now properly named throughout the code, especially in the locations where both are important side-by-side.
- the diagnostic upload via HTTP (not telemetry logging) now uses the storage-level cluster ID in the uuid query parameter, and the per-tenant cluster ID in the (new) logical_uuid query parameter. I am not sure the PM team actually cares about this (I don't think we do HTTP diagnostics reports for multitenant) so I won't make more noise on this topic.
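As a rough illustration of that second point, a diagnostics upload URL carrying both query parameters could be built as below. Only the uuid and logical_uuid parameter names come from the comment above; the base URL, helper name, and example IDs are hypothetical:

```go
package main

import (
	"fmt"
	"net/url"
)

// buildReportURL sketches how a diagnostics report URL might carry both IDs:
// the storage-level cluster ID in `uuid` and the per-tenant (logical) cluster
// ID in `logical_uuid`. This is not CockroachDB's actual reporting code.
func buildReportURL(base, storageClusterID, logicalClusterID string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	q := u.Query()
	q.Set("uuid", storageClusterID)         // storage cluster ID
	q.Set("logical_uuid", logicalClusterID) // tenant-specific logical ID
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	s, err := buildReportURL("https://register.example.com/api/report",
		"storage-cluster-uuid", "tenant-logical-uuid")
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```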
Reviewed 1 of 3 files at r4, 40 of 48 files at r5, 17 of 17 files at r6, 5 of 5 files at r7, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @RaduBerinde and @stevendanna)
Prior to this patch, in a multi-tenant deployment the ClusterID value visible to tenants was the same as the ClusterID for the storage cluster.

This original design was defective because it prevented the backup/restore subsystem from distinguishing backups made for different tenants.

In this commit, we define a new *logical* ClusterID for tenants: a new UUID that is the storage cluster's ClusterID with a hash of the tenant's ID added to the lower 64 bits. This makes it so that two tenants on the same storage cluster will observe different values when calling ClusterID().

Release note (bug fix): Separate tenants now use unique values for the internal field ClusterID. Previously, separate tenants that were hosted on the same storage cluster would be assigned the same ClusterID.
all right, this seems good to go now. bors r+
@dt I assume we want this backported to 22.1?
Build succeeded:
Yep, that'd be my vote (the linked issue is marked GA blocker)
Also, thank you for taking this over!!!
blathers backport 22.1
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool.
error creating merge commit from 0dc2929 to blathers/backport-release-22.1-74005: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool.
Backport to branch 22.1 failed. See errors above.
🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.
79622: roachtest: update libpq test r=otan a=rafiss
fixes #79570 This allows the query cancel tests to pass. Release note: None

79632: cli: cosmetic change in the output of `start` r=dt a=knz
This fixes a cosmetic bug that made its way in the previous change to `reportServerInfo`. (#74005) Release note: None

Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com>
Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
Fixes #74480.
Prior to this patch, in a multi-tenant deployment the ClusterID value
visible to tenants was the same as the ClusterID for the storage
cluster.
This original design was defective because it prevented the
backup/restore subsystem from distinguishing backups made for
different tenants.
In this commit, we change the ClusterID for tenants to become a new
UUID that is the storage cluster's ClusterID with a hash of the
tenant's ID added to the lower 64 bits. This makes it so that two
tenants on the same storage cluster will observe different values when
calling ClusterID().
Release note (bug fix): Separate tenants now use unique values for the
internal field ClusterID. Previously separate tenants that
were hosted on the same storage cluster would be assigned the same
ClusterID.
Jira issue: CRDB-14750
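The derivation in the commit message above can be sketched as follows. The helper name, the choice of FNV-1a, and mixing via XOR are illustrative assumptions, since the message only specifies "a hash of the tenant's ID added to the lower 64 bits":

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

// logicalClusterID sketches deriving a tenant's logical cluster ID: keep the
// storage cluster ID's high 64 bits and mix a hash of the tenant ID into the
// low 64 bits. Not CockroachDB's actual implementation.
func logicalClusterID(storageID [16]byte, tenantID uint64) [16]byte {
	h := fnv.New64a()
	var buf [8]byte
	binary.BigEndian.PutUint64(buf[:], tenantID)
	h.Write(buf[:])

	out := storageID
	low := binary.BigEndian.Uint64(storageID[8:])
	binary.BigEndian.PutUint64(out[8:], low^h.Sum64()) // hash into lower 64 bits
	return out
}

func main() {
	var storage [16]byte
	storage[0] = 0x8d // pretend storage cluster UUID
	a := logicalClusterID(storage, 2)
	b := logicalClusterID(storage, 3)
	fmt.Println(a != b) // prints true: distinct tenants observe distinct ClusterIDs
}
```

Because only the low 64 bits are perturbed, the high bits still identify the host cluster, which is what lets a reader recover the host from a logical ID "by looking at the high bits" as suggested earlier in the thread.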