base,server,sql: merge node and SQL instance ID containers#73156
base,server,sql: merge node and SQL instance ID containers#73156craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
ee9cb8a to
d44fa97
Compare
d44fa97 to
8dd1d82
Compare
8dd1d82 to
d828b01
Compare
d828b01 to
daefceb
Compare
RaduBerinde
left a comment
There was a problem hiding this comment.
It would be useful to mark (with TODOs) the parts of the API that are meant to be temporary (and not part of the desired end-state).
CC @tbg too
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @knz and @rimadeodhar)
pkg/base/node_id.go, line 26 at r1 (raw file):
) // NodeIDContainer is used to share a single roachpb.NodeID instance between
[nit] NodeID or SQLInstanceID
pkg/base/node_id.go, line 41 at r1 (raw file):
str atomic.Value // sqlInstance is set to true when the node is a SQL instance.
[nit] I'd move this next to nodeID to maintain the size of the struct.
Btw, don't we want to be able to set both atomically? Maybe we should just switch to using negative values for one kind.
pkg/base/node_id.go, line 65 at r1 (raw file):
// // Note that if the container is configured to contain SQL instance IDs, Get still // works: this ensures that the components such as the RPC Context which
The desired end-state is to never convert one type of ID to the other implicitly, correct? If this is temporary, I think there should be a (preferred) GetNodeID() which panics if sqlInstance is true and ConvertToNodeDeprecated() version which doesn't. And there should be a GetSQLInstanceID() which panics if sqlInstance is false.
pkg/base/node_id.go, line 86 at r1 (raw file):
log.Infof(ctx, "ID set to %d", val) } } else if n.sqlInstance != sqlInstance {
This is racy, no? If two goroutines do setInternal(ctx, val, true) in parallel with the same val, that should be allowed I think? One might see the nodeID value before the bool flag is set. Also, accessing the flag without atomics might upset the race detector.
pkg/base/node_id.go, line 193 at r1 (raw file):
// NewSQLIDContainerForNode sets up a SQLIDContainer which serves the underlying // NodeID as the SQL instance ID. func NewSQLIDContainerForNode(nodeID *NodeIDContainer) *SQLIDContainer {
I don't get this API. We are interpreting a NodeIDContainer as a sql ID, but we expressly disallow it from storing a sqlInstance. Feels like it should be the opposite. I (as a user of this API) shouldn't be able to ever legally hold a SQLIDContainer that has a node ID. And I shouldn't be able to alias it to a NodeIDContainer that I can later set to a node ID (and not a sql instance ID).
pkg/base/node_id.go, line 204 at r1 (raw file):
// container will report the SQL instance ID value as NodeID via // its Get() method. func (n *NodeIDContainer) SwitchToSQLIDContainer() *SQLIDContainer {
It looks like this is only used on zero values for the receiver, can't we just make it a constructor?
daefceb to
1706db5
Compare
knz
left a comment
There was a problem hiding this comment.
mark (with TODOs) the parts of the API that are meant to be temporary (and not part of the desired end-state).
I don't see anything temporary here.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @RaduBerinde and @rimadeodhar)
pkg/base/node_id.go, line 41 at r1 (raw file):
I'd move this next to nodeID to maintain the size of the struct.
Done.
don't we want to be able to set both atomically?
No. The determination of whether the container will be for a node ID or SQL instance ID is done just once, in a non-concurrent context.
The setting of the ID is done later, in a concurrent context.
pkg/base/node_id.go, line 65 at r1 (raw file):
The desired end-state is to never convert one type of ID to the other implicitly, correct
That's incorrect - this is already the case, not a "designed end-state". We never convert.
What is perhaps unclear (and I'm adding one more sentence to the docstring for it) is that the distinction between a node ID and a SQL instance ID does not matter for certain parts of the code (specifically - everything that has to do with server-to-server RPCs) and we are OK using the type "node ID" even when the value is a SQL instance ID. It would be silly to duplicate that code.
pkg/base/node_id.go, line 86 at r1 (raw file):
It's not racy because the value of n.sqlInstance is not written concurrently with the read accesses.
One might see the nodeID value before the bool flag is set
No. The determination of .sqlInstance is done before the IDs get populated.
(I could add an additional bool to the struct to prove this fact with an assertion, if that would assuage you? But don't you trust the race detector, which is not failing here?)
pkg/base/node_id.go, line 193 at r1 (raw file):
We are interpreting a NodeIDContainer as a sql ID, but we expressly disallow it from storing a sqlInstance
You're misreading the assertion: it's there to ensure we're not redundantly call NewSQLIDContainerForNode, then convert the resulting *SQLIDContainer to *NodeIDContainer, then call NewSQLIDContainerForNode again. This is an API sequencing assertion.
I (as a user of this API) shouldn't be able to ever legally hold a SQLIDContainer that has a node ID.
That sounds right, why do you think you would be able to?
I shouldn't be able to alias it to a NodeIDContainer that I can later set to a node ID (and not a sql instance ID).
That's already enforced, by the check n.sqlInstance == sqlInstance in Set().
It's perhaps hard to understand in the context of this PR, but it might be clarified in the next PR that uses it: #73159
pkg/base/node_id.go, line 204 at r1 (raw file):
It looks like this is only used on zero values for the receiver
correct.
can't we just make it a constructor?
No: the NodeIDContainer is already instantiated and already shared by reference in multiple places by the time this method is called.
(Believe me, I tried: #72912 - it's a bottomless pit of yak shaving).
Instead I found this API is simpler and used advantageously here: #73159.
I appreciate that you may subjectively thing "this is code stink, I wish it was different". I would agree. But I am looking for an incremental improvement here, not a multi-week project. Unless you can prove me this can be simplified by a demonstration that costs both you and I less than an hour of work, I suggest we don't discuss this at this time and perhaps file a followup issue instead.
|
pkg/base/node_id.go, line 65 at r1 (raw file): Previously, knz (kena) wrote…
Then this is the part I strongly disagree with. That allows anybody to easily add code that uses the NodeID as a.. node ID and not a "whatever" ID. We are converting between different go types here. There should be a separate type for the code that doesn't care (eg ServerID) from which it wouldn't be legal to get a node or sql instance ID. There wouldn't be duplication, though we'd need to change the type used in various parts of the code. That all doesn't need to be done now, but we need to first agree on the "vision" for the desired end state, and then we need to mark code that isn't "legal" in that end state as temporary. My strong opinion is that the desired end state is where 1) there is a single piece of "setup" code (which isn't part of the infrastructure like id containers) which knows that for the system tenant the SQL instance ID is equal to the node ID and 2) it is never possible to write code (that doesn't involve explicit casts) which can interpret one type of ID as the other. |
|
pkg/base/node_id.go, line 193 at r1 (raw file): Previously, knz (kena) wrote…
I could create a NodeIDContainer, populate it with a node ID, then call this method. Now I have a SQLIDContainer which holds a node ID (or am I missing something?). Maybe we could assert that nodeID contains the zero value? |
|
pkg/base/node_id.go, line 86 at r1 (raw file): Previously, knz (kena) wrote…
I see, I will look again. Do you think it would make things more clear to change the bool to a tri-state (uninitialized, node, sql instance) and only allow state transitions from uninitialized? |
rimadeodhar
left a comment
There was a problem hiding this comment.
Reviewed 6 of 8 files at r1, 2 of 2 files at r2, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @knz and @RaduBerinde)
pkg/base/node_id.go, line 65 at r1 (raw file):
Previously, RaduBerinde wrote…
Then this is the part I strongly disagree with. That allows anybody to easily add code that uses the NodeID as a.. node ID and not a "whatever" ID. We are converting between different go types here. There should be a separate type for the code that doesn't care (eg ServerID) from which it wouldn't be legal to get a node or sql instance ID. There wouldn't be duplication, though we'd need to change the type used in various parts of the code.
That all doesn't need to be done now, but we need to first agree on the "vision" for the desired end state, and then we need to mark code that isn't "legal" in that end state as temporary. My strong opinion is that the desired end state is where 1) there is a single piece of "setup" code (which isn't part of the infrastructure like id containers) which knows that for the system tenant the SQL instance ID is equal to the node ID and 2) it is never possible to write code (that doesn't involve explicit casts) which can interpret one type of ID as the other.
+1 to Radu's comment. I think it is important to keep the semantic differences between a node and a SQL instance ID very obvious with the unified API. It would be nice to have something along the lines of GetNodeID() to explicitly fetch node id which will throw if the container is a sql id container and vice versa. And for code that doesn't care either which way, we can have a GetID or something similar to make it very obvious that the caller for this method should be agnostic to the container type.
pkg/base/node_id.go, line 30 at r2 (raw file):
// and getting the value. Once a value is set, the value cannot // change. type NodeIDContainer struct {
Nit: It will be nice to rename it at some point to IDContainer instead of NodeIDContainer since we will be using it to store either SQL instance ID or node ID. Also, nodeID below can be renamed to just id.
knz
left a comment
There was a problem hiding this comment.
All right in light of the new RFC in #73309, let's accept that this PR is an intermediate state of affairs. I have added references to the RFC as comments in the code that needs to be changed.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @RaduBerinde and @rimadeodhar)
pkg/base/node_id.go, line 65 at r1 (raw file):
Previously, rimadeodhar (Rima Deodhar) wrote…
+1 to Radu's comment. I think it is important to keep the semantic differences between a node and a SQL instance ID very obvious with the unified API. It would be nice to have something along the lines of
GetNodeID()to explicitly fetch node id which will throw if the container is a sql id container and vice versa. And for code that doesn't care either which way, we can have aGetIDor something similar to make it very obvious that the caller for this method should be agnostic to the container type.
Yes we agreed on this offline. Adding a reference to #73309 .
pkg/base/node_id.go, line 86 at r1 (raw file):
Previously, RaduBerinde wrote…
I see, I will look again. Do you think it would make things more clear to change the bool to a tri-state (uninitialized, node, sql instance) and only allow state transitions from uninitialized?
I could do this, but I'd rather use the encoding you proposed in #73309. Let's do so in a separate PR.
pkg/base/node_id.go, line 193 at r1 (raw file):
Previously, RaduBerinde wrote…
I could create a NodeIDContainer, populate it with a node ID, then call this method. Now I have a SQLIDContainer which holds a node ID (or am I missing something?). Maybe we could assert that nodeID contains the zero value?
Good idea. Done.
pkg/base/node_id.go, line 30 at r2 (raw file):
Previously, rimadeodhar (Rima Deodhar) wrote…
Nit: It will be nice to rename it at some point to
IDContainerinstead ofNodeIDContainersince we will be using it to store either SQL instance ID or node ID. Also, nodeID below can be renamed to justid.
Added a reference to #73309 . We'll do this renaming in htat context.
1706db5 to
fa43497
Compare
knz
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @RaduBerinde and @rimadeodhar)
pkg/base/node_id.go, line 193 at r1 (raw file):
Previously, knz (kena) wrote…
Good idea. Done.
Turns out we can't do that - we have multiple tests that manually inject a non-zero value before server initialization.
This commit ensures that the same data structure can store both SQL instance IDs and Node IDs. The decision is made upon instantiation of the container which of the two types of IDs the container stores. This simplifies the code and paves the road to using a single container in servers for the purpose of identifying the server instance in traces and logs. This also revisits the change in cockroachdb#72607 by switching the tracing/logging prefix `sqli` to `nsql` for SQL instance servers. Release note: None
fa43497 to
6d6a241
Compare
Informs #58938.
Based off a comment by Radu here.
This commit ensures that the same data structure can store both SQL
instance IDs and Node IDs. The decision is made upon instantiation of
the container which of the two types of IDs the container stores.
This simplifies the code and paves the road to using a single
container in servers for the purpose of identifying the server
instance in traces and logs.
This also revisits the change in #72607 by switching the
tracing/logging prefix
sqlitonsqlfor SQL instance servers.Release note: None