rfc: RFC proposing the logical merger of the client.Txn and TxnCoordSender#16000
rfc: RFC proposing the logical merger of the client.Txn and TxnCoordSender#16000andreimatei wants to merge 1 commit intocockroachdb:masterfrom
Conversation
|
General comment: the lack of wrapping here makes it pretty hard to read in e.g. reviewable. Looks good to me, though I'm not sure what the point of this RFC is - this doesn't seem like a contentious issue, and the RFC is mostly laying out the problem rather than proposing a specific solution. Reviewed 1 of 1 files at r1. docs/RFCS/client.Txn-TxnCoordSender_split.md, line 65 at r1 (raw file):
I don't understand why this is a problem - this client will become its own transaction coordinator. Earlier in the text you mention that we didn't do this in the KV era because we'd have to maintain many implementations (one per language) but we already have the Go implementation (which will subsume the coordinator), and it seems OK to use it, even outside of Cockroach, if we decide to permit it (though I'm not advocating for that here, of course). Comments from Reviewable |
db6f5e3 to
49f9821
Compare
|
wrapped everything. The point is that @bdarnell in the past showed reluctance to uniting these two layers. This is to cover my ass. Review status: 0 of 1 files reviewed at latest revision, 1 unresolved discussion. Comments from Reviewable |
|
Reviewed 1 of 1 files at r2. Comments from Reviewable |
|
I'm in favor generally, though I agree with @tamird that the actual proposal is light on details. Perhaps you could make this more concrete:
Review status: all files reviewed at latest revision, 2 unresolved discussions, all commit checks successful. docs/RFCS/client.Txn-TxnCoordSender_split.md, line 65 at r1 (raw file): Previously, tamird (Tamir Duberstein) wrote…
Yeah, I also think that case would be possible but we should just remove these hold-overs. docs/RFCS/client.Txn-TxnCoordSender_split.md, line 6 at r2 (raw file):
Mind adding # here? Comments from Reviewable |
|
Review status: all files reviewed at latest revision, 2 unresolved discussions, all commit checks successful. docs/RFCS/client.Txn-TxnCoordSender_split.md, line 65 at r1 (raw file): Previously, tschottdorf (Tobias Schottdorf) wrote…
What would you replace them with? I believe they're present for cases where we don't have SQL commands for certain operations. Comments from Reviewable |
|
Reviewed 1 of 1 files at r1. docs/RFCS/client.Txn-TxnCoordSender_split.md, line 65 at r1 (raw file): Previously, tamird (Tamir Duberstein) wrote…
Worth making a list of code affected by this, but anything that runs a few simple commands can be done server-side via a new endpoint, and code that acts like a real external client perhaps should just go away. Comments from Reviewable |
| which needs access to some of the `TxnCoordSender` functionality but doesn’t | ||
| actually send requests through it. | ||
|
|
||
| This RFC is about agreeing that the current split is useless and inconvenient, |
There was a problem hiding this comment.
Usually difficulty is solved through further splitting rather than merging as this RFC is suggesting. is it possible for you to consider further splitting things apart and understanding the components better before merging them later if you need to?
49f9821 to
a4acc90
Compare
a4acc90 to
70c7282
Compare
|
I've added a few absorption steps. But... I reserve the right to be compiler-driven at times. Review status: 0 of 1 files reviewed at latest revision, 3 unresolved discussions, some commit checks pending. docs/RFCS/client.Txn-TxnCoordSender_split.md, line 65 at r1 (raw file): Previously, tschottdorf (Tobias Schottdorf) wrote…
I'm less convinced that having a "coordinator" outside of the cluster is really feasible. For one, you need to be able to assign timestamps according to a cluster's clock. You also need to interface with a node's metrics to record state transitions. docs/RFCS/client.Txn-TxnCoordSender_split.md, line 6 at r2 (raw file): Previously, tschottdorf (Tobias Schottdorf) wrote…
Done. I wondered about that... Will it do anything? docs/RFCS/client.Txn-TxnCoordSender_split.md, line 18 at r2 (raw file): Previously, vivekmenezes wrote…
Well I do understand the components... The thesis is that the components are in our way; they're giving us headaches and no joy. I don't know what else to do than merge as much as possible. But I've listed a few steps at the end, maybe that helps. Comments from Reviewable |
|
While this is definitely a bit different from a standard RFC proposal, I appreciate having the reasoning for the change clearly documented. Thanks @andreimatei Reviewed 1 of 1 files at r3. docs/RFCS/client.Txn-TxnCoordSender_split.md, line 6 at r2 (raw file): Previously, andreimatei (Andrei Matei) wrote…
Nope, it doesn't do anything. If you want it to be a link you have to manually add the link via markdown docs/RFCS/client.Txn-TxnCoordSender_split.md, line 2 at r3 (raw file):
draft? in-progress? docs/RFCS/client.Txn-TxnCoordSender_split.md, line 26 at r3 (raw file):
Comments from Reviewable |
|
I'm not opposed to this move in principle, I'm just concerned about the cost vs benefit. There's going to be a lot of test cleanup work to get everything fully moved over. I felt like the cleanup work wasn't worth it pre-distsql. Maybe distsql's issues with this interface are reason enough to make the move, but I'm not fully convinced. Is there a useful middle ground? For example, maybe we make Review status: all files reviewed at latest revision, 7 unresolved discussions, all commit checks successful. docs/RFCS/client.Txn-TxnCoordSender_split.md, line 65 at r1 (raw file): Previously, andreimatei (Andrei Matei) wrote…
Yeah, we should rewrite the acceptance tests to use SQL instead of ExternalClient and expose whatever the debugging CLI commands need through either new RPCs or SQL virtual tables. Nothing external to the process should be doing KV transactions. There are also a lot of tests (citation needed) that use transactions via the TxnCoordSender without a client.Txn object. These are in process so there aren't the same issues as ExternalClient, but I think it's still a lot of work to clean it up. docs/RFCS/client.Txn-TxnCoordSender_split.md, line 77 at r2 (raw file):
I'm not sure that we'll ever want to support cancelling individual queries; I think cancellation can always work at the transaction (or session) level. But I agree that it would be cleaner to do the heartbeat at a level that is naturally scoped to the life of the transaction instead of abusing the context passed on the first request. docs/RFCS/client.Txn-TxnCoordSender_split.md, line 106 at r2 (raw file):
But the operations that don't go through TxnCoordSender don't go through client.Txn either, right? So how does merging the two address these issues? Is it just that client.Txn can have a richer interface and let us access its state separately from sending requests through it? Comments from Reviewable |
I like the idea of doing the logical move before the physical one. Review status: all files reviewed at latest revision, 7 unresolved discussions, all commit checks successful. Comments from Reviewable |
|
Reviewed 1 of 1 files at r3. Comments from Reviewable |
Not sure if this is a good idea for production, but alternative impls of this sender is how a bunch of the "read only queries over backups" and "bulk load" prototypes I've built have worked. I'd hate to lose it Review status: all files reviewed at latest revision, 7 unresolved discussions, all commit checks successful. Comments from Reviewable |
|
reminder: don't forget to rebase if you plan to continue work on this, and pick up the new RFC sections + naming convention. |
This PR shows the tooling I used to [stress test the GC queue]. In short, I needed a way to put
large amounts of intents on a single range; I didn't particularly care to do this on a multi-node
cluster, but I needed to do it efficiently for quick turnaround (and also to prevent the GC queue
from cleaning up my garbage faster than I could insert it).
This was also a good opportunity to investigate "better" debugging tools and to revisit the
`ExternalServer` interface, which historically has been the KV store we once wanted to expose to
clients. It has since become internal and is technically slated for removal, but at the same time it
has seen continued use. The reasons for keeping (something like it) are:
1. debug running clusters that are potentially wedged due to invalid KV data. Be able to read
transaction entries and raw KV columns that are unexpected to the SQL layer.
2. in our testing, create problematic conditions that are unattainable by using the public
interfaces (creating artificial GC pressure being one example)
I also think that there's a point to be made to add functionality such as being able to force a
Range to run garbage collection, etc, though that's out of scope here.
In this PR, I've sketched out a TxnCoordSender-level entry point that is tied to a bidirectional
streaming connection. This has the advantage that there is a context available the lifetime of which
is tied to the connection, which means that `TxnCoordSender` can base its transaction heartbeats
based on that (this is not to suggest that we should be running serious transactions through this
interface, but it establishes parity and, assuming that `client.NewSender` went through this
endpoint instead, TxnCoordSender could be simplified to always use the incoming context). There is
more subtlety in this topic since we want to [merge] `TxnCoordSender` and `client.{DB,Txn}` though,
so don't take this as a concrete suggestion.
What's been more immediately useful is a pretty low-level endpoint that allows evaluating a
`BatchRequest` on any given `Replica` (bypassing the command queue, etc) and seeing the results
(more controversially and important for `gcpressurizer` is the ability to *execute* these batches,
something that's quite dangerous in the wrong hands due to the potential of creating inconsistency
and also its insufficient synchronization with splits, etc). I think that's the part worth exploring
since it's a universally useful last resort when things go wrong and visibility into on-disk state
is desired without shutting down the node.
Long story short, I have this code and it's definitely not something to check in, but to discuss.
It'd be nice to programmatically test the GC queue in that way, and perhaps randomly "pollute"
some of our test clusters in ever-escalating ways, to improve their resilience.
[stress test the GC queue]: cockroachdb#9540
[merge]: cockroachdb#16000
|
the work is being done in #28185 |
cc @RaduBerinde @tamird @nvanbenschoten