sql: implement feature parity with Postgres for XA transactions#22359
sql: implement feature parity with Postgres for XA transactions#22359spencerkimball wants to merge 1 commit intocockroachdb:masterfrom
Conversation
|
This change still requires unittests; I've only tested it works manually so far. I would like early feedback on SQL changes as this is the first time I've messed with the grammar or added a plan node. I mostly just cargo culted. |
c2f9b7a to
35fe98c
Compare
35fe98c to
ca95606
Compare
|
Spencer I'm sure this is probably awesome, but unfortunately I won't be able to review in the next two weeks. I'm a bit swamped trying to finish other stuff. If you want it soon, might want to try someone else. Sorry. |
|
fly-by: might wanna mention #22329 Btw, are you sure we want to add this feature now? Was there anything other than that one forum post that made you do this? Seems like a pretty large change... Review status: 0 of 48 files reviewed at latest revision, all discussions resolved, some commit checks failed. Comments from Reviewable |
|
It’s not the first time the question has been asked around XA. If we want to integrate with any externally managed transactions, this is a requirement. In practice this usually means message queues. If you really look at this change, it’s tiny. The bulk is just adding the syntax and a new system table. The pull request here is 10x the actual meaningful changed lines, zero exaggeration. I will add someone else for reviewer. |
andreimatei
left a comment
There was a problem hiding this comment.
One pretty please and one question: please don't make changes to executor.go, I'm trying to get rid of it. At least, make the corresponding changes to conn_fsm.go, which contains the rewritten state machine for a connection.
And the question: does this patch add any protection for prepared txns so their intents can't be pushed? I don't see it, but I guess it should?
|
If you do modify Another random thought - I guess we need to fail the prepare if the txn has queued any schema changes. |
|
@andreimatei it's not clear to me how to add the changes to Also, yes, this provides protection against the intents being pushed. There's a I'm going to have to check in with you when I get a free moment this week to see how to accommodate your suggestion. Added unittests and a mechanism to update the |
ca95606 to
caf1b7f
Compare
|
This doesn't just need unittests - we need to test that it can work in an actual distributed transaction with another system (Ideally in an automated way, but manually might be OK as a stopgap as long as we document how to repeat it). This opens up a lot of compatibility surface area, which our experience with ORM compatibility suggests will take a lot of tweaking. The x/open spec has a lot of rules around when the RM is allowed to forget about a transaction (even after it's been aborted), and about the error codes that are used (the "heuristic commit/abort" cases are supposed to be distinguishable from other outcomes). The new system table containing prepared transactions could grow quite large; I think this may have the potential to be the largest thing we load up in a virtual table, which could lead to memory problems. I'm -1 on implementing this in the near future. Demand for this feature is low (AFAICT), and the testing/support burden is high. |
|
I spent some time looking through the postgres code base today to understand what kind of errors they're returning on various error conditions, like heuristic abort/commits. There are no specific errors being returned. Basically the conclusion I've reached is that the X/Open specification is mostly ignored by Postgres' implementation and that's not as big a deal as it might sound like because in practice, transaction managers merely rely on success or failure responses to the first and second commit phases. A failure on the first phase means the TM won't move to the second phase. A failure on the second phase is assumed to mean the resource manager independently finalized its local txn state and the TM will return a significant failure. Postgres, for example, does not handle requests by the TM to forget prepared transactions. Even if we were to follow the X/Open specification exactly, it wouldn't do any good because the postgres XA drivers that I've looked at don't use any of the extra stuff. My thought was that the prepared transactions table would in fact not get large – at least no larger than an order of magnitude more than the peak number of concurrent distributed transactions in the system. Having these things hang around is bad news and limiting their lifetimes is the main concern of transaction managers. If the TM is in recovery, it also isn't preparing new transactions, so they don't grow without bound just because a TM is offline. Maybe that's hopelessly naive. My larger point for doing this is that it seemed to fit nicely with our existing transaction model, the whole thing is fresh in my head, and it's been requested now three times that I've seen. I agree it needs testing with an actual transaction manager. |
28a3135 to
03eb948
Compare
New syntax for `PREPARE TRANSACTION <gid>`, `COMMIT PREPARED <tid>`, and `ROLLBACK PREPARED <tid>`, for parity with Postgres support. Added a new `system.prepared_xacts` virtual table as well as a `pg_catalog.pg_prepared_xacts`, for compatibility. When a transaction is prepared, we send an `EndTransactionRequest` as usual, but a new flag indicates the transaction should be moved to status `PREPARED` instad of `COMMITTED`. In state `PREPARED`, a transaction record is updated to include all of its intents, and awaits a subsequent `EndTransactionRequest` which will move the transaction either to `COMMITTED` or `ABORTED`. Transactions in state `PREPARED` are given an expiration of `1h` by default. This setting is controlled by `storage.prepared_txn.timeout`. Release note (sql change): XA transaction support allows CockroachDB to participate in distributed transaction with other resources (e.g. databases, message queues, etc) using a two phase commit protocol.
03eb948 to
93053a6
Compare
|
|
|
Any updates on further testing? |
New syntax for
PREPARE TRANSACTION <gid>,COMMIT PREPARED <tid>,and
ROLLBACK PREPARED <tid>, for parity with Postgres support.Added a new
system.prepared_xactsvirtual table as well as apg_catalog.pg_prepared_xacts, for compatibility.When a transaction is prepared, we send an
EndTransactionRequestas usual, but a new flag indicates the transaction should be moved
to status
PREPAREDinstad ofCOMMITTED. In statePREPARED, atransaction record is updated to include all of its intents, and
awaits a subsequent
EndTransactionRequestwhich will move thetransaction either to
COMMITTEDorABORTED. Transaction in statePREPAREDare given an expiration of1hby default. This settingis controlled by
storage.prepared_txn.timeout.Release note (sql change): XA transaction support allows CockroachDB to
participate in distributed transaction with other resources (e.g.
databases, message queues, etc) using a two phase commit protocol.