stability: Support for large transactions #15849
Description
Forked from @andreimatei's comment on #15713:
Unfortunately the "transaction is too large to commit" is by design - currently an EndTransaction carries all the intents, and so we need to be reasonably sure that each replica will have enough memory to process the corresponding Raft command.
This is a topic that I've randomly discussed at the watercooler with the likes of @bdarnell and @tschottdorf at times. Dumping stuff here for institutional memory.
There are various things we could do. One way or another, intents (or at least not all of them) shouldn't have to be attached to an EndTransaction request. They could be stored in the db, and the cleanup that happens after EndTransaction has marked the txn record as committed could retrieve them from there.
For example, they could be stored in the transaction record itself: the TxnCoordSender could keep appending intent spans when it heartbeats the record. That might not be sufficient, since you'll end up with a humongous txn record. What'd be better is to store them adjacent to the txn record, such that the range can be split, etc. But then you probably need to be careful with the orchestration of the cleanup process and with the cleanup of these intent records themselves.
Instead of, or in addition to, storing the intents in the db, the client (the client.Txn layer) could be more involved in the cleanup of such large txns: it could keep the intents in memory and not attach them to EndTransaction, but do the cleanup afterwards. It could keep the intents under a memory budget by having a limited-size interval tree and merging spans when it needs to, trading pessimistic scanning at cleanup time for memory at execution time. This would probably work great for cases like the one here (I assume): copying a table.
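To make the memory-budget idea concrete, here is a minimal Go sketch of a bounded span set that merges neighboring spans once it exceeds its budget. Everything in it is an assumption made for illustration: `Span`, `BoundedSpanSet` and their methods are hypothetical names rather than existing CockroachDB types, the budget is a span count rather than bytes, a sorted slice stands in for the interval tree, and "closest" is approximated by the longest common prefix between one span's end key and the next span's start key.

```go
package main

import (
	"fmt"
	"sort"
)

// Span is a half-open key range [Key, EndKey). Hypothetical, simplified type;
// it is not CockroachDB's actual span type.
type Span struct {
	Key, EndKey string
}

// BoundedSpanSet tracks written spans under a fixed budget (hypothetical name).
// The budget here is a span count; a real one would more likely count bytes.
type BoundedSpanSet struct {
	maxSpans int
	spans    []Span // kept sorted by Key and non-overlapping
}

func NewBoundedSpanSet(maxSpans int) *BoundedSpanSet {
	if maxSpans < 1 {
		maxSpans = 1
	}
	return &BoundedSpanSet{maxSpans: maxSpans}
}

// Add records a span, then merges neighbors until the set fits the budget.
func (s *BoundedSpanSet) Add(sp Span) {
	s.spans = append(s.spans, sp)
	s.coalesce()
	for len(s.spans) > s.maxSpans {
		s.mergeClosestPair()
	}
}

// Spans returns a copy of the tracked spans in key order.
func (s *BoundedSpanSet) Spans() []Span {
	return append([]Span(nil), s.spans...)
}

// coalesce sorts the spans and fuses any that overlap or touch (lossless).
func (s *BoundedSpanSet) coalesce() {
	sort.Slice(s.spans, func(i, j int) bool { return s.spans[i].Key < s.spans[j].Key })
	out := s.spans[:0]
	for _, sp := range s.spans {
		if n := len(out); n > 0 && sp.Key <= out[n-1].EndKey {
			if sp.EndKey > out[n-1].EndKey {
				out[n-1].EndKey = sp.EndKey
			}
			continue
		}
		out = append(out, sp)
	}
	s.spans = out
}

// mergeClosestPair fuses the two neighbors judged closest (lossy): the merged
// span also covers the gap between them, so cleanup scans keys never written.
func (s *BoundedSpanSet) mergeClosestPair() {
	best, bestLen := 0, -1
	for i := 0; i+1 < len(s.spans); i++ {
		if l := commonPrefixLen(s.spans[i].EndKey, s.spans[i+1].Key); l > bestLen {
			best, bestLen = i, l
		}
	}
	s.spans[best].EndKey = s.spans[best+1].EndKey
	s.spans = append(s.spans[:best+1], s.spans[best+2:]...)
}

func commonPrefixLen(a, b string) int {
	n := 0
	for n < len(a) && n < len(b) && a[n] == b[n] {
		n++
	}
	return n
}

func main() {
	set := NewBoundedSpanSet(2)
	set.Add(Span{Key: "a1", EndKey: "a2"})
	set.Add(Span{Key: "a5", EndKey: "a6"})
	set.Add(Span{Key: "k1", EndKey: "k2"})
	fmt.Println(set.Spans()) // prints [{a1 a6} {k1 k2}]
}
```

In this run the third `Add` pushes the set over its budget of two spans, so the two `a…` spans are fused into `[a1, a6)`. Cleanup would then have to scan the unwritten keys between `a2` and `a5`, which is the pessimistic-scanning cost described above.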
An approach like this would probably be made more feasible by the merging of the TxnCoordSender (which currently keeps track of intents) and client.Txn that I'm hoping to do.
@petermattis, I'm assigning back to you... But I will keep an eye on supporting work here when I do TxnCoordSender refactorings.
Incidentally, here's one idea for the codelab we were talking about: implement CREATE TABLE AS, or a version of INSERT INTO... SELECT..., that does some limited version of the above: it doesn't accumulate intents, and it scans the whole table for cleanup.