kv: preemptively refresh transaction timestamps #52884
craig[bot] merged 2 commits into cockroachdb:master
Conversation
Force-pushed from 1ae783e to 924c4ba
Fixes cockroachdb#51294. First two commits from cockroachdb#52884.

This commit updates the txnSpanRefresher to split off EndTxn requests into their own partial batches on auto-retries after successful refreshes as a means of preventing starvation.

This avoids starvation in two ways. First, it helps ensure that we lay down intents if any of the other requests in the batch are writes. Second, it ensures that if any writes are getting pushed due to contention with reads or due to the closed timestamp, they will still succeed and allow the batch to make forward progress. Without this, each retry attempt may get pushed because of writes in the batch and then rejected wholesale when the EndTxn tries to evaluate the pushed batch. When split, the writes will be pushed but succeed, the transaction will be refreshed, and the EndTxn will succeed.

I still need to confirm that this fixes the indefinite stall described [here](https://github.com/cockroachlabs/misc_projects_glenn/tree/master/rw_blockage#implicit-query-hangs--explict-query-works), but I suspect that it will.

Release note (bug fix): A change in v20.1 caused a certain class of bulk UPDATE and DELETE statements to hang indefinitely if run in an implicit transaction. We now break up these statements to avoid starvation and prevent them from hanging indefinitely.
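For intuition, the split described above can be sketched with hypothetical, simplified types; `Request` and `splitEndTxn` are illustrative names, not the actual roachpb/kvcoord API:

```go
package main

import "fmt"

// Request is a toy stand-in for a roachpb request in a batch.
type Request struct {
	Name     string
	IsEndTxn bool
}

// splitEndTxn illustrates the split: everything before the committing
// EndTxn is sent first (so writes get pushed but still succeed), then the
// transaction is refreshed, then the EndTxn is sent on its own.
func splitEndTxn(ba []Request) (prefix, suffix []Request) {
	for i, r := range ba {
		if r.IsEndTxn {
			return ba[:i], ba[i:]
		}
	}
	return ba, nil
}

func main() {
	ba := []Request{{Name: "Put"}, {Name: "Put"}, {Name: "EndTxn", IsEndTxn: true}}
	prefix, suffix := splitEndTxn(ba)
	fmt.Println(len(prefix), len(suffix)) // 2 1
}
```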
andreimatei
left a comment
Good stuff
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @nvanbenschoten)
pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go, line 168 at r2 (raw file):
```go
ba.CanForwardReadTimestamp = sr.canForwardReadTimestampWithoutRefresh(ba.Txn)
// If we know that the transaction will need a refresh at some point because
```
As long as we're writing a long comment here (which I hope can be hidden inside maybeRefreshPreemptively(); see the following comment), mind also putting a TODO(andrei) explaining that whether or not we can still auto-retry at the SQL level should also play a role in deciding whether we want to refresh eagerly or not.
pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go, line 395 at r2 (raw file):
```go
	retryErr = preemptiveWriteTooOldErr
}
pErr := roachpb.NewErrorWithTxn(retryErr, ba.Txn)
```
I find it weird that we're creating this error here before we know that we need it (although I appreciate that we don't allocate). I'm also bothered by the fact that only a small part of CanTransactionRetryAtRefreshedTimestamp() seems to apply; that function doesn't seem to fit well here. For one, the no-op handling of the synthetic WriteTooOldError makes the timestamp flow hard to follow I think. I don't think this function should care about "write too old" at all.
I also find the splitting of responsibilities between maybeRefreshPreemptively and its caller pretty surprising: the caller checks that we'd want to refresh, and then this guy checks whether we can refresh.
I would try to do a Family Guy window blinds pulling a bit to see if it'd look better with this function taking on the checking of whether we want to refresh, and on the other hand I'd leave the error creation to the caller. This method could return OK or CommitHopelessBecauseRefreshIsNeededButImpossible. Or I dunno, I'm less sure about where the error creation should go, but one way or another I do think it should go after tryUpdatingTxnSpans returns false.
And then we don't have to argue about whether this function should be called maybe or try or maybeTry.
pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go, line 426 at r2 (raw file):
```go
defer func() {
	if ok {
		sr.refreshSuccess.Inc(1)
```
I don't think we should increment this when there was nothing to refresh
pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher_test.go, line 379 at r2 (raw file):
```go
// it observes a batch containing an EndTxn request that will necessarily throw
// a serializable error.
func TestTxnSpanrefresherPreemptiveRefresh(t *testing.T) {
```
s/Spanrefresher/SpanRefresher
pkg/kv/kvclient/kvcoord/txn_metrics.go, line 76 at r2 (raw file):
```go
metaRefreshSuccess = metric.Metadata{
	Name: "txn.refresh.success",
	Help: "Number of successful refreshes",
```
pls add to the help to link to txn.refresh.auto_retries and explain that some refreshes are preemptive and don't lead to a retry.
pkg/kv/kvclient/kvcoord/txn_metrics.go, line 105 at r2 (raw file):
```go
metaRefreshAutoRetries = metric.Metadata{
	Name: "txn.refresh.auto_retries",
	Help: "Number of batch retries after successful refreshes",
```
nit: s/batch/request
Force-pushed from 924c4ba to e6ef4f5
nvb
left a comment
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @andreimatei)
pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go, line 168 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
As long as we're writing a long comment here (which I hope can be hidden inside maybeRefreshPreemptively(); see the following comment), mind also putting a TODO(andrei) explaining that whether or not we can still auto-retry at the SQL level should also play a role in deciding whether we want to refresh eagerly or not.
Done.
pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go, line 395 at r2 (raw file):
For one, the no-op handling of the synthetic WriteTooOldError makes the timestamp flow hard to follow I think. I don't think this function should care about "write too old" at all.
I don't quite understand what you mean here. This caller is guaranteed to hit the TransactionRetryError case, not the WriteTooOldError case.
I also find the splitting of responsibilities between maybeRefreshPreemptively and its caller pretty surprising: the caller checks that we'd want to refresh, and then this guy checks whether we can refresh.
I would try to do a Family Guy window blinds pulling a bit to see if it'd look better with this function taking on the checking of whether we want to refresh, and on the other hand I'd leave the error creation to the caller. This method could return OK or CommitHopelessBecauseRefreshIsNeededButImpossible. Or I dunno, I'm less sure about where the error creation should go, but one way or another I do think it should go after tryUpdatingTxnSpans returns false.
And then we don't have to argue about whether this function should be called maybe or try or maybeTry.
Done.
pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go, line 426 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
I don't think we should increment this when there was nothing to refresh
We were before. You think we should change behavior? I tend to disagree because this still fits the definition of a "refresh". If we want to see how many refresh requests were sent, we have a metric for that too.
pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher_test.go, line 379 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
s/Spanrefresher/SpanRefresher
Done.
pkg/kv/kvclient/kvcoord/txn_metrics.go, line 76 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
pls add to the help to link to txn.refresh.auto_retries and explain that some refreshes are preemptive and don't lead to a retry.
Done.
pkg/kv/kvclient/kvcoord/txn_metrics.go, line 105 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
nit: s/batch/request
Done.
andreimatei
left a comment
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @nvanbenschoten)
pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go, line 395 at r2 (raw file):
For one, the no-op handling of the synthetic WriteTooOldError makes the timestamp flow hard to follow I think. I don't think this function should care about "write too old" at all.
I don't quite understand what you mean here. This caller is guaranteed to hit the TransactionRetryError case, not the WriteTooOldError case.
Then I'm even more bothered by calling that fat function. But now that I look more, I think the preemptiveWriteTooOldErr can't be triggered because this interceptor terminates the WriteTooOld flag. Let's just assert that ba.Txn.WriteTooOld is not set, like we do in sendLockedWithRefreshAttempts.
Even if we get rid of preemptiveWriteTooOldErr, I'd then still prefer to get rid of preemptiveSerializableErr too; I would break CanTransactionRetryAtRefreshedTimestamp() into two: an inner function that takes a retriable error and gives you back the timestamp that the error wants the client to retry at, and an outer function that takes an error and a timestamp and prepares the transaction for retrying at that given timestamp (i.e.
```go
func prepareForRetry(txn, timestamp) (bool, txn) {
	if txn.CommitTimestampFixed {
		return false, txn
	}
	newTxn := txn.Clone()
	newTxn.WriteTimestamp.Forward(timestamp)
	newTxn.ReadTimestamp.Forward(newTxn.WriteTimestamp)
	newTxn.WriteTooOld = false
	return true, newTxn
}
```
maybeRefreshPreemptively() would then do
txn, ok := prepareForRetry(txn, ba.Txn.WriteTime)
The handling of write timestamp is just too obfuscated through the pErr := roachpb.NewErrorWithTxn(retryErr, ba.Txn) line, for no good reason IMO.
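A runnable, self-contained version of the prepareForRetry split suggested above might look like the following. This is a sketch under simplifying assumptions: int64 timestamps stand in for hlc.Timestamp, a value copy stands in for Clone(), and the types are illustrative rather than the actual roachpb code.

```go
package main

import "fmt"

// txn is a toy stand-in for roachpb.Transaction.
type txn struct {
	ReadTimestamp        int64
	WriteTimestamp       int64
	WriteTooOld          bool
	CommitTimestampFixed bool
}

// prepareForRetry forwards the transaction's timestamps to the retry
// timestamp and clears the WriteTooOld flag, refusing if the commit
// timestamp can no longer move.
func prepareForRetry(t txn, timestamp int64) (txn, bool) {
	if t.CommitTimestampFixed {
		return t, false
	}
	newTxn := t // value copy plays the role of Clone()
	if timestamp > newTxn.WriteTimestamp {
		newTxn.WriteTimestamp = timestamp
	}
	if newTxn.WriteTimestamp > newTxn.ReadTimestamp {
		newTxn.ReadTimestamp = newTxn.WriteTimestamp
	}
	newTxn.WriteTooOld = false
	return newTxn, true
}

func main() {
	newTxn, ok := prepareForRetry(txn{ReadTimestamp: 5, WriteTimestamp: 8, WriteTooOld: true}, 12)
	fmt.Println(ok, newTxn.ReadTimestamp, newTxn.WriteTimestamp, newTxn.WriteTooOld) // true 12 12 false
}
```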
pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go, line 426 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
We were before. You think we should change behavior? I tend to disagree because this still fits the definition of a "refresh". If we want to see how many refresh requests were sent, we have a metric for that too.
ok
Force-pushed from e6ef4f5 to 92ee907
nvb
left a comment
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @andreimatei)
pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go, line 395 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
For one, the no-op handling of the synthetic WriteTooOldError makes the timestamp flow hard to follow I think. I don't think this function should care about "write too old" at all.
I don't quite understand what you mean here. This caller is guaranteed to hit the TransactionRetryError case, not the WriteTooOldError case.
Then I'm even more bothered by calling that fat function. But now that I look more, I think the preemptiveWriteTooOldErr can't be triggered because this interceptor terminates the WriteTooOld flag. Let's just assert that ba.Txn.WriteTooOld is not set, like we do in sendLockedWithRefreshAttempts.
Even if we get rid of preemptiveWriteTooOldErr, I'd then still prefer to get rid of preemptiveSerializableErr too; I would break CanTransactionRetryAtRefreshedTimestamp() into two: an inner function that takes a retriable error and gives you back the timestamp that the error wants the client to retry at, and an outer function that takes an error and a timestamp and prepares the transaction for retrying at that given timestamp (i.e.

```go
func prepareForRetry(txn, timestamp) (bool, txn)
if txn.CommitTimestampFixed {
	return false
}
newTxn := txn.Clone()
newTxn.WriteTimestamp.Forward(timestamp)
newTxn.ReadTimestamp.Forward(newTxn.WriteTimestamp)
newTxn.WriteTooOld = false
```

maybeRefreshPreemptively() would then do

```go
txn, ok := prepareForRetry(txn, ba.Txn.WriteTime)
```

The handling of write timestamp is just too obfuscated through the pErr := roachpb.NewErrorWithTxn(retryErr, ba.Txn) line, for no good reason IMO.
Ok, I did pretty much exactly what you're suggesting here. I think you'll like the result. PTAL.
And if so, do you mind giving the third commit in #52885 a look while you have this paged in? That needs to land before the stability period.
Force-pushed from 92ee907 to dab1090
andreimatei
left a comment
LGTM thanks
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @andreimatei)
bors r+
Force-pushed from dab1090 to 1d461ba
Canceled.

bors r+
The field was replaced with the more general BatchRequest.CanForwardReadTimestamp in 972915d. This commit completes the migration to remove the old flag, which is possible now that we're building towards v20.2.
Relates to cockroachdb#51294.

This commit adds logic to the txnSpanRefresher to preemptively perform refreshes before issuing requests when doing so is guaranteed to be beneficial. We perform a preemptive refresh if either a) doing so would be free because we have not yet accumulated any refresh spans, or b) the batch contains a committing EndTxn request that we know will be rejected if issued.

The first case is straightforward. If the transaction has yet to perform any reads but has had its write timestamp bumped, refreshing is a trivial no-op. In this case, refreshing eagerly prevents the transaction from performing any future reads at its current read timestamp. Not doing so preemptively guarantees that we will need to perform a real refresh in the future if the transaction ever performs a read. At best, this would be wasted work. At worst, this could result in the future refresh failing. So we might as well refresh preemptively while doing so is free.

Note that this first case does NOT obviate the need for server-side refreshes. Notably, a transaction's write timestamp might be bumped in the same batch in which it performs its first read. In such cases, a preemptive refresh would not be needed but a reactive refresh would not be a trivial no-op. These situations are common for one-phase commit transactions.

The second case is more complex. If the batch contains a committing EndTxn request that we know will need a refresh, we don't want to bother issuing it just for it to be rejected. Instead, we preemptively refresh before issuing the EndTxn batch. If we view reads as acquiring a form of optimistic read locks under an optimistic concurrency control scheme (as is discussed in the comment on txnSpanRefresher), then this preemptive refresh immediately before the EndTxn is synonymous with the "validation" phase of a standard OCC transaction model. However, as an optimization compared to standard OCC, the validation phase is only performed when necessary in CockroachDB (i.e. if the transaction's writes have been pushed to higher timestamps).

This second case will play into the solution for cockroachdb#51294. Now that we perform these preemptive refreshes when possible, we know that it is always the right choice to split off the EndTxn from the rest of the batch during a txnSpanRefresher auto-retry. Without this change, it was unclear whether the first refresh of an EndTxn batch was caused by earlier requests in the transaction or by other requests in the current batch. Now, it is always caused by other requests in the same batch, so it is always clear that we should split the EndTxn from the rest of the batch immediately after the first refresh.

Release note (performance improvement): validation of optimistic reads is now performed earlier in transactions when doing so can save work. This eliminates certain types of transaction retry errors and avoids wasted RPC traffic.
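The two refresh conditions described above can be sketched as a toy decision function. All names and fields here are illustrative assumptions, not the actual txnSpanRefresher code, and simple int64 values stand in for HLC timestamps:

```go
package main

import "fmt"

// txnState is a toy model of the state the span refresher consults.
type txnState struct {
	readTS, writeTS int64 // simplified timestamps
	refreshSpans    int   // read spans accumulated so far
	batchCommits    bool  // does the batch contain a committing EndTxn?
}

// shouldRefreshPreemptively mirrors the two cases: refresh eagerly when it
// is free (no refresh spans to validate) or when the batch commits and
// would be rejected anyway because the write timestamp has been pushed.
func shouldRefreshPreemptively(s txnState) bool {
	if s.writeTS <= s.readTS {
		return false // not pushed; nothing to refresh away
	}
	refreshFree := s.refreshSpans == 0
	refreshInevitable := s.batchCommits
	return refreshFree || refreshInevitable
}

func main() {
	fmt.Println(shouldRefreshPreemptively(txnState{readTS: 10, writeTS: 20}))                                      // free: no spans yet
	fmt.Println(shouldRefreshPreemptively(txnState{readTS: 10, writeTS: 20, refreshSpans: 3, batchCommits: true})) // EndTxn is doomed
	fmt.Println(shouldRefreshPreemptively(txnState{readTS: 10, writeTS: 20, refreshSpans: 3}))                     // wait and see
}
```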
Force-pushed from 1d461ba to 3c2524f
Canceled.

api.pb.go keeps changing on me! bors r+

Build succeeded: