kv: preemptively refresh transaction timestamps#52884

Merged
craig[bot] merged 2 commits into cockroachdb:master from nvb:nvanbenschoten/preemptiveRefresh
Aug 21, 2020

Conversation

@nvb
Contributor

@nvb nvb commented Aug 16, 2020

Relates to #51294.

This commit adds logic to the txnSpanRefresher to preemptively perform refreshes before issuing requests when doing so is guaranteed to be beneficial. We perform a preemptive refresh if either a) doing so would be free because we have not yet accumulated any refresh spans, or b) the batch contains a committing EndTxn request that we know will be rejected if issued.

The first case is straightforward. If the transaction has yet to perform any reads but has had its write timestamp bumped, refreshing is a trivial no-op. In this case, refreshing eagerly prevents the transaction from performing any future reads at its current read timestamp. Not refreshing preemptively guarantees that we will need to perform a real refresh in the future if the transaction ever performs a read. At best, this would be wasted work. At worst, this could result in the future refresh failing. So we might as well refresh preemptively while doing so is free.

Note that this first case does NOT obviate the need for server-side refreshes. Notably, a transaction's write timestamp might be bumped in the same batch in which it performs its first read. In such cases, a preemptive refresh would not be needed, but a reactive refresh would not be a trivial no-op. These situations are common for one-phase commit transactions.

The second case is more complex. If the batch contains a committing EndTxn request that we know will need a refresh, we don't want to bother issuing it just for it to be rejected. Instead, we preemptively refresh before issuing the EndTxn batch. If we view reads as acquiring a form of optimistic read lock under an optimistic concurrency control scheme (as is discussed in the comment on txnSpanRefresher), then this preemptive refresh immediately before the EndTxn is synonymous with the "validation" phase of a standard OCC transaction model. However, as an optimization compared to standard OCC, the validation phase is only performed when necessary in CockroachDB (i.e. if the transaction's writes have been pushed to higher timestamps).

This second case will play into the solution for #51294. Now that we perform these preemptive refreshes when possible, we know that it is always the right choice to split off the EndTxn from the rest of the batch during a txnSpanRefresher auto-retry. Without this change, it was unclear whether the first refresh of an EndTxn batch was caused by earlier requests in the transaction or by other requests in the current batch. Now, it is always caused by other requests in the same batch, so it is always clear that we should split the EndTxn from the rest of the batch immediately after the first refresh.
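The two conditions above can be sketched as follows. This is a simplified model, not the actual txnSpanRefresher code: txnState, shouldRefreshPreemptively, and the integer timestamps are all illustrative stand-ins.

```go
package main

import "fmt"

// txnState is a toy model of the transaction state the interceptor sees.
type txnState struct {
	readTS       int // timestamp at which the txn reads
	writeTS      int // timestamp at which the txn writes (may be pushed higher)
	refreshSpans int // number of read spans tracked for refresh
}

// shouldRefreshPreemptively mirrors the two cases described above:
// (a) refreshing is free because no refresh spans have accumulated, or
// (b) the batch carries a committing EndTxn and the write timestamp was
// pushed, so the EndTxn would be rejected if issued without a refresh.
func shouldRefreshPreemptively(txn txnState, batchCommits bool) bool {
	pushed := txn.writeTS > txn.readTS
	if !pushed {
		return false // timestamps agree; there is nothing to refresh
	}
	refreshFree := txn.refreshSpans == 0
	return refreshFree || batchCommits
}

func main() {
	// Case (a): a pushed txn with no refresh spans refreshes for free.
	fmt.Println(shouldRefreshPreemptively(txnState{readTS: 1, writeTS: 2}, false))
	// Neither case: a pushed txn with refresh spans and no committing
	// EndTxn waits for a reactive refresh instead.
	fmt.Println(shouldRefreshPreemptively(txnState{readTS: 1, writeTS: 2, refreshSpans: 3}, false))
}
```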

Release notes (performance improvement): validation of optimistic reads is now performed earlier in transactions when doing so can save work. This eliminates certain types of transaction retry errors.

@nvb nvb requested a review from andreimatei August 16, 2020 21:49
@cockroach-teamcity
Member

This change is Reviewable

@nvb nvb force-pushed the nvanbenschoten/preemptiveRefresh branch from 1ae783e to 924c4ba on August 16, 2020 23:02
nvb added a commit to nvb/cockroach that referenced this pull request Aug 17, 2020
Fixes cockroachdb#51294.
First two commits from cockroachdb#52884.

This commit updates the txnSpanRefresher to split off EndTxn requests
into their own partial batches on auto-retries after successful
refreshes as a means of preventing starvation. This avoids starvation in
two ways. First, it helps ensure that we lay down intents if any of the
other requests in the batch are writes. Second, it ensures that if any
writes are getting pushed due to contention with reads or due to the
closed timestamp, they will still succeed and allow the batch to make
forward progress. Without this, each retry attempt may get pushed
because of writes in the batch and then rejected wholesale when the
EndTxn tries to evaluate the pushed batch. When split, the writes will
be pushed but succeed, the transaction will be refreshed, and the EndTxn
will succeed.

I still need to confirm that this fixes this indefinite stall
[here](https://github.com/cockroachlabs/misc_projects_glenn/tree/master/rw_blockage#implicit-query-hangs--explict-query-works),
but I suspect that it will.

Release note (bug fix): A change in v20.1 caused a certain class of bulk
UPDATEs and DELETE statements to hang indefinitely if run in an implicit
transaction. We now break up these statements to avoid starvation and
prevent them from hanging indefinitely.
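The splitting described in the commit message above can be sketched as a partial-batch split. This is a simplified illustration, not the real code: requests are modeled as strings here, while the actual implementation operates on roachpb.BatchRequest.

```go
package main

import "fmt"

// splitEndTxn separates a committing EndTxn from the rest of a batch so
// that, on a txnSpanRefresher auto-retry, the writes can evaluate (and lay
// down intents, possibly at pushed timestamps) first, and the EndTxn can
// follow separately after a refresh.
func splitEndTxn(reqs []string) (writes, endTxn []string) {
	for i, r := range reqs {
		if r == "EndTxn" {
			return reqs[:i], reqs[i:]
		}
	}
	return reqs, nil // no EndTxn in the batch; nothing to split
}

func main() {
	writes, endTxn := splitEndTxn([]string{"Put", "Delete", "EndTxn"})
	fmt.Println(writes) // the partial batch issued first
	fmt.Println(endTxn) // issued separately, after a refresh
}
```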
Contributor

@andreimatei andreimatei left a comment


Good stuff

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @nvanbenschoten)


pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go, line 168 at r2 (raw file):

	ba.CanForwardReadTimestamp = sr.canForwardReadTimestampWithoutRefresh(ba.Txn)

	// If we know that the transaction will need a refresh at some point because

As long as we're writing a long comment here (which I hope can be hidden inside maybeRefreshPreemptively(); see the following comment), mind also putting a TODO(andrei) explaining that whether or not we can still auto-retry at the SQL level should also play a role in deciding whether we want to refresh eagerly or not.


pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go, line 395 at r2 (raw file):

		retryErr = preemptiveWriteTooOldErr
	}
	pErr := roachpb.NewErrorWithTxn(retryErr, ba.Txn)

I find it weird that we're creating this error here before we know that we need it (although I appreciate that we don't allocate). I'm also bothered by the fact that only a small part of CanTransactionRetryAtRefreshedTimestamp() seems to apply; that function doesn't seem to fit well here. For one, the no-op handling of the synthetic WriteTooOldError makes the timestamp flow hard to follow I think. I don't think this function should care about "write too old" at all.

I also find the splitting of responsibilities between maybeRefreshPreemptively and its caller pretty surprising: the caller checks that we'd want to refresh, and then this guy checks whether we can refresh.
I would try to do a Family Guy window blinds pulling a bit to see if it'd look better with this function taking on the checking of whether we want to refresh, and on the other hand I'd leave the error creation to the caller. This method could return OK or CommitHopelessBecauseRefreshIsNeededButImpossible. Or I dunno, I'm less sure about where the error creation should go, but one way or another I do think it should go after tryUpdatingTxnSpans returns false.
And then we don't have to argue about whether this function should be called maybe or try or maybeTry.


pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go, line 426 at r2 (raw file):

	defer func() {
		if ok {
			sr.refreshSuccess.Inc(1)

I don't think we should increment this when there was nothing to refresh


pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher_test.go, line 379 at r2 (raw file):

// it observes a batch containing an EndTxn request that will necessarily throw
// a serializable error.
func TestTxnSpanrefresherPreemptiveRefresh(t *testing.T) {

s/Spanrefresher/SpanRefresher


pkg/kv/kvclient/kvcoord/txn_metrics.go, line 76 at r2 (raw file):

	metaRefreshSuccess = metric.Metadata{
		Name:        "txn.refresh.success",
		Help:        "Number of successful refreshes",

pls add to the help to link to txn.refresh.auto_retries and explain that some refreshes are preemptive and don't lead to a retry.


pkg/kv/kvclient/kvcoord/txn_metrics.go, line 105 at r2 (raw file):

	metaRefreshAutoRetries = metric.Metadata{
		Name:        "txn.refresh.auto_retries",
		Help:        "Number of batch retries after successful refreshes",

nit: s/batch/request

@nvb nvb force-pushed the nvanbenschoten/preemptiveRefresh branch from 924c4ba to e6ef4f5 on August 18, 2020 20:15
Contributor Author

@nvb nvb left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei)


pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go, line 168 at r2 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

As long as we're writing a long comment here (which I hope can be hidden inside maybeRefreshPreemptively(); see the following comment), mind also putting a TODO(andrei) explaining that whether or not we can still auto-retry at the SQL level should also play a role in deciding whether we want to refresh eagerly or not.

Done.


pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go, line 395 at r2 (raw file):

For one, the no-op handling of the synthetic WriteTooOldError makes the timestamp flow hard to follow I think. I don't think this function should care about "write too old" at all.

I don't quite understand what you mean here. This caller is guaranteed to hit the TransactionRetryError case, not the WriteTooOldError case.

I also find the splitting of responsibilities between maybeRefreshPreemptively and its caller pretty surprising: the caller checks that we'd want to refresh, and then this guy checks whether we can refresh.
I would try to do a Family Guy window blinds pulling a bit to see if it'd look better with this function taking on the checking of whether we want to refresh, and on the other hand I'd leave the error creation to the caller. This method could return OK or CommitHopelessBecauseRefreshIsNeededButImpossible. Or I dunno, I'm less sure about where the error creation should go, but one way or another I do think it should go after tryUpdatingTxnSpans returns false.
And then we don't have to argue about whether this function should be called maybe or try or maybeTry.

Done.


pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go, line 426 at r2 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

I don't think we should increment this when there was nothing to refresh

We were before. You think we should change behavior? I tend to disagree because this still fits the definition of a "refresh". If we want to see how many refresh requests were sent, we have a metric for that too.


pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher_test.go, line 379 at r2 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

s/Spanrefresher/SpanRefresher

Done.


pkg/kv/kvclient/kvcoord/txn_metrics.go, line 76 at r2 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

pls add to the help to link to txn.refresh.auto_retries and explain that some refreshes are preemptive and don't lead to a retry.

Done.


pkg/kv/kvclient/kvcoord/txn_metrics.go, line 105 at r2 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

nit: s/batch/request

Done.

Contributor

@andreimatei andreimatei left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @nvanbenschoten)


pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go, line 395 at r2 (raw file):

For one, the no-op handling of the synthetic WriteTooOldError makes the timestamp flow hard to follow I think. I don't think this function should care about "write too old" at all.

I don't quite understand what you mean here. This caller is guaranteed to hit the TransactionRetryError case, not the WriteTooOldError case.

Then I'm even more bothered by calling that fat function. But now that I look more, I think the preemptiveWriteTooOldErr can't be triggered because this interceptor terminates the WriteTooOld flag. Let's just assert that ba.Txn.WriteTooOld is not set, like we do in sendLockedWithRefreshAttempts.

Even if we get rid of preemptiveWriteTooOldErr, I'd then still prefer to get rid of preemptiveSerializableErr too; I would break CanTransactionRetryAtRefreshedTimestamp() into two: an inner function that takes a retriable error and gives you back the timestamp that the error wants the client to retry at, and an outer function that takes an error and a timestamp and prepares the transaction for retrying at that given timestamp, i.e.:

func prepareForRetry(txn *roachpb.Transaction, ts hlc.Timestamp) (*roachpb.Transaction, bool) {
	if txn.CommitTimestampFixed {
		return nil, false
	}
	newTxn := txn.Clone()
	newTxn.WriteTimestamp.Forward(ts)
	newTxn.ReadTimestamp.Forward(newTxn.WriteTimestamp)
	newTxn.WriteTooOld = false
	return newTxn, true
}

maybeRefreshPreemptively() would then do
txn, ok := prepareForRetry(txn, ba.Txn.WriteTimestamp)

The handling of write timestamp is just too obfuscated through the pErr := roachpb.NewErrorWithTxn(retryErr, ba.Txn) line, for no good reason IMO.


pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go, line 426 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

We were before. You think we should change behavior? I tend to disagree because this still fits the definition of a "refresh". If we want to see how many refresh requests were sent, we have a metric for that too.

ok

@nvb nvb force-pushed the nvanbenschoten/preemptiveRefresh branch from e6ef4f5 to 92ee907 Compare August 20, 2020 03:09
Contributor Author

@nvb nvb left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei)


pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go, line 395 at r2 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

For one, the no-op handling of the synthetic WriteTooOldError makes the timestamp flow hard to follow I think. I don't think this function should care about "write too old" at all.

I don't quite understand what you mean here. This caller is guaranteed to hit the TransactionRetryError case, not the WriteTooOldError case.

Then I'm even more bothered by calling that fat function. But now that I look more, I think the preemptiveWriteTooOldErr can't be triggered because this interceptor terminates the WriteTooOld flag. Let's just assert that ba.Txn.WriteTooOld is not set, like we do in sendLockedWithRefreshAttempts.

Even if we get rid of preemptiveWriteTooOldErr, I'd then still prefer to get rid of preemptiveSerializableErr too; I would break CanTransactionRetryAtRefreshedTimestamp() into two: an inner function that takes a retriable error and gives you back the timestamp that the error wants the client to retry at, and an outer function that takes an error and a timestamp and prepares the transaction for retrying at that given timestamp, i.e.:

func prepareForRetry(txn *roachpb.Transaction, ts hlc.Timestamp) (*roachpb.Transaction, bool) {
	if txn.CommitTimestampFixed {
		return nil, false
	}
	newTxn := txn.Clone()
	newTxn.WriteTimestamp.Forward(ts)
	newTxn.ReadTimestamp.Forward(newTxn.WriteTimestamp)
	newTxn.WriteTooOld = false
	return newTxn, true
}

maybeRefreshPreemptively() would then do
txn, ok := prepareForRetry(txn, ba.Txn.WriteTimestamp)

The handling of write timestamp is just too obfuscated through the pErr := roachpb.NewErrorWithTxn(retryErr, ba.Txn) line, for no good reason IMO.

Ok, I did pretty much exactly what you're suggesting here. I think you'll like the result. PTAL.

And if so, do you mind giving the third commit in #52885 a look while you have this paged in? That needs to land before the stability period.

@nvb nvb force-pushed the nvanbenschoten/preemptiveRefresh branch from 92ee907 to dab1090 on August 20, 2020 18:31
Contributor

@andreimatei andreimatei left a comment


LGTM thanks

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei)

@nvb
Contributor Author

nvb commented Aug 20, 2020

bors r+

@nvb nvb force-pushed the nvanbenschoten/preemptiveRefresh branch from dab1090 to 1d461ba on August 20, 2020 22:07
@craig
Contributor

craig bot commented Aug 20, 2020

Canceled.

@nvb
Contributor Author

nvb commented Aug 20, 2020

bors r+

nvb added 2 commits August 20, 2020 20:20
The field was replaced with the more general BatchRequest.CanForwardReadTimestamp
in 972915d. This commit completes the migration to remove the old flag, which is
possible now that we're building towards v20.2.
Relates to cockroachdb#51294.

Release notes (performance improvement): validation of optimistic reads
is now performed earlier in transactions when doing so can save work.
This eliminates certain types of transaction retry errors and avoids
wasted RPC traffic.
@nvb nvb force-pushed the nvanbenschoten/preemptiveRefresh branch from 1d461ba to 3c2524f on August 21, 2020 00:20
@craig
Contributor

craig bot commented Aug 21, 2020

Canceled.

@nvb
Contributor Author

nvb commented Aug 21, 2020

api.pb.go keeps changing on me!

bors r+

@craig
Contributor

craig bot commented Aug 21, 2020

Build succeeded:

@craig craig bot merged commit ec01d94 into cockroachdb:master Aug 21, 2020
@nvb nvb deleted the nvanbenschoten/preemptiveRefresh branch August 21, 2020 04:47
nvb added a commit to nvb/cockroach that referenced this pull request Aug 21, 2020
nvb added a commit to nvb/cockroach that referenced this pull request Aug 31, 2020