storage: CPut ConditionFailed updates the timestamp cache by spencerkimball · Pull Request #21297 · cockroachdb/cockroach

spencerkimball · 2018-01-07T12:26:35Z

Here's an example:

Txn0 @ts=1 does an insert of key A, which does a CPut on primary key A being empty, which succeeds and A is inserted.
Txn1 @ts=10 does an insert of key A, but gets a condition failed error because key A exists.
Txn2 @ts=5 deletes key A (which is allowed @ts=5 because txn1 didn't update timestamp cache), rewriting history!

Release note: None

cockroach-teamcity · 2018-01-07T12:26:41Z

This change is

Release note: None

nvb · 2018-01-08T04:57:19Z

this seems pretty bad, nice catch! What led you to find this? Was an anomaly visible from the SQL layer?

Also, do you mind expanding the commit message to point out that this fixes a serializability violation with CPuts? Maybe also expand on what the implications of that violation were.

Review status: 0 of 3 files reviewed at latest revision, 1 unresolved discussion, all commit checks successful.

pkg/storage/replica.go, line 2055 at r1 (raw file):

			// Skip update if there's an error and it's not for this index
			// or the request doesn't update the timestamp cache on errors.
			if pErr != nil {

I'd restructure this so that the pErr != nil check is outside of the UpdatesTimestampCache so that commands can have the flag updatesTSCacheOnError without also needing to have the flag updatesTSCache.

Comments from Reviewable

jordanlewis · 2018-01-08T14:37:37Z

Is this a bug fix or an improvement? As I understood, CPut ConditionFailed didn't need to update the timestamp cache because the transaction was always ended after a ConditionFailed error was detected.

Review status: 0 of 3 files reviewed at latest revision, 1 unresolved discussion, all commit checks successful.

Comments from Reviewable

spencerkimball · 2018-01-08T14:52:44Z

My POC on improving serializable restarts caused me to take a closer look at how things must update the timestamp cache and this popped out as being obviously wrong. @jordanlewis, imagine the following example:

Txn0 @ts=1 does an insert of key A, which does a CPut on primary key A being empty, which succeeds and A is inserted.
Txn1 @ts=10 does an insert of key A, but gets a condition failed error because key A exists.
Txn2 @ts=5 deletes key A (which is allowed @ts=5 because txn1 didn't update timestamp cache), rewriting history!

@nvanbencschoten, I added the above example to the commit message.

Review status: 0 of 3 files reviewed at latest revision, 1 unresolved discussion, all commit checks successful.

pkg/storage/replica.go, line 2055 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

I'd restructure this so that the pErr != nil check is outside of the UpdatesTimestampCache so that commands can have the flag updatesTSCacheOnError without also needing to have the flag updatesTSCache.

The chance of that being something we need seems vanishingly unlikely.

Comments from Reviewable

jordanlewis · 2018-01-08T18:15:51Z

Thanks for the clear example! That makes sense. Yikes! I wonder if any of our Jepsen tests include workloads that exercise conditional put failures. Would this kind of thing have been caught if they did include such tests?

Review status: 0 of 3 files reviewed at latest revision, 1 unresolved discussion, all commit checks successful.

Comments from Reviewable

bdarnell · 2018-01-08T19:36:16Z

I don't think we have any jepsen tests that exercise this, but it should be possible to construct a test that would catch it.

Review status: 0 of 3 files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.

pkg/storage/replica.go, line 2074 at r1 (raw file):

				key := keys.TransactionKey(start, txnID)
				tc.Add(key, nil, ts, txnID, false /* readCache */)
			case *roachpb.ConditionalPutRequest:

It's unfortunate that we have both a flag in api.go and an entry in the switch statement here - the flags are less valuable if they just need to be synchronized with other code.

Of course, this switch case isn't strictly necessary - it would still be correct to update the tscache for any error from ConditionalPut (or even for any error on any command that updates the tscache). Should we just do that instead of trying to narrowly target certain errors from certain commands? We'd push the timestamp cache forward more than we otherwise would, but it doesn't seem like that would be much of a problem.

Comments from Reviewable

spencerkimball · 2018-01-08T19:59:33Z

Review status: 0 of 3 files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.

pkg/storage/replica.go, line 2074 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

It's unfortunate that we have both a flag in api.go and an entry in the switch statement here - the flags are less valuable if they just need to be synchronized with other code.

Of course, this switch case isn't strictly necessary - it would still be correct to update the tscache for any error from ConditionalPut (or even for any error on any command that updates the tscache). Should we just do that instead of trying to narrowly target certain errors from certain commands? We'd push the timestamp cache forward more than we otherwise would, but it doesn't seem like that would be much of a problem.

I really think it's safer to avoid changes to existing behavior where possible. Given that, I really needed to add this case in order to examine the error in the context of the request.

Comments from Reviewable

spencerkimball requested review from a team and bdarnell January 7, 2018 12:26

storage: CPut ConditionFailed updates the timestamp cache

70ef0e1

Release note: None

spencerkimball force-pushed the cond-put-errors branch from 414e612 to 70ef0e1 Compare January 7, 2018 20:04

spencerkimball merged commit e81c37d into cockroachdb:master Jan 8, 2018

spencerkimball deleted the cond-put-errors branch January 8, 2018 19:59

bdarnell mentioned this pull request Jan 9, 2018

Serializable txn improvements #21140

Merged

vivekmenezes mentioned this pull request Jan 12, 2018

sql: index creation is not creating the correct number of index entries #18533

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

storage: CPut ConditionFailed updates the timestamp cache#21297

storage: CPut ConditionFailed updates the timestamp cache#21297
spencerkimball merged 1 commit intocockroachdb:masterfrom
spencerkimball:cond-put-errors

spencerkimball commented Jan 7, 2018 •

edited

Loading

Uh oh!

cockroach-teamcity commented Jan 7, 2018

Uh oh!

nvb commented Jan 8, 2018

Uh oh!

jordanlewis commented Jan 8, 2018

Uh oh!

spencerkimball commented Jan 8, 2018

Uh oh!

jordanlewis commented Jan 8, 2018

Uh oh!

bdarnell commented Jan 8, 2018

Uh oh!

spencerkimball commented Jan 8, 2018

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Conversation

spencerkimball commented Jan 7, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cockroach-teamcity commented Jan 7, 2018

Uh oh!

nvb commented Jan 8, 2018

Uh oh!

jordanlewis commented Jan 8, 2018

Uh oh!

spencerkimball commented Jan 8, 2018

Uh oh!

jordanlewis commented Jan 8, 2018

Uh oh!

bdarnell commented Jan 8, 2018

Uh oh!

spencerkimball commented Jan 8, 2018

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

spencerkimball commented Jan 7, 2018 •

edited

Loading