
storage: Don't swallow ErrEpochAlreadyIncremented #36942

Merged

craig[bot] merged 1 commit into cockroachdb:master from bdarnell:epoch-already-incremented on Apr 19, 2019

Conversation

@bdarnell
Contributor

IncrementEpoch was failing to distinguish between the request that
caused the increment and another caller making the same increment.
This is incorrect since a successful increment tells you when the
node was confirmed dead, while relying on another node's increment
leaves this uncertain.

In rare cases involving a badly overloaded cluster, this could result
in inconsistencies (non-serializable transactions) due to incorrect
timestamp cache management.

Fixes #35986

Release note (bug fix): Fix a rare inconsistency that could occur on
badly overloaded clusters.

@bdarnell bdarnell requested review from a team and tbg April 18, 2019 22:12
@cockroach-teamcity
Member

This change is Reviewable


@tbg tbg left a comment


While looking at this issue I was surprised by how subtle the guarantee is that our status.Expiration becomes the base of the liveness CPut. Can you think of any assertions we could add to make sure this holds?

At the very least, leave a comment on requestLeaseAsync highlighting the role of the status (I don't think we use much of it, so it could be tempting to refactor things there, introducing bugs by accident).

Reviewed 3 of 3 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bdarnell)


pkg/storage/replica_range_lease.go, line 311 at r1 (raw file):
The explanation of why we can't just go ahead is awkwardly broken up over these two paragraphs. Consider editing from this line on (and giving another pass for clarity to my text):

without having its epoch incremented, i.e. that it was in fact live at status.Timestamp. It would be incorrect to simply proceed to sending our lease request since our lease.Start may precede the start timestamp of the predecessor lease (i.e. likely that put into place by whoever beat us to the epoch increment), and so under this lease this node's timestamp cache would not necessarily reflect all reads served by the other leaseholder.
...
ErrEpochAlreadyIncremented is not an unusual error to observe, so we don't log it.

@tbg
Member

tbg commented Apr 19, 2019

CI failed because of a known flake (now skipped): #36948

@bdarnell bdarnell force-pushed the epoch-already-incremented branch from 473f106 to 8de884a Compare April 19, 2019 17:32
@bdarnell
Contributor Author

I've updated the new comment, using part of your new text (minus the parenthetical about the node who completed the IncrementEpoch having a lease - if that node managed to get a lease, our lease would fail because PrevLease wouldn't match. The problematic case is when the node whose epoch we're trying to increment still has the lease, and the node that wins the race to perform the IncrementEpoch loses the race to RequestLease).

I'm not sure what specifically you're looking for in a requestLeaseAsync comment, but later I'm going to do another pass over this file while it's fresh in my mind to document everything I can.

@bdarnell
Contributor Author

bors r=tbg

craig bot pushed a commit that referenced this pull request Apr 19, 2019
36942: storage: Don't swallow ErrEpochAlreadyIncremented r=tbg a=bdarnell


Co-authored-by: Ben Darnell <ben@bendarnell.com>
@craig
Contributor

craig bot commented Apr 19, 2019

Build succeeded

@craig craig bot merged commit 8de884a into cockroachdb:master Apr 19, 2019
@tbg
Member

tbg commented Apr 19, 2019

Comment looks really good now, thanks!

@bdarnell bdarnell deleted the epoch-already-incremented branch April 21, 2019 16:30


Development

Successfully merging this pull request may close these issues.

roachtest: scrub/all-checks/tpcc/w=1000 failed
