Skip to content

storage: bump LeaseAppliedIndex instead of reproposing #27054

@tbg

Description

@tbg

In #26830, we observed an instance in which the uncommitted portion of the Raft log grew to over 7GB of size, leading to an out-of-memory error due to being loaded into memory on election. This particular out-of-memory issue has been fixed, but not the process by which the uncommitted Raft log grew to 7GB in the first place.

This likely happened because the split queue retried the split repeatedly, each time proposing on the order of 100mb into Raft. This in itself is a problem (that has partially been addressed, see #25233) but a problematic piece of code related to this is the reproposal logic:

https://github.com/cockroachdb/cockroach/blob/master/pkg/storage/replica.go#L4396-L4417

which repeatedly creates copies of proposals that don't commit within a target interval. Consider the case in which Raft commit is significantly delayed or even stalled (lost quorum) -- the mechanism is essentially blowing up the uncommitted tail of the Raft log for no good reason.

This raises the question why we're even trying to repropose these commands. We should simply propose a "no-op" through Raft that allocates a new LeaseAppliedIndex, after which the pending proposals below it are discarded (to be retried externally).

The semantics of that approach seem saner as we never duplicate large commands and as proposers already have to expect retries and are equipped to handle them.

@bdarnell this seems worth doing for 2.1, if you don't have any reservations. The change shouldn't be too big (probably most of the the time spent fixing up the tests that test this through intimate knowledge of the current code).

Metadata

Metadata

Assignees

No one assigned

    Labels

    A-kv-replicationRelating to Raft, consensus, and coordination.C-enhancementSolution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions