storage: Handle raft.ErrProposalDropped #21849
Description
etcd-io/etcd#9067 introduced a new error ErrProposalDropped. Properly handling this error will allow us to reduce the occurrence of ambiguous failures (Replica.executeWriteBatch doesn't need to return an AmbiguousResultError if it has not successfully proposed), and may allow us to be more intelligent in our raft-level retries.
This error alone does not allow us to eliminate time-based reproposals because raft's MsgProp forwarding is fire-and-forget. However, if we disabled raft-level forwarding and did our own forwarding, I think we could make the retry logic more deterministic (or at least hide the timing elements in the RPC layer).
Note that in typical usage of raft, you'd respond to this error by passing it up the stack to a layer that can try again on a different replica. We can't do that because of our leases: until the lease expires, no other node could successfully handle the request, so we have to just wait and retry on the lease holder. (We might be able to use this to make lease requests themselves fail-fast, though.)
Jira issue: CRDB-5872