storage: learner snap can fail when interacting with raft snap #40207
Description
When we add a learner, we currently send a snapshot that can race with the one sent by the raft snapshot queue. Typically we win (the snap queue has code that skips the snapshot if we're already "starting" to send one from the upreplication code), but if the queue wins, the explicit snap can fail with an error due to an overlapping reservation (see the gist below).
This isn't triggered very frequently on master (at the time of writing) due to a buglet in Raft that has since been fixed in etcd-io/etcd#11037. As a result, we need to address this issue before we can bump Raft past that fix. Since we expect to pick up a fix or two in Raft as joint consensus matures, we want to address this sooner rather than later. OTOH, it's such a small commit that it's easy enough to switch to a fork temporarily if we need to.
https://gist.github.com/tbg/fd882caa6ed72e7af0cbd7c8fb0c3504
Unfortunately, no obvious or trivial solution to the race presents itself. It's an open question whether we want to start relying on the raft snap queue to catch up the follower (i.e. poll the follower replica manually via a newly added RPC) or to prevent the snap queue from ever catching up learners (which we did for a short time, until some problem with that approach became apparent).