Fix handling of "leader changed" errors #11426
Force-pushed from 68c5f91 to a9e3e5f.
@hickeyma @technosophos @mattfarina Folks, @cenkalti tested a bunch of scenarios recreating the problem(s) with etcd with this patch.
This reverts commit ebc79fa. Signed-off-by: Cenk Alti <cenkalti@gmail.com>
Signed-off-by: Cenk Alti <cenkalti@gmail.com>
Force-pushed from b6938a2 to b5378b3.
Hey @technosophos @hickeyma. Bumping this up for visibility. It's a small fix. Can you take a look, please?
@hickeyma I tested this fix manually by setting up a custom k8s cluster with 3 etcd nodes and running I checked the https://github.com/helm/acceptance-testing project to see if I could write this scenario as an acceptance test, but I couldn't find a solution as the clusters are created with
Needs another maintainer's approval before merging.
@technosophos @mattfarina Can you take a look, please?
@hickeyma feel free to merge whenever you feel good about it.
Hi, any chance this could be patched into the 3.2.x release?
Fixes fluxcd/flux2/#4804 by copying the solution used in helm/helm#11426
Fixes fluxcd/flux2/#4804 by copying the solution used in helm/helm#11426 Signed-off-by: Luis Davim <luis.davim@gmail.com>
What this PR does / why we need it:
This PR aims to fix transient "etcdserver: leader changed" errors from kube-apiserver by adding a single retry when this kind of error is detected. A fix was previously attempted in #11401.
/cc @dims
Special notes for your reviewer:
I also reverted the previous fix, which didn't solve the issue. One commit is the revert; the other is the new fix. It's better to review the commits separately.