clustermesh: remove hostAliases and replace it by a custom dialer#43767
clustermesh: remove hostAliases and replace it by a custom dialer#43767MrFreezeex wants to merge 4 commits intocilium:mainfrom
Conversation
|
/test |
8a7fc56 to
23a1484
Compare
|
/test |
23a1484 to
e2400cd
Compare
|
/test |
e2400cd to
8e9ccec
Compare
|
/test |
8d3723f to
b4f67a3
Compare
|
/test |
b4f67a3 to
3dc5f01
Compare
|
/test |
3dc5f01 to
2707231
Compare
|
/test |
2707231 to
fa3af85
Compare
|
/test |
|
I went with a slightly different approach than what we were discussing in the issue by using a dialer instead of something resolver related. It looks easier and that way I can have something really close to what we would have with the normal dialer. Let me know if that looks right and/or if you have some other suggestions 👀 |
| // staticDbgDialer wraps an kvstore.EtcdDbgDialer with static host resolution support | ||
| // similarly to dial.NewStaticHostDialer used to contact remote clustermesh-apiserver | ||
| // but with a kvstore.EtcdDbgDialer interface. | ||
| type staticDbgDialer struct { |
There was a problem hiding this comment.
Since the only underlying method used by this new type is .LookupIP(), it seems that this could have a more generic name as well, and be moved to it's own file. Something like staticDialerWithFallback.
There was a problem hiding this comment.
Hmm the "Dbg" reference the underlying interface EtcdDbgDialer so I am not sure it can't be much more generic name wise, I completed the name a bit by adding WithFallback like you suggested and making it EtcdDbg instead of just Dbg though.
Theoretically it could live in pkg/kvstore/etcd_debug.go alongside the other etcd dbg dialer but importing github.com/cilium/cilium/pkg/dial produce an import cycle there :(. So it probably is simpler to keep it in this package since it's the only consumer and don't produce an import cycle. I could do a separate file in this package if you think that's better though?
5d19e83 to
250208f
Compare
|
/test |
|
Sorry for the long time before making this ready again, I got busy with other things but it should be finally ready again/most comments have been addressed! |
250208f to
73d1353
Compare
|
/test |
giorio94
left a comment
There was a problem hiding this comment.
Thanks, and sorry for the delay. A few more comments inline.
install/kubernetes/cilium/templates/clustermesh-config/_helpers.tpl
Outdated
Show resolved
Hide resolved
35b6b0b to
7a6daed
Compare
This comment was marked as outdated.
This comment was marked as outdated.
|
/test EDIT: note that the push right after this comment is a rebase to fix some conflict on the CI and the helm template from the clustermesh-apiserver/deployment.yaml) |
This commit add a Cilium config struct parsed from the etcd config to connect to each remote clustermesh-apiserver. These fields are ignored by etcd which ignore any unknown fields (like most go program do) and will be used in a future commit to replace the host aliases on Pods which forces restarts when changed. Signed-off-by: Arthur Outhenin-Chalandre <git@mrfreezeex.fr>
7a6daed to
b803317
Compare
Add a new static dialier and wire it in the clustermesh code with the config that was added in the previous commit. This new static dialer attempt to be very close to the net.Dialer with very similar code and partially the same behavior (it has to be a bit different since we don't resolve anything though). It only match one single hostname and will fallback to another dialer so that we can just chain dialer like the one we are already using. This dialer will attempt every IPs and if the context has a deadline set it will have partial distribution logic which essentially boils down to dividing the remaining deadline from the original context with the number of remaining IPs to attempt connection to. This logic is directly taken from the net.Dialer logic. Note that to my knowledge we don't have any deadline set in the ClusterMesh code though. Signed-off-by: Arthur Outhenin-Chalandre <git@mrfreezeex.fr>
b803317 to
2c316ea
Compare
Add the cilium-host-aliases in the etcd config which should be fully replacing hostaliases in Cilium 1.21. We are keeping hostAliases for now to keep downgrade safe (secrets updated without cilium-clustermesh-apiserver-host and potentially without hostAliases if we were already removing it here). We are force disabling this in the conformance ClusterMesh CI via a non documented helm values to already exercise that this is working properly while the conformance upgrade CI stays on default setting to keep the existing flow tested! Signed-off-by: Arthur Outhenin-Chalandre <git@mrfreezeex.fr>
Previous commits added a new static dialer in order to replace hostAliases. This commit now wires this new static dialer to the clustermesh troubleshoot command. Signed-off-by: Arthur Outhenin-Chalandre <git@mrfreezeex.fr>
2c316ea to
a1a4177
Compare
|
/test |
a1a4177 to
9f6f6ca
Compare
|
Sorry I hate to do this to you, I was initially going to just do another test branch on cilium/cilium repo with the same code as I am now modifying some CI file to better test this new change (and I didn't anticipated that so this PR is open from my fork) but I ended up fighting the CI (more accurately my own mistakes) here so I don't think the diff are super readable anyway unfortunately, so I will close this PR to replace it with the one on the cilium/cilium repo directly here: #44425. The differences are mostly to incorporate all the nice suggestion from latest reviews of @giorio94. |
|
No problem! |
See each commit
Fixes #42716