occm: implement a support for atomic routes update#2134
occm: implement a support for atomic routes update#2134k8s-ci-robot merged 2 commits intokubernetes:masterfrom
Conversation
|
Hi @kayrus. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
/ok-to-test |
|
@kayrus: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
362b0cc to
3c0444e
Compare
pkg/openstack/openstack.go
Outdated
| // RouterOpts is used for Neutron routes | ||
| type RouterOpts struct { | ||
| RouterID string `gcfg:"router-id"` // required | ||
| RouterID string `gcfg:"router-id"` // required |
There was a problem hiding this comment.
in my opinion this required should be removed because its not really required. Like mentioned in another PR - we have been using OCCM for 7 years without using this at all.
|
please rebase this after that another PR is merged |
|
This is still WIP. I'm in contact with our neutron team. I'd like to remove the manual config toggle, based on extensions API. So there should be no need for an extra option. |
|
@kayrus if you have time could you explain why this routes feature is needed? Our architecture has always been that we have router in (project) network with dhcp and router will handle the traffic automatically, no need to add routes. Could we get some benefits of using it? |
k8s has a global PODs CIDR, and each node serves its own segment, e.g. Therefore VMs or Loadbalancers from the same private network can access a pod's network using static routes:
This approach is used in GCP, AWS, Azure and other providers as well. Basically it adds an ability for loadbalancers to access PODs network directly avoiding nodePorts. Initially there was no atomic routes update, and routes were updated with the following logic: get all routes from a router, append a new route to the list of existing routes, update the router. And this logic worked in concurrent manner, thus there were a lot of inconsistent updates, which overwrote routes, submitted by parallel threads. With an atomic update it should be possible to update routes in concurrent manner and reduce an amount of API calls. Which should significantly reduce the already improved route update logic. Currently I'm trying to figure out whether it's possible to get |
|
@zetaab I updated my comment above. See the bold text. |
|
|
||
| * `router-id` | ||
| Required. Specifies the Neutron router ID to manage Kubernetes cluster routes. | ||
| * `extraroute-atomic` |
There was a problem hiding this comment.
maybe in the future we can check through neutron ext-list to detect whether it's enabled or not ?
There was a problem hiding this comment.
that's what I'm investigating right now.
There was a problem hiding this comment.
done. my tests show overall routes reconciliation latency improvements from 2-5 minutes down to 10 seconds for a cluster of 16 nodes.
ready for review.
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: zetaab The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What this PR does / why we need it:
An extra improvement for the route reconciliation logic. This PR allows to use a single API call to add or remove routes instead of fetching existing routes and combining them with new or removing old routes with a further update. This change allows concurrent calls without consistency issues.
Which issue this PR fixes(if applicable):
fixes #2089
based on #2090
Special notes for reviewers:
Release note: