[oocm] generates an excessive number of os-interface api requests #2089
Description
/kind bug
What happened:
We recently noticed that our nova API is handling a constant, excessive number of os-interface (list) calls.
This in turn puts a lot of strain on our neutron API, because nova forwards os-interface calls to neutron as get_ports API calls.
It turns out this constant high load is almost exclusively caused by various occm instances running in our OpenStack installation.
Looking at the occm code, there are five different places where getAttachedInterfacesByID is called, spread across routes, load balancers and instances.
The use of getAttachedInterfacesByID in the routes controller seems to be the most costly, as it triggers a lot of os-interface calls at a high frequency:
For every instance in the project, os-interface is called at the rate of the routes controller, which is every 10 seconds by default. This is regardless of the cluster size. We have a cluster with 4 nodes in a project with 1000 instances, and in this case occm generates about 1000 os-interface list calls every 10 seconds, i.e. roughly 100 requests per second.
I would like to start discussing options for improving the current behaviour, so that the base load of API requests generated by occm is reduced for bigger setups.
My first suggestion would be:
Instead of discovering node IP addresses independently in three different control loops (load balancers, routes, instances), they should be fetched only by the instance controller that reconciles the node status. All other control loops should read IP addresses only from the node status (k8s API).
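A rough sketch of what "read from the node status" could look like for the other control loops. To keep it self-contained, `Node` and `NodeAddress` are hypothetical local types that only mimic the type/address pairs found in a Kubernetes node status, and `addressesOfType` is an illustrative helper, not an existing occm function.

```go
package main

import "fmt"

// NodeAddress is a hypothetical stand-in for one address entry in a
// Kubernetes node status (an address type plus the address itself).
type NodeAddress struct {
	Type    string
	Address string
}

// Node is a hypothetical, reduced view of a node: its name and the
// addresses the instance controller has already written to its status.
type Node struct {
	Name      string
	Addresses []NodeAddress
}

// addressesOfType returns the addresses of the given type from the
// already-populated node status. It makes no OpenStack API call.
func addressesOfType(node Node, addrType string) []string {
	var out []string
	for _, a := range node.Addresses {
		if a.Type == addrType {
			out = append(out, a.Address)
		}
	}
	return out
}

func main() {
	node := Node{
		Name: "worker-0",
		Addresses: []NodeAddress{
			{Type: "InternalIP", Address: "10.0.0.5"},
			{Type: "Hostname", Address: "worker-0"},
		},
	}
	fmt.Println(addressesOfType(node, "InternalIP")) // prints "[10.0.0.5]"
}
```

With this approach the routes and load balancer loops would only touch the k8s API (or an informer cache), and the os-interface calls would be limited to the nodes that actually belong to the cluster.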
What you expected to happen:
That occm generates less constant load on the OpenStack APIs.
How to reproduce it:
Create a cluster in a project with a lot of VMs and configure occm to set cloud-routes.
Anything else we need to know?:
Environment:
- openstack-cloud-controller-manager (or other related binary) version: 1.25