Is this a BUG REPORT or FEATURE REQUEST?:
/kind bug
What happened: On a cluster with an extension API server (service-catalog in this case), Controller Manager crashes with the following log:
controllermanager.go:155] error starting controllers: failed to discover resources: unable to retrieve the complete list of server APIs: servicecatalog.k8s.io/v1beta1: an error on the server ("missing route (no endpoints available for service "catalog-catalog-apiserver")") has prevented the request from succeeding
What you expected to happen: Controller Manager should treat these as transient errors and should not return an error to the caller (the main controller-manager).
How to reproduce it (as minimally and precisely as possible):
Don't really know if it is deterministically reproducible, but a mis-configured (or buggy) extension API server can cause this.
Anything else we need to know?:
We ran into a similar issue in the 1.7 release, and it was fixed in this PR: https://github.com/kubernetes/kubernetes/pull/49767/files. That change made controller-manager tolerate discovery errors: as long as the discovery API returned any resources, controller-manager treated the error as transient.
However, in 1.9 one of the controllers (resource-quota) was refactored to accept a discoveryFn as input, and the error handling around discoveryFn does not apply the transient-error tolerance that was introduced in 1.7: https://github.com/kubernetes/kubernetes/blob/release-1.9/pkg/controller/resourcequota/resource_quota_controller.go#L468
Environment:
- Kubernetes version (use kubectl version): 1.9
- Cloud provider or hardware configuration: GKE
- OS (e.g. from /etc/os-release):
- Kernel (e.g. uname -a):
- Install tools:
- Others: