One-pager for CRD scaling discussions in Crossplane #2918

muvaf merged 11 commits into crossplane:master
Conversation
Force-pushed from ad20bcb to e75a7e6.
design/one-pager-crd-scaling.md (outdated)
> * Status: Draft
>
> ## Background
>
> With the release of the [Terrajet](https://github.com/crossplane/terrajet) based providers, the Crossplane community has become more aware of some upstream scaling issues related to custom resource definitions. We did some early analysis such as [[1]] and [[2]] to get a better understanding of these issues and, as we will discuss in more detail in the "Issues" section, the broader K8s community has been aware of the client-side throttling problems in particular for some time. Nor is Crossplane alone: [Azure Service Operator](https://github.com/Azure/azure-service-operator) and [GCP Config Connector](https://github.com/GoogleCloudPlatform/k8s-config-connector) are projects that rely on Kubernetes [custom resources](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/) as an extension mechanism and have many CRDs representing associated cloud resources.
Could you please wrap this markdown at 80 chars? In part because it's consistent with our other designs, and in part because it allows more 'targeted' comments in GitHub - currently it's only possible to comment on entire paragraphs (rather than a line within a paragraph) because each paragraph is one long line.
There are a few tools that can do this automatically - I use https://marketplace.visualstudio.com/items?itemName=stkb.rewrap
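As an aside, the effect of such a rewrap can be approximated with Python's standard `textwrap` module (a minimal sketch; it is not Markdown-aware, so links, code fences, and tables still need a dedicated tool like the Rewrap extension above):

```python
import textwrap

# Rewrap a long single-line paragraph at 80 columns.
# Caveat: textwrap is not Markdown-aware; it would happily break
# inside inline links or tables, so use it only on plain prose.
paragraph = (
    "With the release of the Terrajet based providers, the Crossplane "
    "community has become more aware of some upstream scaling issues "
    "related to custom resource definitions."
)

wrapped = textwrap.fill(paragraph, width=80)
print(wrapped)
```

Once wrapped, GitHub can anchor review comments to individual lines of a paragraph instead of the whole paragraph.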
FWIW I reached out to the ASO and ACK folks and they haven't seen these scaling issues yet, but I fully expect they will in time. ACK is somewhat insulated since they're bundles of smaller controllers, but I believe ASO and KCC take the same "one controller manager for all of a cloud's resources" approach we do.
Thanks Nic for the tool suggestion! It was very helpful. I have also removed most of the inline references.
Dropping by to say: @negz was correct: Azure/azure-service-operator#2920
design/one-pager-crd-scaling.md (outdated)
> ## Background
>
> Kubernetes is a complex ecosystem with many moving parts, and we need a deeper understanding of the issues around scaling in the dimension of the total number of CRDs per cluster. This dimension is not yet officially considered in the scalability thresholds document [[3]], but it likely will be. So, as the Crossplane community, we would like to have our use cases considered in relevant contexts, and we would like to gain a good understanding so that we can:
> This dimension is not yet officially considered in the scalability thresholds document [[3]] but it will be with good probability.
I think you asked upstream about this right? Is there an issue tracking getting CRDs added to that doc?
Not yet Nic. It would be good to have Crossplane use cases clarified to have them on the table for those discussions, and probably use them as a guideline in those discussions. Also, as you mentioned above, it would be great to involve ASO and KCC folks. There was a previous community survey asking for CRD use cases and I believe its results were incorporated in the GA scalability targets for CRDs. It looks like projects like Crossplane, ASO, KCC have more demanding new use cases which could be the motivation in those discussions.
design/one-pager-crd-scaling.md (outdated)

> ### Client-side Throttling
>
> `kubectl` maintains a discovery cache for the discovered server-side resources under the (default) filesystem path `$HOME/.kube/cache/discovery/<host_port>/`. Here, `<host_port>` is a string derived from the API server host and the port it listens on. Example paths are `$HOME/.kube/cache/discovery/exampleaks_8e092dad.hcp.eastus.azmk8s.io_443` and `$HOME/.kube/cache/discovery/EB788B3B801893B684B4579B2ADF0171.gr7.us_east_1.eks.amazonaws.com`. Under this cache, the `servergroups.json` file is a JSON-serialized [`v1.APIGroupList`](https://pkg.go.dev/k8s.io/apimachinery/pkg/apis/meta/v1#APIGroupList) object. Thus, the cache file `$HOME/.kube/cache/discovery/<host_port>/servergroups.json` holds all of the discovered API GroupVersions (GVs) together with their preferred versions from that API service. And for each discovered API GroupVersion, a `serverresources.json` caches metadata about the discovered resources under that GroupVersion, JSON-serialized as a [`v1.APIResourceList`](https://pkg.go.dev/k8s.io/apimachinery/pkg/apis/meta/v1#APIResourceList). This metadata about resources is crucial for various tasks, such as:
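To make the cache layout above concrete, here is a minimal sketch that decodes a hand-constructed sample in the shape of `servergroups.json` (a `v1.APIGroupList`); the group name is illustrative only, and real files live under `$HOME/.kube/cache/discovery/<host_port>/`:

```python
import json

# Hand-constructed sample mimicking the shape of a cached
# servergroups.json (a JSON-serialized v1.APIGroupList).
# The group name below is illustrative only.
sample = json.dumps({
    "kind": "APIGroupList",
    "apiVersion": "v1",
    "groups": [
        {
            "name": "ec2.aws.jet.crossplane.io",
            "versions": [
                {"groupVersion": "ec2.aws.jet.crossplane.io/v1alpha1",
                 "version": "v1alpha1"},
            ],
            "preferredVersion": {
                "groupVersion": "ec2.aws.jet.crossplane.io/v1alpha1",
                "version": "v1alpha1",
            },
        },
    ],
})

group_list = json.loads(sample)

# Every GroupVersion listed here has its own serverresources.json;
# on a cold cache the discovery client issues one request per GV,
# which is where client-side throttling bites at high GV counts.
gvs = [v["groupVersion"]
       for g in group_list["groups"] for v in g["versions"]]
preferred = {g["name"]: g["preferredVersion"]["version"]
             for g in group_list["groups"]}
print(gvs, preferred)
```

The per-GV `serverresources.json` fetch is what multiplies the request count when hundreds of GVs are installed.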
I want to note that while fixing kubectl would be very impactful, we may want to go down a level and try to get this fixed in client-go, so that all (Go-based) clients that do discovery are fixed.
Agreed. Cache busting for the discovery client is under discussion now. Maybe we can extend this one-pager by mentioning some options considered (cache busting, disabling client-side throttling, etc.).
design/one-pager-crd-scaling.md (outdated)

> ## Action Items
>
> - We can consider cherry-picking [[5]] to all active release branches (`v1.23`, `v1.22`, `v1.21`), as the anticipated release date for `v1.24` is April 2022.
> - Open issues regarding API service disruptions for managed control planes (GKE regional, AKS, EKS, etc.), where we expect high availability.
Some folks on today's community meeting (namely @haarchri and Jillian Hill from Guidewire) mentioned that their AWS professional services teams have noticed increased resource usage even when EKS appears to be working well, and reached out to investigate. It sounds like in some cases they're also seeing EKS clusters choke when installing providers with a lot of CRDs - one theory is that some regions have smaller EKS control planes.
Either way it sounds like we might be able to reach out either to our own contacts at AWS or through a company in the Crossplane community to get some insight into how EKS is handling this load.
That would be great Nic. The support issue we opened for GKE regional clusters was not fruitful; the outcome was that it works as expected. What we wanted to learn was basically the metrics GKE autoscalers use when deciding to scale their managed control planes up (CPU/memory usage/utilization of control-plane components, kube-apiserver or other component failures, # of CRDs, or some other SLIs, etc.?). One thing I still wonder is how SLAs come into play as Crossplane users install three big providers in their clusters. @AaronME, do you know if it's possible, in cloud consoles or by some other means (like a support ticket), to inquire about these SLAs?
Force-pushed from 1735f4b to c5986bb.
> we should be careful.
>
> ### Client-side Throttling
is it possible to state very clearly (maybe with a table?) which versions of kubectl (released or targeting upcoming releases) have all the client-side throttling fixes, so we can recommend them to the community?
e.g. it's not super obvious to me (as a casual reader) which versions of kubectl (and also the k8s API server) I would need to have a good experience with jet providers. Maybe we can make that super obvious at the top of the doc, so folks coming to this document just looking for guidance on which versions to use can easily find that info without getting lost in the details. What do you think @ulucinar?
Would an executive summary section about the things we'll do be helpful? Maybe one-liners about the problems and links to the task issues, though that could be duplicate of Action Items
I think it's a good idea @jbw976. Added a kubectl version table with their release dates as a recommendation at the beginning of the Client-side Throttling section.
awesome @ulucinar, that table is really helpful!
do you think we should do the same for server-side issues? e.g. so casual readers know at a quick glance which versions of the k8s api-server have all the fixes and should perform well? That table could also be valuable :)
Thanks @jbw976. I've also added a table of kube-apiserver versions to the beginning of the API Server Resource Consumption section, explicitly mentioning the OpenAPI v2 spec lazy-marshaling change and the related upstream issues.
that is super helpful @ulucinar, thank you for making this document more accessible for a broader audience!! 🙇♂️
Force-pushed from 7377cf0 to 47c0ef0.
design/one-pager-crd-scaling.md (outdated)

> issues in kube-apiserver and possibly in other control-plane components.
> - Initiate further discussions with the Kubernetes [sig-scalability] community
>   regarding CRD scalability and bring agreed-upon Crossplane scenarios to
>   their attention.
As we discussed previously, I would like to explore possibilities like optional lazy serving for CRDs. I would love to discuss its feasibility upstream once we have an upstream issue framing the problem with server-side resource consumption. Until we have that issue where we can comment on this as a possible solution, I'm copying the proposal here just to capture it somewhere public.

What do you think about proposing an upstream Kubernetes change for lazily serving CRDs?

- Introduce an optional `lazyServe` field in the CRD spec next to the existing `served` field.
- Extend the existing notFoundHandler so that it checks whether there is a non-served CRD for the endpoint that was hit; if yes, set `served: true` on that CRD and return 503 (or do some trick: wait a bit and redirect to the same URL).
- On the Crossplane and controller side, we would also need to change how we start the controllers. Instead of starting all of them at once, we would watch CRDs and only start the controller for a type when its CRDs are served.

Assuming most clients already have built-in retries for 503 errors, this could provide the following user experience for Crossplane:

- Install providers with any number of CRDs.
- No significant load on the system, since none of the CRDs are served.
- A CRD starts to be served only after the user creates a resource of its kind.

It is a bit assertive, but it sounds feasible based on my thought experiments so far.
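The "clients retry on 503" assumption in the proposal above can be illustrated with a toy retry loop (a hypothetical sketch: `Server` and `get_with_retry` are made-up names, and no real Kubernetes client or apiserver code is involved):

```python
import time

class Server:
    """Toy stand-in for an apiserver lazily starting to serve a CRD."""
    def __init__(self):
        self.serving = False

    def get(self, path):
        if not self.serving:
            # The extended notFoundHandler would notice a non-served CRD
            # for this endpoint, flip served=true, and ask the client to
            # retry by returning 503.
            self.serving = True
            return 503
        return 200

def get_with_retry(server, path, retries=3, backoff=0.01):
    """Retry on 503 with a small linear backoff."""
    status = server.get(path)
    for attempt in range(retries):
        if status != 503:
            break
        time.sleep(backoff * (attempt + 1))
        status = server.get(path)
    return status

server = Server()
# First request hits 503 (CRD not yet served); the retry succeeds.
print(get_with_retry(server, "/apis/example.org/v1/widgets"))  # 200
```

The sketch only shows the happy path; a real design would also need to bound how long clients wait while the controller for that type spins up.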
> `kubectl` behaves when there is a large number of CRDs installed in the
> cluster and how the discovery cache and the discovery client affect the
> perceived performance of the `kubectl` commands run.
> - Server-side issues: We also have some observations on the control-plane
Would it make sense to have a short section about the OpenAPI aggregation problem that was fixed a few months earlier? Just a summary of the problem and a link to the PR.
Hi @muvaf,
Thank you for the comments.
I have extended the paragraph where we discuss this change with more details.
> improves as the burstiness and the fill rate of the token bucket rate limiter
> are increased in `kubectl@v1.24`.
> 2. Installation of multiple providers, such as `provider-jet-{aws,gcp,azure}`
>    preview editions, into the same cluster results in ~370 GVs being served by
If we bump the current burst limit from 300 to 500, do you think this goal would be achieved? If so, I think we can set this as a goal and track bumping that number in client-go and kubectl as an easy-to-get change.
As discussed in the Client-side Throttling section, experiments done with a custom build of kubectl that allows us to specify the discovery client's tbrl (token bucket rate limiter) parameters reveal that, with tbrl(b=400, r=50.0 qps), client-side throttling is no longer a bottleneck for our current GV counts (despite a delay of ~18 s on a cold cache). This is subject to change if the number of GVs increases because of different combinations of providers, API regroupings, etc. However, the upstream community sees bumping these limits to cover certain use cases as raising the debt ceiling.
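A back-of-the-envelope model of that throttling delay (a simplification: it ignores network latency, the ~18 s cold-cache figure includes overhead this model does not capture, and the burst/qps values simply follow the numbers discussed in this thread):

```python
def throttle_delay(requests, burst, qps):
    """Estimated wait imposed by a token bucket rate limiter: the
    first `burst` requests are free; the rest are paced at `qps`."""
    return max(0.0, (requests - burst) / qps)

# ~370 GVs -> roughly one serverresources.json request per GV on a
# cold discovery cache.
requests = 370

# burst=300 (the current limit mentioned above) vs. the custom-build
# experiment with tbrl(b=400, r=50.0 qps).
print(throttle_delay(requests, burst=300, qps=50.0))  # 1.4
print(throttle_delay(requests, burst=400, qps=50.0))  # 0.0
```

This also shows why the fix is fragile: any growth in GV count past the burst size reintroduces a delay that grows linearly at 1/qps per extra GV.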
> ## Criteria Set for Ideal State
Can we keep this section short and precise with exact goals and then give details in another section? For example, two main goals are listed, so having their technical definition in one sentence or in a table may make it easier to understand at a first glance. Or example use scenarios could be helpful as well.
I've added a table of benchmark provider installation scenarios summarizing the criteria set under the Criteria Set for Ideal State section.
> - Initiate further discussions with the Kubernetes [sig-scalability] community
>   regarding CRD scalability and bring agreed-upon Crossplane scenarios to
>   their attention.
I think we can add client-go change here, too, as an action item.
I've added a new action item for bumping the burst limit of the default discovery client in client-go.
Signed-off-by: Alper Rifat Ulucinar <ulucinar@users.noreply.github.com>

Commit messages (truncated):
- …ree provider clusters
- Remove inline references
- Add OpenAPI v2 spec lazy-marshaling fix versions table
- …the criteria set
- …to 300
Description of your changes
Fixes #2895
With this one-pager proposal, we would like to establish a common understanding in the Crossplane community on the issues around CRD scaling, and the paths to possible solutions to these issues.
I have:
- Run `make reviewable` to ensure this PR is ready for review.
- Added `backport release-x.y` labels to auto-backport this PR if necessary.

How has this code been tested
N.A.
[contribution process]: https://git.io/fj2m9