Background
Recent issues such as 3863, 3991 and 4081 showed that installing Elastic Agent with the default configuration of our Kubernetes Integration can leave customers in bad situations, sometimes even with broken k8s clusters. Many details and variables affect the final setup and installation of our observability solution, and we try to summarise and list them here.
Goals
This issue summarises the next actions we need to take in order to investigate:
- The current resource consumption of the default Elastic Agent with the K8s Integration
- Alternative options we can offer to minimise the impact on k8s cluster resource consumption across different k8s environments and customer setups
Actions
Current Actions
We have observed so far that:
a) Memory consumption of Elastic Agent has increased from version 8.8 to 8.9 and later (Relevant: https://github.com/elastic/sdh-beats/issues/3863#issuecomment-1733750863)
b) The number of API calls to the Kubernetes Control API has increased since version 8.9 (See Salesforce 01507229 regarding Elastic Agent overloading the Kubernetes API server: https://github.com/elastic/sdh-beats/issues/3991#issuecomment-1787648161)
c) CPU consumption has been referred here as a concern, although it is not a big issue at the moment and not the first priority.
Until now:
- Since 8.11 we have updated elastic-agent-autodiscover to v0.6.4 (beats PR), disabling metadata for deployment and cronjob. Pods created from deployments or cronjobs will no longer have the extra kubernetes.deployment or kubernetes.cronjob metadata field.
- We have merged leader election configuration variables
- We have proposed a way to disable Leader Election in Managed Elastic Agents (See here)
Next Planned Actions
Future Plans/Actions