[New Package] Add Kubelet Stats OpenTelemetry input package #17159
This adds a new input package for collecting Kubernetes metrics from the Kubelet API using the OpenTelemetry kubeletstatsreceiver.
Features:
- Collects node, pod, container, and volume metrics
- Multiple authentication methods (serviceAccount, TLS, kubeConfig, none)
- Configurable metric groups and collection intervals
- Support for extra metadata labels
- Network interface collection configuration
Closes #17140
Co-authored-by: Kyle Pollich <kpollich@users.noreply.github.com>
Pinging @elastic/ecosystem (Team:Ecosystem)
@claude please do a thorough review -- pay careful attention to the receiver upstream and to our other input packages
Claude finished @kpollich's task in 3m 5s
Review Complete
I've submitted a detailed review of this PR. Key findings:
✅ Looks Good:
💡 Non-blocking Suggestions:
The package is well-implemented and ready for merge. See the full review for details.
PR Review: Kubelet Stats OpenTelemetry Input Package
Overall this is a well-structured package that follows the existing patterns in the repository. The package passes elastic-package check validation. I've compared it against the upstream kubeletstatsreceiver and other input packages in this repo. Below are my findings.
🟢 Looks Good
- Package structure follows the established pattern (LICENSE.txt, manifest.yml, docs, _dev/test/policy, agent/input, img)
- Configuration coverage is comprehensive - supports all four auth types (none, tls, serviceAccount, kubeConfig)
- Handlebars template correctly handles conditional rendering for optional fields
- Documentation is thorough with troubleshooting tips and RBAC examples
- Test policy validates the expected output structure
🟡 Suggestions (Non-blocking)
1. insecure_skip_verify conditional may output false when not wanted
In input.yml.hbs:16-18, the current pattern:
Since insecure_skip_verify defaults to false in manifest.yml, this conditional will be falsy and won't render. This is actually the desired behavior! However, if a user explicitly sets it to true, it will render correctly. Just noting this is working as expected.
2. Consider using type: duration for interval fields
The nginx_input_otel package uses type: duration for collection_interval and initial_delay, which provides validation. This package uses type: text. Either approach works, but duration type provides additional validation.
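To illustrate the kind of check a duration type makes possible, here is a minimal Python sketch. It assumes Go-style duration strings such as 10s or 1m30s, which is the format the OpenTelemetry Collector parses for interval settings. The helper name is_valid_duration is hypothetical, and the validation actually applied to type: duration fields may be stricter or looser than this.

```python
import re

# One or more <number><unit> pairs, e.g. "10s", "1m30s", "1.5h".
# Assumption: positive Go-style durations only; this is an illustration,
# not the validation code used for `type: duration` manifest fields.
_DURATION_RE = re.compile(r"(?:\d+(?:\.\d+)?(?:ns|us|ms|s|m|h))+")


def is_valid_duration(value: str) -> bool:
    """Return True if value looks like a positive Go-style duration string."""
    return _DURATION_RE.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["10s", "1m30s", "10", "ten seconds"]:
        print(f"{candidate!r}: {is_valid_duration(candidate)}")
```

With type: text, a value such as "10" or "ten seconds" would only be rejected later, when the collector tries to parse the rendered configuration.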
manifest.yml:39-45:
- name: collection_interval
  type: text # Could be: type: duration
  ...
3. RBAC documentation may need nodes/proxy permission
According to upstream docs, when using extra_metadata_labels (like container.id) or node utilization metrics, the RBAC requires access to nodes/proxy in addition to nodes/stats. Consider expanding the RBAC example in docs/README.md:
rules:
- apiGroups: [""]
  resources: ["nodes/stats", "nodes/proxy"] # nodes/proxy needed for extra_metadata_labels
  verbs: ["get"]
4. Missing tls auth type in k8s_api_config options
The main auth_type field has four options (none, tls, serviceAccount, kubeConfig), but k8s_api_config_auth_type only has three options (missing tls). Based on upstream docs, the K8s API config supports the same auth types.
manifest.yml:119-131:
- name: k8s_api_config_auth_type
  type: select
  options:
    - value: none
      text: None
    - value: serviceAccount
      text: Service Account
    - value: kubeConfig
      text: Kube Config
    # Consider adding:
    # - value: tls
    #   text: TLS
5. Endpoint default may need adjustment for auth_type: none
The default endpoint is https://localhost:10250 (secure port), but the description mentions port 10255 is used for auth_type: none (read-only). Users selecting none auth would need to manually change the endpoint and protocol to http://localhost:10255.
Consider adding guidance in the description or providing a more explicit warning in the documentation.
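The mismatch described above can be made concrete with a small Python sketch. The two endpoints and the four auth types come from this review; the function name suggested_endpoint and the idea of deriving the endpoint from the auth type are only illustrative assumptions, not something the package does.

```python
# Endpoints quoted in the review comment above.
SECURE_ENDPOINT = "https://localhost:10250"    # default in the manifest
READ_ONLY_ENDPOINT = "http://localhost:10255"  # read-only port used with auth_type: none

# Auth types offered by the package's auth_type field.
AUTH_TYPES = ("none", "tls", "serviceAccount", "kubeConfig")


def suggested_endpoint(auth_type: str) -> str:
    """Return the endpoint a user would typically need for the given auth type.

    Hypothetical helper: it only documents the port/protocol difference.
    """
    if auth_type not in AUTH_TYPES:
        raise ValueError(f"unknown auth_type: {auth_type!r}")
    return READ_ONLY_ENDPOINT if auth_type == "none" else SECURE_ENDPOINT


if __name__ == "__main__":
    for auth_type in AUTH_TYPES:
        print(auth_type, "->", suggested_endpoint(auth_type))
```

Only the none case differs from the manifest default, which is why a short note in the endpoint description would probably be enough.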
⚪ Minor Observations
- The logo SVG looks like a Kubernetes helm logo - which is appropriate for kubelet stats
- Changelog properly links to the originating issue #17140
- The _dev/build/docs/README.md and docs/README.md files are identical (expected for packages without template variables)
Summary
This is a solid contribution that correctly implements the kubeletstats receiver. The suggestions above are all non-blocking improvements. The package is ready for merge from my perspective, though you may want to address the tls auth type omission in k8s_api_config if upstream supports it.
🤖 Review performed by Claude
This is a well-structured new OTel input package for Kubelet Stats. The package follows the established patterns for input packages in this repository.
The template correctly handles the complex conditional logic for network interface collection and TLS configuration. The policy test validates the default configuration correctly.
One minor documentation improvement suggested (LOW severity, non-blocking).
This PR adds a well-structured OpenTelemetry input package for collecting Kubelet stats. The implementation follows established patterns used by other OTel input packages in this repository.
The package correctly:
- Defines the kubeletstatsreceiver configuration with appropriate variables
- Supports multiple authentication methods (serviceAccount, TLS, kubeConfig, none)
- Includes proper policy tests
- Provides comprehensive documentation
Two low-severity suggestions are provided as inline comments:
- The endpoint description could be clarified regarding the port/protocol differences
- The build template should include the generatedHeader directive for consistency
CI is passing and the package builds successfully.
- Changed `collection_interval` type from text to duration and updated default value from 10s to 20s.
- Updated `initial_delay` type from text to duration.
- Enhanced `auth_type` description to clarify behavior when set to 'none'.
- Added 'TLS' option to the authentication type selection.
- Expanded README to include details on required RBAC permissions and usage of extra metadata labels.
These changes improve the clarity and functionality of the Kubelet Stats receiver configuration.
…onfiguration
- Updated README to include details on Kubernetes API authentication options and node utilization metrics.
- Modified input template to support optional context for kubeConfig authentication.
- Added new configuration options for Kubernetes API CA file, client certificate, private key, and TLS verification settings.
- Introduced metrics for container and pod CPU and memory utilization as part of node metrics.
These changes improve the usability and flexibility of the Kubelet Stats receiver configuration.
The package has a build issue that causes elastic-package lint to fail:
Primary Issue: The docs/README.md file is out of sync with the _dev/build/docs/README.md template. It appears that the docs/README.md was manually edited with additional content (extra metadata labels section, Kubernetes API configuration, node utilization metrics, kubeConfig authentication example) instead of being generated from the template.
Resolution: Either:
- Copy the additional documentation content from docs/README.md to _dev/build/docs/README.md, then run elastic-package build to regenerate docs/README.md, OR
- Delete docs/README.md and run elastic-package build to regenerate it from the template (if the simpler template documentation is acceptable)
After resolving, verify with elastic-package lint that the package passes validation.
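As a rough pre-check before running the lint, the two README files can be compared directly. The Python sketch below assumes the build template contains no template directives, so the generated file should match it byte for byte; the helper name readme_in_sync is hypothetical, and elastic-package remains the authoritative check because it actually renders the template.

```python
from pathlib import Path


def readme_in_sync(package_dir: str) -> bool:
    """Return True if docs/README.md matches _dev/build/docs/README.md.

    Assumption: the template has no directives to render, so a plain
    text comparison is meaningful. Otherwise rely on elastic-package.
    """
    root = Path(package_dir)
    template = root / "_dev" / "build" / "docs" / "README.md"
    generated = root / "docs" / "README.md"
    return template.read_text(encoding="utf-8") == generated.read_text(encoding="utf-8")
```

Calling readme_in_sync("packages/kubeletstats_input_otel") from the repository root would report whether the manually edited file has drifted from the template.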
@claude Please give this PR another thorough review.
Claude finished @kpollich's task in 1m 41s
Reviewing PR
I've submitted my review requesting changes. The main issue is that …
The package structure and configuration coverage are good, but there is a blocking issue that must be fixed before merge:
Must Fix:
The docs/README.md file is out of sync with _dev/build/docs/README.md. The docs file contains additional content (extra metadata labels section, Kubernetes API configuration, node utilization metrics documentation) that was manually added instead of being added to the build template. This causes elastic-package lint to fail.
To resolve: Copy the additional documentation content from docs/README.md to _dev/build/docs/README.md, then run elastic-package build to regenerate docs/README.md from the template.
Minor Observation:
The package has been updated to use type: duration for interval fields (which is good), and the manifest now includes the tls option in the k8s_api_config_auth_type select field. These improvements from earlier feedback have been addressed.
The package follows established patterns for OTel input packages and is well-structured. The template conditionals for the optional metrics block (lines 64-104 in input.yml.hbs) are complex but functionally correct. Configuration options appropriately mirror the upstream kubeletstatsreceiver options. No security, correctness, or significant quality issues identified.
This PR adds a well-structured OpenTelemetry input package for collecting Kubernetes metrics from the Kubelet API.
The package follows established patterns for OTel input packages in this repository:
- Correct manifest structure with appropriate variables for authentication (serviceAccount, TLS, kubeConfig, none)
- Proper Handlebars template for generating OTel collector configuration
- Policy tests that validate the expected output
- Comprehensive documentation covering configuration, RBAC requirements, and troubleshooting
No blocking issues identified.
The code passes linting and builds successfully. The complex conditional logic in the input template for optional metrics is correctly implemented, though somewhat verbose. The TLS configuration follows the upstream kubeletstatsreceiver format.
@claude can you look at all the review comments and resolve them if they're already taken care of?
Claude finished @strawgate's task in 1m 24s
Checking Review Comments
Summary: Resolved 3 threads that were already addressed:
5 threads remain unresolved (not yet addressed):
Would it be possible to add a system test here?
Looking at the current kubernetes integration package, it seems that some data streams there test the kubelet too:
https://github.com/elastic/integrations/blob/a6223f4e9fb317405acf5fdc8e04b88941ce5594/packages/kubernetes/_dev/build/docs/kubelet.md
For context, these tests spin up a K8s cluster via kind.
If I understand that package correctly, this would mean adding the _dev/deploy/k8s folder with just an .empty file (to force use of the k8s deployer). A system test would then need to be added as well.
This would be the system test for that data stream in the kubernetes integration package: https://github.com/elastic/integrations/blob/a6223f4e9fb317405acf5fdc8e04b88941ce5594/packages/kubernetes/data_stream/container/_dev/test/system/test-default-config.yml
Reading the documentation of the receiver, it looks like some resources must be created in order to receive these stats metrics:
The Kubernetes resources needed can be defined in YAML files under the _dev/deploy/k8s/ folder.
Claude finished @kpollich's task in 8m 43s
Adding System Tests for Kubelet Stats Package
Summary:
The tests have been verified to pass locally via …
Commit: e27647f
This adds system tests that run against a Kubernetes cluster (via kind) to validate that the package can collect kubelet stats metrics.
Changes:
- Add _dev/deploy/k8s/.empty to enable k8s deployer
- Add _dev/test/system/test-default-config.yml with test configuration
Co-authored-by: Kyle Pollich <kpollich@users.noreply.github.com>
@claude Please create detailed manual testing instructions for the …
Claude finished @kpollich's task in 2m 39s
Creating Manual Testing Instructions
Summary: I've added comprehensive manual testing instructions to the PR description. The instructions cover:
The instructions should enable a reviewer to follow along and verify that the package correctly ingests kubelet stats metrics from a local Kubernetes cluster.
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Updated the sample_event JSON file and ensured that at least 50 documents are ingested into Elasticsearch in system tests.
The test file test-k8s-api-config.expected has a component ID mismatch that will cause policy tests to fail. The processor is defined as transform/componentid-1 but the pipeline (line 62) references transform/componentid-0.
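A mismatch of this kind can be caught mechanically. The Python sketch below takes a collector-style configuration that has already been parsed into a dict and reports pipeline entries that reference an undefined component. The function name undefined_pipeline_components and the example configuration are hypothetical; only the two transform component IDs come from the comment above.

```python
def undefined_pipeline_components(config: dict) -> list:
    """Return (pipeline, section, component_id) tuples for undefined references."""
    # Connectors may appear as a receiver or an exporter in a pipeline.
    connectors = set(config.get("connectors") or {})
    problems = []
    pipelines = (config.get("service") or {}).get("pipelines") or {}
    for pipeline_name, pipeline in pipelines.items():
        for section in ("receivers", "processors", "exporters"):
            defined = set(config.get(section) or {})
            if section != "processors":
                defined |= connectors
            for component_id in pipeline.get(section) or []:
                if component_id not in defined:
                    problems.append((pipeline_name, section, component_id))
    return problems


# Hypothetical config reproducing the mismatch: the processor is defined
# as componentid-1 but the pipeline references componentid-0.
example = {
    "receivers": {"kubeletstats/componentid-0": {}},
    "processors": {"transform/componentid-1": {}},
    "exporters": {"debug": {}},
    "service": {
        "pipelines": {
            "metrics": {
                "receivers": ["kubeletstats/componentid-0"],
                "processors": ["transform/componentid-0"],
                "exporters": ["debug"],
            }
        }
    },
}

print(undefined_pipeline_components(example))
# → [('metrics', 'processors', 'transform/componentid-0')]
```

An empty list means every pipeline reference resolves to a defined component.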
Force-push: 0510501 to 7118e4c
This is a well-structured input package that follows established patterns from similar OTel packages in the repository.
Unresolved Discussion Items:
Several open discussion threads need author/reviewer decisions before merge:
- Endpoint default value (line 36-38): The current default https://localhost:10250 requires users to always override it. Consider using an environment variable default like https://${env:K8S_NODE_IP}:10250 per the discussion thread, which would work out-of-the-box in typical Kubernetes deployments.
- metric_groups default (line 100-108): While @mrodm noted upstream defaults cover this, having explicit defaults in the manifest improves discoverability and documentation.
- metrics YAML field (line 145-160): The field was added for advanced configuration. Awaiting decision on whether to keep it in the initial release.
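To show why an environment-variable default for the endpoint would work without manual overrides, here is a simplified Python sketch of how a ${env:NAME} placeholder resolves to a per-node value. It is not the collector's implementation: the function name expand_env_refs is hypothetical, the IP address in the usage note is made up, and the real expansion supports more syntax than this.

```python
import re

# Matches placeholders of the form ${env:NAME}.
_ENV_REF = re.compile(r"\$\{env:([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env_refs(value: str, environ: dict) -> str:
    """Replace each ${env:NAME} in value with environ[NAME].

    Simplified illustration: an unset variable raises KeyError instead of
    following whatever behavior the collector implements.
    """
    def _lookup(match):
        name = match.group(1)
        if name not in environ:
            raise KeyError(f"environment variable {name} is not set")
        return environ[name]

    return _ENV_REF.sub(_lookup, value)
```

For example, expand_env_refs("https://${env:K8S_NODE_IP}:10250", {"K8S_NODE_IP": "10.0.0.5"}) returns "https://10.0.0.5:10250", so each agent in a DaemonSet would target the kubelet on its own node as long as the variable is injected into the pod.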
No blocking technical issues were identified in this review.
💚 Build Succeeded
Package kubeletstats_input_otel - 0.1.0 containing this change is available at https://epr.elastic.co/package/kubeletstats_input_otel/0.1.0/
…17159) Add new input package for collecting Kubernetes metrics from the Kubelet API using the OpenTelemetry kubeletstatsreceiver.
Proposed commit message
This PR adds a new input package for collecting Kubernetes metrics from the Kubelet API using the OpenTelemetry kubeletstatsreceiver.
Summary
Checklist
- … changelog.yml file.
- I have verified that any added dashboard complies with Kibana's Dashboard good practices
Related issues
Screenshots
How to test this PR locally
These instructions guide you through manually testing the kubeletstats_otel input package. You will set up a local Kubernetes cluster, install the package via Fleet, and verify that kubelet stats metrics are being ingested.
Prerequisites
Ensure you have the following installed:
Docker - For running kind and the Elastic Stack
kind - Kubernetes in Docker (install guide)
kubectl - Kubernetes CLI (install guide)
elastic-package - For building and serving the package locally
Step 1: Start the Elastic Stack
Start a local Elastic Stack using elastic-package:
The stack will be accessible at:
- … elastic/changeme
Step 2: Build and Install the Package
Build and install the kubeletstats_otel package into the local Kibana:
Step 3: Create a Kubernetes Cluster with kind
Create a kind cluster that will provide the Kubelet API for metrics collection:
Step 4: Connect the Kubernetes cluster (kind) to the Elastic Stack
Connect the control-plane container to the Docker network where the Elastic Stack is running.
NOTE: If you have run system tests previously, the docker network connect command could fail.
Step 5: Deploy Elastic Agent in the Kubernetes Cluster
5a. Create the Agent Policy in Fleet
- … elastic/changeme)
- … kubeletstats-test-policy and click Create agent policy
5b. Add the Kubelet Stats Integration
- … https://${env.NODE_IP}:10250 (Kubelet secure port)
- … Service Account (recommended for in-cluster)
- … 10s (for faster feedback during testing)
5c. Create RBAC for Elastic Agent
Create the necessary RBAC permissions for the Elastic Agent to access kubelet stats:
NOTE: Most of these permissions are already granted by the manifest provided by Kibana. Only the get permission for nodes/proxy is missing.
5d. Deploy Elastic Agent as a DaemonSet
For quick testing, you can modify the manifest to:
- … elastic-agent service account created above
- … NODE_IP environment variable from the downward API
The CA used by elastic-package needs to be added to the manifest provided by Kibana.
To do that, include the following definitions:
Step 6: Verify Data Ingestion
6a. Check Agent Status
In Kibana Fleet, verify the agent is healthy:
6b. Query for Kubelet Stats Metrics
In Kibana Dev Tools, run the following query to verify metrics are being ingested:
You should see metrics like:
- k8s.node.cpu.utilization
- k8s.node.memory.usage
- k8s.pod.cpu.utilization
- k8s.pod.memory.usage
- k8s.container.cpu.utilization
- k8s.container.memory.usage
6c. Explore in Discover
- … metrics-*
Step 7: Run Automated System Tests
You can also run the automated system tests included with the package:
Cleanup
Troubleshooting
Agent not enrolling:
- … docker logs elastic-package-stack-fleet-server-1
No metrics appearing:
- … kubectl logs -l app=elastic-agent
- … kubectl auth can-i get nodes/stats --as=system:serviceaccount:default:elastic-agent
- … insecure_skip_verify is enabled (kind uses self-signed certs)
Connection refused to Kubelet:
- … NODE_IP environment variable is correctly set
Generated with Claude Code