Skip to content

cri: Add background stats collector to calculate UsageNanoCores#12629

Merged
mikebrow merged 4 commits into
containerd:mainfrom
dims:updates-for-better-cri-resource-metrics
Dec 9, 2025
Merged

cri: Add background stats collector to calculate UsageNanoCores#12629
mikebrow merged 4 commits into
containerd:mainfrom
dims:updates-for-better-cri-resource-metrics

Conversation

@dims

@dims dims commented Dec 5, 2025

Copy link
Copy Markdown
Member

adds a background stats collector that calculates UsageNanoCores for containers and pod sandboxes.

  • run in the background every second to collect CPU metrics for all containers and sandboxes (similar to what cAdvisor does)
  • keep a rolling buffer of CPU samples and calculates the instantaneous CPU usage rate from consecutive samples
  • read pod-level CPU stats from the parent cgroup rather than the pause container
  • add cgroupv2 Pressure Stall Information for CPU, memory, and IO
  • add missing Timestamp and Interfaces fields

when Kubernetes runs with PodAndContainerStatsFromCRI=true, it expects UsageNanoCores to be set in stats responses. This value represents how much CPU is being used right now (as opposed to UsageCoreNanoSeconds which is cumulative). To calculate it, we need to compare CPU samples over time to replicate what is in cadvisor.

we can't yet really test this in CI as some changes in kubernetes has to land for --feature-gates=PodAndContainerStatsFromCRI=true

PS: see #12620 where we tested against a forked branch to get things working

Add background stats collector to calculate UsageNanoCores for containers and pod sandboxes

@github-project-automation github-project-automation Bot moved this to Needs Triage in Pull Request Review Dec 5, 2025
@dosubot dosubot Bot added area/cri Container Runtime Interface (CRI) kind/feature labels Dec 5, 2025
@dims

dims commented Dec 5, 2025

Copy link
Copy Markdown
Member Author

cc @mikebrow @akhilerm

@dims

dims commented Dec 5, 2025

Copy link
Copy Markdown
Member Author

xref: kubernetes/kubernetes#135604

@dims dims force-pushed the updates-for-better-cri-resource-metrics branch 2 times, most recently from b7f4169 to cd3878c Compare December 5, 2025 13:30
@dims

dims commented Dec 5, 2025

Copy link
Copy Markdown
Member Author

@akhilerm @mikebrow please review when you get a chance. I have not yet turned on the feature flags to allow this to be tested as that needs kubernetes/kubernetes#135604 to land first after k8s code freeze. Once you are happy we can land this one to prepare for it (chicken/egg!)

@mikebrow mikebrow left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice! see comments..

once we have "switched" to timed for a container.. do we need to/should we flush the original old method ContainerStats by setting it back to nil

this might be pretty expensive to collect and stage. there is a risk that ticker gets backed up due to locks in container store list and callouts to task metrics for the set of all k8s.io container ids....

Comment thread internal/cri/server/stats_collector.go
Comment thread internal/cri/server/stats_collector.go
Comment thread internal/cri/server/container_stats_list.go
Comment thread internal/cri/server/stats_collector.go
Comment thread internal/cri/server/stats_collector.go
Comment thread internal/cri/server/service.go Outdated
Comment thread internal/cri/server/container_remove.go Outdated
Comment thread internal/cri/server/container_create.go Outdated
Comment thread internal/cri/config/config.go Outdated
@dims dims force-pushed the updates-for-better-cri-resource-metrics branch from 3bbc717 to 7606b6d Compare December 5, 2025 19:55
@mikebrow

mikebrow commented Dec 5, 2025

Copy link
Copy Markdown
Member

cc @fuweid

@dims dims force-pushed the updates-for-better-cri-resource-metrics branch 2 times, most recently from c60bf6c to 4373a80 Compare December 6, 2025 13:01
Comment thread .github/workflows/node-e2e.yml
Comment thread .github/workflows/node-e2e.yml
Comment thread .github/workflows/node-e2e.yml

@mikebrow mikebrow left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
needs rebase to pickup fix for the fuzzer setup

@dims dims force-pushed the updates-for-better-cri-resource-metrics branch from 4373a80 to 0113ff9 Compare December 6, 2025 22:00

@akhilerm akhilerm left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got a few questions and comments.

Comment thread docs/cri/config.md Outdated
Comment thread .github/workflows/node-e2e.yml
Comment thread internal/cri/server/sandbox_stats_linux.go Outdated
Comment thread internal/cri/server/stats_collector.go Outdated
@dims dims force-pushed the updates-for-better-cri-resource-metrics branch 2 times, most recently from 80338c0 to d9a2c14 Compare December 8, 2025 16:28
@dims

dims commented Dec 8, 2025

Copy link
Copy Markdown
Member Author

@akhilerm fixed up all the things you pointed out. please peek again.

dims added 4 commits December 8, 2025 16:03
adds a background stats collector that calculates `UsageNanoCores` for containers and pod sandboxes.

- run in the background every second to collect CPU metrics for all containers and sandboxes (similar to what cAdvisor does)
- keep a rolling buffer of CPU samples and calculates the instantaneous CPU usage rate from consecutive samples
- read pod-level CPU stats from the parent cgroup rather than the pause container
- add cgroupv2 Pressure Stall Information for CPU, memory, and IO
- add missing `Timestamp` and `Interfaces` fields

when Kubernetes runs with `PodAndContainerStatsFromCRI=true`, it expects `UsageNanoCores` to be set in stats responses.
This value represents how much CPU is being used right now (as opposed to `UsageCoreNanoSeconds` which is cumulative).
To calculate it, we need to compare CPU samples over time to replicate what is in cadvisor.

we can't yet really test this in CI as some changes in kubernetes has to land for `--feature-gates=PodAndContainerStatsFromCRI=true`

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
Remove unnecessary variable extraction and Interfaces field,
keeping only the Timestamp addition as originally intended.

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
@dims dims force-pushed the updates-for-better-cri-resource-metrics branch from d9a2c14 to 28f7511 Compare December 8, 2025 21:03
@github-project-automation github-project-automation Bot moved this from Needs Triage to Review In Progress in Pull Request Review Dec 9, 2025
@mikebrow mikebrow added this pull request to the merge queue Dec 9, 2025
@dims

dims commented Dec 9, 2025

Copy link
Copy Markdown
Member Author

thanks @mikebrow @akhilerm

Merged via the queue into containerd:main with commit f9b7482 Dec 9, 2025
90 of 92 checks passed
@github-project-automation github-project-automation Bot moved this from Review In Progress to Done in Pull Request Review Dec 9, 2025
@ningmingxiao

ningmingxiao commented Jan 26, 2026

Copy link
Copy Markdown
Contributor

how about add some config in containerd ? Since k8s doesn't support PodAndContainerStatsFromCRI=true now I want to add a config to disable it to reduce cpu/memory use @mikebrow @dims

@ningmingxiao

Copy link
Copy Markdown
Contributor

I also afraid it doesn't work on kata-containers

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

6 participants