cri: Add background stats collector to calculate UsageNanoCores#12629
Conversation
b7f4169 to
cd3878c
Compare
|
@akhilerm @mikebrow please review when you get a chance. I have not yet turned on the feature flags to allow this to be tested as that needs kubernetes/kubernetes#135604 to land first after k8s code freeze. Once you are happy we can land this one to prepare for it (chicken/egg!) |
There was a problem hiding this comment.
Very nice! see comments..
once we have "switched" to timed for a container.. do we need to/should we flush the original old method ContainerStats by setting it back to nil
this might be pretty expensive to collect and stage. there is a risk that ticker gets backed up due to locks in container store list and callouts to task metrics for the set of all k8s.io container ids....
3bbc717 to
7606b6d
Compare
|
cc @fuweid |
c60bf6c to
4373a80
Compare
4373a80 to
0113ff9
Compare
akhilerm
left a comment
There was a problem hiding this comment.
Got a few questions and comments.
80338c0 to
d9a2c14
Compare
|
@akhilerm fixed up all the things you pointed out. please peek again. |
adds a background stats collector that calculates `UsageNanoCores` for containers and pod sandboxes. - run in the background every second to collect CPU metrics for all containers and sandboxes (similar to what cAdvisor does) - keep a rolling buffer of CPU samples and calculates the instantaneous CPU usage rate from consecutive samples - read pod-level CPU stats from the parent cgroup rather than the pause container - add cgroupv2 Pressure Stall Information for CPU, memory, and IO - add missing `Timestamp` and `Interfaces` fields when Kubernetes runs with `PodAndContainerStatsFromCRI=true`, it expects `UsageNanoCores` to be set in stats responses. This value represents how much CPU is being used right now (as opposed to `UsageCoreNanoSeconds` which is cumulative). To calculate it, we need to compare CPU samples over time to replicate what is in cadvisor. we can't yet really test this in CI as some changes in kubernetes has to land for `--feature-gates=PodAndContainerStatsFromCRI=true` Signed-off-by: Davanum Srinivas <davanum@gmail.com>
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
Remove unnecessary variable extraction and Interfaces field, keeping only the Timestamp addition as originally intended. Signed-off-by: Davanum Srinivas <davanum@gmail.com>
d9a2c14 to
28f7511
Compare
|
I also afraid it doesn't work on kata-containers |
adds a background stats collector that calculates
UsageNanoCoresfor containers and pod sandboxes.TimestampandInterfacesfieldswhen Kubernetes runs with
PodAndContainerStatsFromCRI=true, it expectsUsageNanoCoresto be set in stats responses. This value represents how much CPU is being used right now (as opposed toUsageCoreNanoSecondswhich is cumulative). To calculate it, we need to compare CPU samples over time to replicate what is in cadvisor.we can't yet really test this in CI as some changes in kubernetes has to land for
--feature-gates=PodAndContainerStatsFromCRI=truePS: see #12620 where we tested against a forked branch to get things working