Describe the enhancement:
The system.process dataset doesn't work on systems using cgroupv2. The dataset should detect cgroupv2 and read CPU usage/quota/period from the right files to collect the same values it does for cgroupv1. At a minimum, we need:
- CPU quota (from cpu.cfs_quota_us in v1, or parsed from cpu.max in v2)
- CPU period (from cpu.cfs_period_us in v1, or parsed from cpu.max in v2)
- CPU shares/weight (from cpu.shares in v1, or cpu.weight in v2)
In cgroupv1, cpu and cpuacct are separate controllers (though often mounted together as cpu,cpuacct). You read CPU usage from the cpuacct controller at /sys/fs/cgroup/cpu,cpuacct/<cgroup>/cpuacct.usage (nanoseconds).
Limits are in the cpu controller at cpu.cfs_quota_us and cpu.cfs_period_us (cpu.cfs_quota_us is -1 when unlimited). When there are no limits, proportional allocation uses cpu.shares (default 1024). The metricset currently does this and it works fine.
In cgroupv2, the controllers are unified into a single hierarchy under /sys/fs/cgroup/<cgroup>/. The cpu and cpuacct controllers are combined, so CPU usage is now in cpu.stat as usage_usec (note: microseconds, not nanoseconds) instead of the separate cpuacct.usage file.
Limits are in cpu.max as a space-separated "quota period" pair like "50000 100000" (the quota is the literal "max" when unlimited, e.g. "max 100000"). When there are no limits, proportional allocation uses cpu.weight (default 100, range 1-10000) instead of cpu.shares.
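To make the v2 file formats concrete, here is a rough sketch in Go of how the two files could be parsed. The type and function names (CPULimit, parseCPUMax, parseUsageNanos) are made up for illustration and are not taken from the Beats codebase. Converting usage_usec to nanoseconds is an assumption about how the value might be lined up with the existing v1 cpuacct.usage value.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// CPULimit is a hypothetical holder for the values parsed from cpu.max.
// Unlimited is true when the quota field is the literal "max".
type CPULimit struct {
	QuotaUs   uint64
	PeriodUs  uint64
	Unlimited bool
}

// parseCPUMax parses cpu.max content such as "50000 100000" or "max 100000".
func parseCPUMax(content string) (CPULimit, error) {
	fields := strings.Fields(content)
	if len(fields) != 2 {
		return CPULimit{}, fmt.Errorf("unexpected cpu.max content: %q", content)
	}
	period, err := strconv.ParseUint(fields[1], 10, 64)
	if err != nil {
		return CPULimit{}, fmt.Errorf("parsing period: %w", err)
	}
	if fields[0] == "max" {
		return CPULimit{PeriodUs: period, Unlimited: true}, nil
	}
	quota, err := strconv.ParseUint(fields[0], 10, 64)
	if err != nil {
		return CPULimit{}, fmt.Errorf("parsing quota: %w", err)
	}
	return CPULimit{QuotaUs: quota, PeriodUs: period}, nil
}

// parseUsageNanos finds the usage_usec line in cpu.stat content and returns
// the value converted from microseconds to nanoseconds (assumed here so it
// is comparable with v1 cpuacct.usage).
func parseUsageNanos(content string) (uint64, error) {
	scanner := bufio.NewScanner(strings.NewReader(content))
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) == 2 && fields[0] == "usage_usec" {
			usec, err := strconv.ParseUint(fields[1], 10, 64)
			if err != nil {
				return 0, fmt.Errorf("parsing usage_usec: %w", err)
			}
			return usec * 1000, nil
		}
	}
	return 0, fmt.Errorf("usage_usec not found in cpu.stat content")
}

func main() {
	limit, err := parseCPUMax("50000 100000\n")
	fmt.Println(limit, err)

	usage, err := parseUsageNanos("usage_usec 1500\nuser_usec 1000\nsystem_usec 500\n")
	fmt.Println(usage, err)
}
```

Both functions take the file content as a string rather than a path, so they can be exercised without a real cgroup mount.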
Describe a specific use case for the enhancement or feature:
Right now, system.process.cgroup.cpuacct.total.pct and the quota, period, and shares fields are not collected on cgroupv2 systems. This makes it impossible to observe CPU utilization for processes using cgroup v2. We can't do any capacity planning or resource optimization without these metrics. At a minimum, we need CPU quota, CPU period, and CPU shares/weight to be collected.
It would be great if the metricset could detect cgroupv2 and read from the right files (e.g. cpu.stat and cpu.max) to collect these values. Which fields the values should live in appears to be covered by an existing issue: #26871