We have found some opportunities for optimization in the dashboards:
- **Top CPU intensive pods** takes the max of `kubernetes.container.cpu.usage.core.ns`, then applies a derivative aggregation over it and keeps the positive values. We can simply use `kubernetes.container.cpu.usage.node.pct` instead and group by the pod name.
- The same applies to **Top Memory intensive pods**.
- **CPU Usage by node** sums all CPU usage nanocores per container, then uses a Painless script to normalise it to the metricset period and groups by the node name. Instead, we can use the node metric `kubernetes.node.cpu.usage.nanocores` and divide it by `kubernetes.node.cpu.allocatable.cores`. The same approach is used in the Metrics UI.
- The same applies to **Memory Usage by node**: we can divide `kubernetes.node.memory.usage.bytes` by `kubernetes.node.memory.allocatable.bytes`.
- The same approach applies to **network in and out bytes**.
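To make the proposed change concrete, here is a minimal sketch of the two simplified aggregation bodies as Python dicts. It assumes the standard Metricbeat `kubernetes.*` field names from the list above; the bucket sizes, the `avg` metric choice, and the exact `bucket_script` shape are illustrative assumptions, not the final visualisation config.

```python
def top_cpu_pods_agg(size=10):
    """Top CPU-intensive pods: group by pod name and average the
    pre-computed kubernetes.container.cpu.usage.node.pct metric,
    replacing the max + derivative pipeline over cumulative core.ns."""
    return {
        "size": 0,
        "aggs": {
            "pods": {
                "terms": {
                    "field": "kubernetes.pod.name",
                    "size": size,  # number of top pods to show (assumption)
                    "order": {"cpu_pct": "desc"},
                },
                "aggs": {
                    "cpu_pct": {
                        "avg": {"field": "kubernetes.container.cpu.usage.node.pct"}
                    }
                },
            }
        },
    }


def cpu_usage_by_node_agg():
    """CPU usage per node: divide node usage nanocores by allocatable
    cores (times 1e9 to convert cores to nanocores) in a bucket_script,
    replacing the per-container sum + Painless period normalisation."""
    return {
        "size": 0,
        "aggs": {
            "nodes": {
                "terms": {"field": "kubernetes.node.name"},
                "aggs": {
                    "usage_ns": {
                        "avg": {"field": "kubernetes.node.cpu.usage.nanocores"}
                    },
                    "allocatable": {
                        "avg": {"field": "kubernetes.node.cpu.allocatable.cores"}
                    },
                    "usage_pct": {
                        "bucket_script": {
                            "buckets_path": {
                                "usage": "usage_ns",
                                "alloc": "allocatable",
                            },
                            # cores -> nanocores before dividing
                            "script": "params.usage / (params.alloc * 1e9)",
                        }
                    },
                },
            }
        },
    }
```

The memory and network variants follow the same shape, swapping in the corresponding usage and allocatable fields.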
I tested this by creating a separate dashboard with all of those visualisations optimised: the loading time for a 24h range decreased from 1 minute 10 seconds down to 30 seconds.