Commit f02e312

monitoring: add 2 nvmeof alerts to prometheus_alerts.yaml
- `NVMeoFMissingListener`: trigger if all listeners are not created for each gateway in a subsystem
- `NVMeoFZeroListenerSubsystem`: trigger if a subsystem has no listeners

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
1 parent 505f1a6 commit f02e312

1 file changed

Lines changed: 18 additions & 0 deletions

monitoring/ceph-mixin/prometheus_alerts.yml

@@ -837,6 +837,24 @@ groups:
         labels:
           severity: "warning"
           type: "ceph_default"
+      - alert: "NVMeoFMissingListener"
+        annotations:
+          description: "For every subsystem, each gateway should have a listener to balance traffic between gateways."
+          summary: "No listener added for {{ $labels.instance }} NVMe-oF Gateway to {{ $labels.nqn }} subsystem"
+        expr: "ceph_nvmeof_subsystem_listener_count == 0 and on(nqn) sum(ceph_nvmeof_subsystem_listener_count) by (nqn) > 0"
+        for: "10m"
+        labels:
+          severity: "warning"
+          type: "ceph_default"
+      - alert: "NVMeoFZeroListenerSubsystem"
+        annotations:
+          description: "NVMeoF gateway configuration incomplete; one of the subsystems have zero listeners."
+          summary: "No listeners added to {{ $labels.nqn }} subsystem"
+        expr: "sum(ceph_nvmeof_subsystem_listener_count) by (nqn) == 0"
+        for: "10m"
+        labels:
+          severity: "warning"
+          type: "ceph_default"
       - alert: "NVMeoFHighHostCPU"
         annotations:
           description: "High CPU on a gateway host can lead to CPU contention and performance degradation"
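
The two `expr` lines in this diff split the "gateway without a listener" case by whether the subsystem has any listener at all. As a rough illustration only (not part of the commit), the Python sketch below mimics that selection logic on made-up samples. The gateway names, NQNs, and the `evaluate` helper are hypothetical, and the `for: "10m"` hold period is not modeled.

```python
from collections import defaultdict

# Hypothetical samples of ceph_nvmeof_subsystem_listener_count:
# (gateway instance, subsystem NQN, listener count). All values are made up.
samples = [
    ("gw1", "nqn.2016-06.io.spdk:cnode1", 1),
    ("gw2", "nqn.2016-06.io.spdk:cnode1", 0),
    ("gw1", "nqn.2016-06.io.spdk:cnode2", 0),
    ("gw2", "nqn.2016-06.io.spdk:cnode2", 0),
]


def evaluate(samples):
    """Return (missing_listener, zero_listener) matches for one evaluation."""
    # Equivalent of: sum(ceph_nvmeof_subsystem_listener_count) by (nqn)
    totals = defaultdict(int)
    for _, nqn, count in samples:
        totals[nqn] += count

    # NVMeoFMissingListener: the per-gateway count is 0 AND, matched on nqn,
    # the subsystem-wide sum is greater than 0.
    missing = sorted(
        (instance, nqn)
        for instance, nqn, count in samples
        if count == 0 and totals[nqn] > 0
    )

    # NVMeoFZeroListenerSubsystem: the subsystem-wide sum is exactly 0.
    zero = sorted(nqn for nqn, total in totals.items() if total == 0)
    return missing, zero


missing_listener, zero_listener = evaluate(samples)
print("NVMeoFMissingListener:", missing_listener)
print("NVMeoFZeroListenerSubsystem:", zero_listener)
```

In this sketch `gw2` is reported for `cnode1` because `gw1` already has a listener there, while `cnode2` is reported once at the subsystem level instead of once per gateway. That is the effect of the `and on(nqn) ... > 0` clause: a subsystem with no listeners at all matches only the second alert.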