Description
Is there an existing issue already for this bug?
- I have searched for an existing issue, and could not find anything. I believe this is a new bug.
I have read the troubleshooting guide
- I have read the troubleshooting guide and I think this is a new bug.
I am running a supported version of CloudNativePG
- I am running a supported version of CloudNativePG.
Contact Details
Version
1.27 (latest patch)
What version of Kubernetes are you using?
1.33
What is your Kubernetes environment?
Self-managed: RKE
How did you install the operator?
YAML manifest
What happened?
Summary
When a PostgreSQL pod in a CloudNativePG cluster is promoted to primary, it correctly performs WAL archiving and exposes related metrics (e.g., cnpg_pg_stat_archiver_seconds_since_last_archival).
However, when that same pod is demoted back to a replica, the archiver statistics (pg_stat_archiver) remain populated with old data.
This results in:

- False alerts (e.g., "last archive age > 7 minutes") continuing to fire on replicas.
- Misleading observability, as standby nodes appear to have stale archiver activity.
Environment
Steps to Reproduce
1. Deploy a 3-node CNPG cluster with backup configured via the Barman Cloud plugin (MinIO).
2. Observe normal WAL archiving metrics on the current primary:

       SELECT last_archived_wal, last_archived_time FROM pg_stat_archiver;

3. Perform a manual switchover:

       kubectl cnpg promote postgres-cluster1 --target <replica-pod>

4. The original primary becomes a replica.
5. Observe on the demoted pod that cnpg_pg_stat_archiver_seconds_since_last_archival{pod="<demoted-pod>"} keeps increasing (e.g., to several hours or days), even though the node is no longer archiving.
Actual Behavior
- The demoted pod still exposes stale values in pg_stat_archiver:
  - last_archived_wal
  - last_archived_time
  - seconds_since_last_archival
- Prometheus alerts continue firing for these pods.

Example alert rule:

    sum by (pod)(
      cnpg_pg_stat_archiver_seconds_since_last_archival{
        namespace="postgres-cluster1",
        pod=~"postgres-cluster1-[0-9]+$"
      }
    ) > 600
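For illustration, the way a frozen last_archived_time keeps the derived metric growing can be sketched as follows (a simplified stand-in for the exporter's calculation, not its actual code):

```python
from datetime import datetime, timedelta, timezone

def seconds_since_last_archival(last_archived_time, now):
    # Simplified stand-in for how the metric is derived from
    # pg_stat_archiver.last_archived_time: age of the last archive, in seconds.
    return (now - last_archived_time).total_seconds()

# On the demoted pod, last_archived_time stays frozen at its pre-switchover
# value, so the derived metric grows without bound.
switchover = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
one_hour_later = switchover + timedelta(hours=1)

age = seconds_since_last_archival(switchover, one_hour_later)
assert age == 3600.0  # already far past the 600 s alert threshold
```

Since the stale row is never updated on a standby, the value crosses any fixed threshold eventually, which is exactly why the alert above cannot stop firing on its own.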
Expected Behavior
When a pod transitions from primary → replica, CloudNativePG should:
- Automatically clear the pg_stat_archiver statistics (pg_stat_reset_shared('archiver')), or
- Suppress the pg_stat_archiver_* metrics entirely for replicas.

Only the current primary should report active archiver metrics.
Supporting Evidence
- PostgreSQL itself does not reset pg_stat_archiver automatically upon demotion.
- Running the reset manually fixes the issue:

      SELECT pg_stat_reset_shared('archiver');

- Restarting the pod also resets the stats.
- Related issue: CNPG #6544 — "WAL cleanup inconsistency on replica nodes after cluster role changes."
Suggested Fixes
- Operator behavior change:
  On role transition (primary → replica), CNPG could execute SELECT pg_stat_reset_shared('archiver'); as part of the demotion sequence, ensuring archiver metrics are cleared immediately.
- Exporter filtering:
  Modify the CNPG metrics exporter to exclude archiver metrics unless role="primary".
  This is low risk and resolves most monitoring noise without touching PostgreSQL internals.
- Documentation update:
  Mention that pg_stat_archiver metrics are only relevant for the current primary and may appear stale on replicas following failover.
Workarounds
- Manual reset:

      SELECT pg_stat_reset_shared('archiver');

- Prometheus rule adjustment (add a role filter):

      cnpg_pg_stat_archiver_seconds_since_last_archival{role="primary"} > 600
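For example, the adjusted rule could be packaged as a PrometheusRule. This sketch assumes a role label is attached to the series (for instance via relabeling from the pod's cnpg.io/instanceRole label), which may need to be configured separately:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: cnpg-archiver-alerts   # illustrative name
spec:
  groups:
    - name: cnpg-archiver
      rules:
        - alert: CNPGLastArchiveTooOld
          expr: |
            cnpg_pg_stat_archiver_seconds_since_last_archival{role="primary"} > 600
          for: 5m
```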
Impact
- False-positive alerts on replica pods after switchover/failover.
- Misleading monitoring dashboards showing stale archiver activity.
- Reduced operational clarity for teams monitoring WAL archiving SLA compliance.
💡 Proposed Implementation
Option A – Operator-level cleanup (preferred):

- Detect the role transition event (primary → replica) in the CNPG controller.
- Execute a lightweight SQL command on the demoted instance:

      SELECT pg_stat_reset_shared('archiver');

- This can be done as part of the post-demotion reconciliation logic (where archive_mode becomes off).
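As a rough sketch of Option A, the demotion hook could be modeled like this (pure illustration; on_role_transition and the role strings are hypothetical names, not existing CNPG controller code):

```python
def on_role_transition(old_role, new_role, execute_sql):
    # Hypothetical post-demotion hook (not current CNPG code): clear the
    # shared archiver statistics when an instance is demoted to replica.
    if old_role == "primary" and new_role == "replica":
        execute_sql("SELECT pg_stat_reset_shared('archiver');")

# Exercise the hook with a recording stub in place of a real connection:
issued = []
on_role_transition("primary", "replica", issued.append)
assert issued == ["SELECT pg_stat_reset_shared('archiver');"]

on_role_transition("replica", "replica", issued.append)
assert len(issued) == 1  # no reset unless a demotion actually happened
```

The guard on the old role matters: the reset should run once, at demotion time, rather than on every reconciliation of a standby.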
Option B – Exporter-level filtering:

- Enhance cnpg-metrics-exporter to include a condition:

      if role != "primary": skip pg_stat_archiver metrics

- Ensures no seconds_since_last_archival or similar metrics are emitted for standbys.
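A minimal sketch of that exporter-side filter, assuming the exporter knows the instance role at scrape time (filter_metrics and the scraped dictionary are illustrative, not the exporter's actual internals):

```python
ARCHIVER_PREFIX = "cnpg_pg_stat_archiver_"

def filter_metrics(metrics, role):
    # Hypothetical exporter-side filter: emit archiver metrics only for the
    # current primary; standbys drop every cnpg_pg_stat_archiver_* series.
    if role == "primary":
        return dict(metrics)
    return {name: value for name, value in metrics.items()
            if not name.startswith(ARCHIVER_PREFIX)}

scraped = {
    "cnpg_pg_stat_archiver_seconds_since_last_archival": 86400.0,
    "cnpg_pg_replication_lag": 0.2,  # illustrative second metric
}
assert "cnpg_pg_stat_archiver_seconds_since_last_archival" not in filter_metrics(scraped, "replica")
assert filter_metrics(scraped, "primary") == scraped
```

Dropping the series entirely (rather than exporting a zero) keeps PromQL aggregations honest: an absent series cannot trip a `> 600` threshold.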
Option C – Combined approach:

- Apply the operator-level reset for correctness.
- Apply the exporter-level filtering for observability hygiene.
This ensures that metrics, dashboards, and alerts all stay consistent and role-aware without manual intervention.
Cluster resource
```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: postgres-cluster1
  namespace: postgres-cluster1
spec:
  affinity:
    enablePodAntiAffinity: true
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: cnpg.postgres-cluster1
                operator: In
                values:
                  - 'true'
    podAntiAffinityType: preferred
    tolerations:
      - effect: NoSchedule
        key: dedicated
        operator: Equal
        value: postgres-cluster1
  backup:
    target: prefer-standby
  bootstrap:
    initdb:
      database: app
      encoding: UTF8
      localeCType: C
      localeCollate: C
      owner: app
  enablePDB: true
  enableSuperuserAccess: true
  failoverDelay: 0
  imageName: ghcr.io/cloudnative-pg/postgresql:17.6
  instances: 3
  logLevel: info
  maxSyncReplicas: 0
  minSyncReplicas: 0
  monitoring:
    customQueriesConfigMap:
      - key: queries
        name: cnpg-default-monitoring
    disableDefaultQueries: false
    enablePodMonitor: true
  plugins:
    - enabled: true
      isWALArchiver: true
      name: barman-cloud.cloudnative-pg.io
      parameters:
        barmanObjectName: prod-s3-creds
  postgresGID: 26
  postgresUID: 26
  postgresql:
    parameters:
      archive_mode: 'on'
      archive_timeout: 5min
      dynamic_shared_memory_type: posix
      full_page_writes: 'on'
      log_destination: csvlog
      log_directory: /controller/log
      log_filename: postgres
      log_rotation_age: '0'
      log_rotation_size: '0'
      log_truncate_on_rotation: 'false'
      logging_collector: 'on'
      max_parallel_workers: '32'
      max_replication_slots: '32'
      max_worker_processes: '32'
      shared_memory_type: mmap
      shared_preload_libraries: ''
      ssl_max_protocol_version: TLSv1.3
      ssl_min_protocol_version: TLSv1.3
      wal_keep_size: 512MB
      wal_level: logical
      wal_log_hints: 'on'
      wal_receiver_timeout: 5s
      wal_sender_timeout: 5s
    syncReplicaElectionConstraint:
      enabled: false
  primaryUpdateMethod: switchover
  primaryUpdateStrategy: unsupervised
  probes:
    liveness:
      isolationCheck:
        connectionTimeout: 1000
        enabled: true
        requestTimeout: 1000
  replicationSlots:
    highAvailability:
      enabled: true
      slotPrefix: _cnpg_
    synchronizeReplicas:
      enabled: true
    updateInterval: 30
  resources: {}
  smartShutdownTimeout: 180
  startDelay: 3600
  stopDelay: 1800
  storage:
    resizeInUseVolumes: true
    size: 1Ti
    storageClass: local-storage
  superuserSecret:
    name: postgres-cluster1-superuser
  switchoverDelay: 3600
```

Relevant log output
Code of Conduct
- I agree to follow this project's Code of Conduct