
[Bug]: pg_stat_archiver metrics not reset after demotion (former primary still reports stale archive age) #9101

@tickmake

Description


Is there an existing issue already for this bug?

  • I have searched for an existing issue, and could not find anything. I believe this is a new bug.

I have read the troubleshooting guide

  • I have read the troubleshooting guide and I think this is a new bug.

I am running a supported version of CloudNativePG

  • I am running a supported version of CloudNativePG.

Contact Details

sunil4356@gmail.com

Version

1.27 (latest patch)

What version of Kubernetes are you using?

1.33

What is your Kubernetes environment?

Self-managed: RKE

How did you install the operator?

YAML manifest

What happened?

Summary

When a PostgreSQL pod in a CloudNativePG cluster is promoted to primary, it correctly performs WAL archiving and exposes the related metrics (e.g., cnpg_pg_stat_archiver_seconds_since_last_archival).
However, when that same pod is demoted back to a replica, the archiver statistics (pg_stat_archiver) retain the stale values recorded while it was still primary.

This results in:

  • False alerts (e.g., "last archive age > 7 minutes") continuing to fire on replicas.

  • Misleading observability, as standby nodes appear to have stale archiver activity.


Environment

Component | Version
--- | ---
CloudNativePG Operator | 1.27.1
PostgreSQL | 17.6
barman-cloud.cloudnative-pg.io | 0.7.0
Backup target | MinIO bucket
Kubernetes | 1.33
Cluster topology | 3-node HA (1 primary + 2 replicas)
Monitoring | Prometheus + CNPG metrics exporter

Steps to Reproduce

  1. Deploy a 3-node CNPG cluster with backup configured via barman cloud plugin (MinIO).

  2. Observe normal WAL archiving metrics on the current primary:

    SELECT last_archived_wal, last_archived_time FROM pg_stat_archiver;
  3. Perform a manual switchover:

    kubectl cnpg promote postgres-cluster1 --target <replica-pod>
  4. The original primary becomes a replica.

  5. Observe on the demoted pod:

    cnpg_pg_stat_archiver_seconds_since_last_archival{pod="<demoted-pod>"}

    → The value continues increasing (e.g. several hours/days), even though the node is no longer archiving.


Actual Behavior

  • The demoted pod still exposes stale values in pg_stat_archiver:

    • last_archived_wal

    • last_archived_time

    • seconds_since_last_archival

  • Prometheus alerts continue firing for these pods.

Example alert rule:

    sum by (pod) (
      cnpg_pg_stat_archiver_seconds_since_last_archival{
        namespace="postgres-cluster1",
        pod=~"postgres-cluster1-[0-9]+$"
      }
    ) > 600

Expected Behavior

When a pod transitions from primary → replica, CloudNativePG should:

  • Automatically clear the pg_stat_archiver statistics (pg_stat_reset_shared('archiver')), or

  • Suppress the pg_stat_archiver_* metrics entirely for replicas.

Only the current primary should report active archiver metrics.


Supporting Evidence

  • PostgreSQL itself does not reset pg_stat_archiver automatically upon demotion.

  • Running manually fixes the issue:

    SELECT pg_stat_reset_shared('archiver');
  • Restarting the pod also resets the stats.

  • Related issue: CNPG #6544 — “WAL cleanup inconsistency on replica nodes after cluster role changes.”


Suggested Fixes

  1. Operator behavior change
    On role transition (primary → replica), CNPG could execute:

    SELECT pg_stat_reset_shared('archiver');

    as part of the demotion sequence, ensuring archiver metrics are cleared immediately.

  2. Exporter filtering
    Modify the CNPG metrics exporter to exclude archiver metrics unless role="primary".
    This is low risk and resolves most monitoring noise without touching PostgreSQL internals.

  3. Documentation update
    Mention that pg_stat_archiver metrics are only relevant for the current primary and may appear stale on replicas following failover.


Workarounds

  • Manual reset:

    SELECT pg_stat_reset_shared('archiver');
  • Prometheus rule adjustment:
    Add a role filter:

    cnpg_pg_stat_archiver_seconds_since_last_archival{role="primary"} > 600
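Until one of the suggested fixes lands, the role filter can be packaged as a full Prometheus alerting rule. The sketch below is illustrative (group, alert, and label names are made up), and it assumes a `role` label is present on the series, e.g. via PodMonitor relabeling of the `cnpg.io/instanceRole` pod label:

```yaml
groups:
  - name: cnpg-archiver
    rules:
      - alert: CNPGArchiveLagging
        # Only the current primary archives WAL, so restrict the alert
        # to it; demoted pods with stale stats no longer match.
        expr: |
          cnpg_pg_stat_archiver_seconds_since_last_archival{role="primary"} > 600
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "WAL archiving is lagging on the current primary"
```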

Impact

  • False positive alerts on replica pods after switchover/failover.

  • Misleading monitoring dashboards showing stale archiver activity.

  • Reduced operational clarity for teams monitoring WAL-archiving SLA compliance.


💡 Proposed Implementation

Option A – Operator-level cleanup (preferred):

  • Detect role transition event (Primary → Replica) in the CNPG controller.

  • Execute a lightweight SQL command on the demoted instance:

    SELECT pg_stat_reset_shared('archiver');
  • This can be done as part of the post-demotion reconciliation logic (where archive_mode becomes off).
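The transition check in Option A can be sketched as follows. This is a hypothetical illustration only: the real CNPG reconciler is written in Go and its API differs, and `execute_sql` here simply stands in for a connection to the demoted instance.

```python
# Statement the operator would run on a freshly demoted instance
# to clear the stale archiver statistics.
ARCHIVER_RESET_SQL = "SELECT pg_stat_reset_shared('archiver');"


def should_reset_archiver_stats(old_role: str, new_role: str) -> bool:
    """Return True only for a primary -> replica demotion, the one
    transition where stale pg_stat_archiver data needs clearing."""
    return old_role == "primary" and new_role == "replica"


def on_role_transition(old_role: str, new_role: str, execute_sql) -> bool:
    """Hypothetical post-demotion reconciliation hook: issue the reset
    on the demoted instance and report whether anything was done."""
    if should_reset_archiver_stats(old_role, new_role):
        execute_sql(ARCHIVER_RESET_SQL)
        return True
    return False
```

Gating on the exact primary → replica transition keeps the reset out of unrelated reconciliation passes (e.g. replica → replica restarts).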

Option B – Exporter-level filtering:

  • Enhance cnpg-metrics-exporter to include a condition:

    if role != "primary": skip pg_stat_archiver metrics
  • Ensures no seconds_since_last_archival or similar metrics are emitted for standbys.
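The filtering condition in Option B amounts to the following sketch (again hypothetical; the actual exporter is Go, and the metric-prefix constant here is an assumption based on the names in this report):

```python
# Assumed common prefix of the archiver metric family.
ARCHIVER_PREFIX = "cnpg_pg_stat_archiver_"


def filter_metrics(metrics: dict, role: str) -> dict:
    """Drop archiver metrics unless this instance is the primary,
    so standbys never emit stale seconds_since_last_archival values."""
    if role == "primary":
        return dict(metrics)
    return {
        name: value
        for name, value in metrics.items()
        if not name.startswith(ARCHIVER_PREFIX)
    }
```

On a replica, the archiver series simply disappear instead of going stale, which is what role-aware dashboards and alerts expect.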

Option C – Combined approach:

  • Apply Operator-level reset for correctness.

  • Apply Exporter-level filtering for observability hygiene.

This ensures that metrics, dashboards, and alerts all stay consistent and role-aware without manual intervention.

Cluster resource

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:  
  name: postgres-cluster1
  namespace: postgres-cluster1
spec:
  affinity:
    enablePodAntiAffinity: true
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: cnpg.postgres-cluster1
                operator: In
                values:
                  - 'true'
    podAntiAffinityType: preferred
    tolerations:
      - effect: NoSchedule
        key: dedicated
        operator: Equal
        value: postgres-cluster1
  backup:
    target: prefer-standby
  bootstrap:
    initdb:
      database: app
      encoding: UTF8
      localeCType: C
      localeCollate: C
      owner: app
  enablePDB: true
  enableSuperuserAccess: true
  failoverDelay: 0
  imageName: ghcr.io/cloudnative-pg/postgresql:17.6
  instances: 3
  logLevel: info
  maxSyncReplicas: 0
  minSyncReplicas: 0
  monitoring:
    customQueriesConfigMap:
      - key: queries
        name: cnpg-default-monitoring
    disableDefaultQueries: false
    enablePodMonitor: true
  plugins:
    - enabled: true
      isWALArchiver: true
      name: barman-cloud.cloudnative-pg.io
      parameters:
        barmanObjectName: prod-s3-creds
  postgresGID: 26
  postgresUID: 26
  postgresql:
    parameters:
      archive_mode: 'on'
      archive_timeout: 5min
      dynamic_shared_memory_type: posix
      full_page_writes: 'on'
      log_destination: csvlog
      log_directory: /controller/log
      log_filename: postgres
      log_rotation_age: '0'
      log_rotation_size: '0'
      log_truncate_on_rotation: 'false'
      logging_collector: 'on'
      max_parallel_workers: '32'
      max_replication_slots: '32'
      max_worker_processes: '32'
      shared_memory_type: mmap
      shared_preload_libraries: ''
      ssl_max_protocol_version: TLSv1.3
      ssl_min_protocol_version: TLSv1.3
      wal_keep_size: 512MB
      wal_level: logical
      wal_log_hints: 'on'
      wal_receiver_timeout: 5s
      wal_sender_timeout: 5s
    syncReplicaElectionConstraint:
      enabled: false
  primaryUpdateMethod: switchover
  primaryUpdateStrategy: unsupervised
  probes:
    liveness:
      isolationCheck:
        connectionTimeout: 1000
        enabled: true
        requestTimeout: 1000
  replicationSlots:
    highAvailability:
      enabled: true
      slotPrefix: _cnpg_
    synchronizeReplicas:
      enabled: true
    updateInterval: 30
  resources: {}
  smartShutdownTimeout: 180
  startDelay: 3600
  stopDelay: 1800
  storage:
    resizeInUseVolumes: true
    size: 1Ti
    storageClass: local-storage
  superuserSecret:
    name: postgres-cluster1-superuser
  switchoverDelay: 3600

Relevant log output

Code of Conduct

  • I agree to follow this project's Code of Conduct

Metadata

Labels: triage, Pending triage

Status: Done