
feat: "unrecoverable" annotation #8178

Merged
mnencia merged 4 commits into cloudnative-pg:main from leonardoce:unrecoverable
Oct 8, 2025

Conversation

@leonardoce
Contributor

@leonardoce leonardoce commented Jul 29, 2025

This patch introduces the annotation alpha.cnpg.io/unrecoverable=true, which can be added to any replica instance Pod.
When this annotation is present, the operator will permanently delete the associated instance by removing both the PVCs and the Pod itself. Following this, the operator will recreate the missing instance from the primary instance.

Please note that this annotation is available only for replica instances.
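
To make the rule above concrete, here is a minimal sketch in Python of the check it describes. It is not the operator's actual code: the label name `cnpg.io/instanceRole` and both function names are assumptions made for illustration, and only the annotation key and value come from the description.

```python
from typing import Mapping

# Annotation key and value taken from the PR description.
UNRECOVERABLE_ANNOTATION = "alpha.cnpg.io/unrecoverable"
# Assumption: the label that carries the instance role on each Pod.
ROLE_LABEL = "cnpg.io/instanceRole"


def annotation_patch() -> dict:
    """Build a merge-patch body that would set the annotation on a Pod."""
    return {"metadata": {"annotations": {UNRECOVERABLE_ANNOTATION: "true"}}}


def should_recreate(annotations: Mapping[str, str], labels: Mapping[str, str]) -> bool:
    """Return True only for a replica Pod that is marked as unrecoverable."""
    marked = annotations.get(UNRECOVERABLE_ANNOTATION) == "true"
    is_replica = labels.get(ROLE_LABEL) == "replica"
    return marked and is_replica
```

In this sketch an annotated primary is simply ignored, which matches the note that the annotation is available only for replica instances.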

@leonardoce leonardoce requested a review from a team as a code owner July 29, 2025 15:39
@cnpg-bot cnpg-bot added the backport-requested ◀️ (This pull request should be backported to all supported releases), release-1.22, release-1.25, and release-1.26 labels Jul 29, 2025
@github-actions
Contributor

❗ By default, the pull request is configured to backport to all release branches.

  • To stop backporting this pr, remove the label: backport-requested ◀️ or add the label 'do not backport'
  • To stop backporting this pr to a certain release branch, remove the specific branch label: release-x.y

@dosubot dosubot bot added the size:L (This PR changes 100-499 lines, ignoring generated files.) and enhancement 🪄 (New feature or request) labels Jul 29, 2025
@leonardoce leonardoce force-pushed the unrecoverable branch 2 times, most recently from d77b05b to a344a30 Compare July 29, 2025 15:49
@leonardoce leonardoce marked this pull request as draft July 29, 2025 16:05
@leonardoce
Contributor Author

leonardoce commented Jul 29, 2025

This patch is marked as a draft because it still lacks E2E tests and the documentation needs improvement.

@leonardoce leonardoce force-pushed the unrecoverable branch 4 times, most recently from 583f6cf to 4cda8d4 Compare July 30, 2025 08:48
@leonardoce leonardoce marked this pull request as ready for review July 30, 2025 08:48
@leonardoce
Contributor Author

I added a basic E2E test.

@dosubot dosubot bot added the E2E tests (E2E tests tickets for easy triage on release process) label Jul 30, 2025
@leonardoce
Contributor Author

/test limit=local

@github-actions
Contributor

@leonardoce, here's the link to the E2E on CNPG workflow run: https://github.com/cloudnative-pg/cloudnative-pg/actions/runs/16617908831

@cnpg-bot cnpg-bot added the ok to merge 👌 (This PR can be merged) label Jul 30, 2025
@armru armru force-pushed the unrecoverable branch 2 times, most recently from 8aa893e to 9673609 Compare July 31, 2025 10:11
@mnencia
Member

mnencia commented Aug 25, 2025

I find it a bit counterintuitive to put the annotation on the Pod to trigger deletion of the Pod and all its PVCs. The intent here is clearly storage/instance-scoped, while a Pod is ephemeral. If PVC deletion fails and the Pod disappears for any reason, the request is gone. I’d really prefer to see the trigger live on something more persistent (like the PVC or the Cluster spec/status) so the action is durable and auditable.

@NedAnd1

NedAnd1 commented Sep 4, 2025

Should the relevant issue which this PR was intended for be linked to this PR?
It has an outstanding discussion: #7566

@mnencia mnencia force-pushed the unrecoverable branch 5 times, most recently from 52096ab to 8b8f7f2 Compare September 23, 2025 14:25
@mnencia
Member

mnencia commented Sep 23, 2025

/test

@github-actions
Contributor

@mnencia, here's the link to the E2E on CNPG workflow run: https://github.com/cloudnative-pg/cloudnative-pg/actions/runs/17951525849

@mnencia
Member

mnencia commented Sep 24, 2025

I spoke with @leonardoce, and we agreed that the patch is safe, provided we change the order of the delete operations. Specifically, I changed the patch to issue the Pod deletion after the PVC deletions. I will merge once the end-to-end (E2E) tests have passed.
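
The ordering described in this comment can be sketched as follows, in Python. This is an illustration, not the operator's code: `DeleteOp`, `plan_deletion`, and `run_plan` are hypothetical names, and the reading that the ordering keeps the annotated Pod around when a PVC deletion fails is an inference from the earlier review comment.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class DeleteOp:
    kind: str  # "PersistentVolumeClaim" or "Pod"
    name: str


def plan_deletion(pod_name: str, pvc_names: List[str]) -> List[DeleteOp]:
    """Order the deletions so that every PVC goes before the Pod."""
    ops = [DeleteOp("PersistentVolumeClaim", name) for name in pvc_names]
    ops.append(DeleteOp("Pod", pod_name))
    return ops


def run_plan(ops: List[DeleteOp], delete: Callable[[DeleteOp], None]) -> List[DeleteOp]:
    """Apply the operations in order and stop at the first failure.

    Returns the operations that completed. Because the Pod comes last,
    a failed PVC deletion leaves the annotated Pod in place, so the
    request can be picked up again later.
    """
    done: List[DeleteOp] = []
    for op in ops:
        try:
            delete(op)
        except RuntimeError:
            break
        done.append(op)
    return done
```

With the opposite order, a failure while deleting a PVC would leave storage behind after the Pod, and the annotation on it, had already gone.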

@NedAnd1

NedAnd1 commented Sep 24, 2025

@mnencia @leonardoce
When pods are drained from one node and re-created on another (via kubectl drain), the volumes backing those pods on local storage are reprovisioned on the new node with new, empty underlying storage.
Our CSI driver, which provides these local volumes, reprovisions a volume when it notices, based on volume annotations, that the volume has moved to a new node; it does not interact directly with application pods.

We've successfully tested an alternative approach that fixes the observed CNPG crash loop within CNPG itself.
It avoids the complex question of how an external, general-purpose component should apply a CNPG-specific annotation to CNPG-specific pods when that component doesn't interact with external pods in the first place.

@mnencia
Member

mnencia commented Oct 7, 2025

/test

@github-actions
Contributor

github-actions bot commented Oct 7, 2025

@mnencia, here's the link to the E2E on CNPG workflow run: https://github.com/cloudnative-pg/cloudnative-pg/actions/runs/18319013805

leonardoce and others added 4 commits October 8, 2025 10:33
This patch makes the operator look for the "alpha.cnpg.io/unrecoverable=true"
annotation on the PG instance Pods.
If such an annotation is detected, the operator will destroy the PVCs
and the Pod itself, then recreate the instance from the other instances.

This annotation is available only for replica instances.

Signed-off-by: Leonardo Cecchi <leonardo.cecchi@enterprisedb.com>
Signed-off-by: Armando Ruocco <armando.ruocco@enterprisedb.com>
Signed-off-by: Gabriele Bartolini <gabriele.bartolini@enterprisedb.com>
Signed-off-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
@mnencia mnencia merged commit 3cc2844 into cloudnative-pg:main Oct 8, 2025
31 checks passed
phisco pushed a commit to phisco/cloudnative-pg that referenced this pull request Oct 9, 2025
This patch introduces the annotation `alpha.cnpg.io/unrecoverable=true`,
which can be added to any replica instance Pod.
When this annotation is present, the operator will permanently delete
the associated instance by removing both the PVCs and the Pod itself.
Following this, the operator will recreate the missing instance from the
primary instance.

Please note that this annotation is available only for replica
instances.

Signed-off-by: Leonardo Cecchi <leonardo.cecchi@enterprisedb.com>
Signed-off-by: Armando Ruocco <armando.ruocco@enterprisedb.com>
Signed-off-by: Gabriele Bartolini <gabriele.bartolini@enterprisedb.com>
Signed-off-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
Co-authored-by: Armando Ruocco <armando.ruocco@enterprisedb.com>
Co-authored-by: Gabriele Bartolini <gabriele.bartolini@enterprisedb.com>
Co-authored-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
THE-BRAHMA pushed a commit to THE-BRAHMA/cloudnative-pg that referenced this pull request Oct 30, 2025
This patch introduces the annotation `alpha.cnpg.io/unrecoverable=true`,
which can be added to any replica instance Pod.
When this annotation is present, the operator will permanently delete
the associated instance by removing both the PVCs and the Pod itself.
Following this, the operator will recreate the missing instance from the
primary instance.

Please note that this annotation is available only for replica
instances.

Signed-off-by: Leonardo Cecchi <leonardo.cecchi@enterprisedb.com>
Signed-off-by: Armando Ruocco <armando.ruocco@enterprisedb.com>
Signed-off-by: Gabriele Bartolini <gabriele.bartolini@enterprisedb.com>
Signed-off-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
Co-authored-by: Armando Ruocco <armando.ruocco@enterprisedb.com>
Co-authored-by: Gabriele Bartolini <gabriele.bartolini@enterprisedb.com>
Co-authored-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
Signed-off-by: theBrahma <office.utpal.brahma@gmail.com>
@jmealo
Contributor

jmealo commented Dec 8, 2025

What's the difference between kubectl cnpg destroy and this annotation?


Labels

  • do not backport: This PR must not be backported - it will be in the next minor release
  • E2E tests: E2E tests tickets for easy triage on release process
  • enhancement 🪄: New feature or request
  • lgtm: This PR has been approved by a maintainer
  • no-issue
  • ok to merge 👌: This PR can be merged
  • size:L: This PR changes 100-499 lines, ignoring generated files.


8 participants