
feat: add node join reverter#6

Merged
louiseschmidtgen merged 4 commits into main from KU-4853/join-revert
Jan 29, 2026

@louiseschmidtgen louiseschmidtgen commented Jan 27, 2026

Contributor

Description

This PR adds an in-memory reverter for membership state.

If a node fails at any point during the join process, membership is reverted according to where the failure occurred:

  • any failure of the join logic triggers the microcluster's remove logic
  • etcd membership
  • Kubernetes membership
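To illustrate the idea, here is a minimal sketch of an in-memory reverter, not the code from this PR. The `Reverter` type, the `join` helper, and the step names and their order are hypothetical; the sketch only assumes that each step registers its cleanup after it succeeds and that cleanups run in reverse order on failure. It uses `errors.Join`, so it needs Go 1.20 or later.

```go
package main

import (
	"errors"
	"fmt"
)

// Reverter is a hypothetical in-memory stack of cleanup functions.
type Reverter struct {
	steps []func() error
}

// Add registers a cleanup to run if a later step fails.
func (r *Reverter) Add(fn func() error) { r.steps = append(r.steps, fn) }

// Revert runs all registered cleanups in reverse order and collects their errors.
func (r *Reverter) Revert() error {
	var errs []error
	for i := len(r.steps) - 1; i >= 0; i-- {
		if err := r.steps[i](); err != nil {
			errs = append(errs, err)
		}
	}
	r.steps = nil
	return errors.Join(errs...)
}

// Success drops all cleanups once the whole join has completed.
func (r *Reverter) Success() { r.steps = nil }

// join is a stand-in for a join hook. The step names and their order are
// assumptions for illustration. A cleanup is registered only after its step
// has succeeded, so only completed steps are reverted.
func join(failAt string, calls *[]string) error {
	r := &Reverter{}
	for _, s := range []string{"datastore", "etcd-membership", "kubernetes-node"} {
		step := s
		if step == failAt {
			// Simulated failure: undo everything registered so far.
			return errors.Join(fmt.Errorf("join step %q failed", step), r.Revert())
		}
		r.Add(func() error {
			*calls = append(*calls, "revert "+step)
			return nil
		})
	}
	r.Success()
	return nil
}

func main() {
	var calls []string
	err := join("kubernetes-node", &calls)
	fmt.Println(err)
	fmt.Println(calls)
}
```

In this sketch a failure at the last step reverts the etcd membership first and the datastore second, while a successful join runs no cleanup at all.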

Limitation:

Cleaning up etcd is only safe if the cluster has at least 3 nodes. Removing 1 etcd node from a 2-node configuration would lose quorum. In that case, admins will need to clean up the broken configuration manually.
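The arithmetic behind this limitation can be sketched as follows. The function names are hypothetical and the guard is an assumption about how such a check might look, not the PR's code. The only facts relied on are that a majority quorum of n members is n/2 + 1 (integer division) and that the threshold named here is 3 nodes.

```go
package main

import "fmt"

// minMembersForRemoval is the threshold named in the limitation above.
const minMembersForRemoval = 3

// quorum returns the majority needed among n voting members.
func quorum(n int) int { return n/2 + 1 }

// safeToRemoveMember is a hypothetical guard: automatic removal is only
// attempted when the cluster has at least minMembersForRemoval members.
func safeToRemoveMember(n int) bool { return n >= minMembersForRemoval }

func main() {
	for n := 1; n <= 5; n++ {
		fmt.Printf("members=%d quorum=%d autoRemove=%v\n", n, quorum(n), safeToRemoveMember(n))
	}
}
```

With 2 members the quorum is also 2, so both members must be healthy for any change to be committed, which is why the sketch refuses automatic removal below 3 members.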

The unit tests do not cover reverting etcd membership because no suitable etcd client mock is available. Creating one is beyond the scope of this PR. Covering this path would need a full integration test with failure injection.
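For reference, one possible shape for such a mock is a narrow interface with a hand-written fake, sketched below. This is not part of the PR and does not use the real etcd client API: `memberRemover`, `fakeEtcd`, `revertEtcdMembership`, and `errInjected` are all hypothetical names introduced for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// errInjected is a hypothetical error used to simulate a failing removal.
var errInjected = errors.New("injected removal failure")

// memberRemover is a hypothetical narrow interface covering only what a
// membership reverter would need. It is not the real etcd client API.
type memberRemover interface {
	MemberCount() (int, error)
	RemoveMember(name string) error
}

// fakeEtcd is an in-memory fake with optional failure injection.
type fakeEtcd struct {
	members   map[string]bool
	removeErr error
}

func (f *fakeEtcd) MemberCount() (int, error) { return len(f.members), nil }

func (f *fakeEtcd) RemoveMember(name string) error {
	if f.removeErr != nil {
		return f.removeErr
	}
	if !f.members[name] {
		return fmt.Errorf("member %q not found", name)
	}
	delete(f.members, name)
	return nil
}

// revertEtcdMembership removes the named member, but skips removal when the
// cluster has fewer than 3 members (the quorum guard is an assumption here).
func revertEtcdMembership(c memberRemover, name string) error {
	n, err := c.MemberCount()
	if err != nil {
		return err
	}
	if n < 3 {
		return nil // leave the cluster untouched; manual clean-up is required
	}
	return c.RemoveMember(name)
}

func main() {
	f := &fakeEtcd{members: map[string]bool{"a": true, "b": true, "c": true}}
	fmt.Println(revertEtcdMembership(f, "c"), len(f.members))
}
```

A fake like this only exercises the reverter's own branching. It says nothing about how a real etcd cluster behaves under failure, so it would complement an integration test rather than replace it.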

Backport

1.32, 1.33, 1.34, 1.35

Reference

Closes canonical/k8s-snap#2309 since k8sd was moved out of the k8s-snap.

Checklist

  • PR title formatted as type: title
  • Covered by unit tests
  • Covered by integration tests
  • Documentation updated
  • CLA signed
  • Backport label added if necessary

Copilot AI review requested due to automatic review settings January 27, 2026 12:00

Copilot AI left a comment


Pull request overview

This PR adds an in-memory reverter mechanism for cleaning up membership state when a node join operation fails. The implementation provides automatic rollback of three key components: k8s-dqlite state, etcd membership, and Kubernetes Node objects.

Changes:

  • Added three reverter registration functions that clean up datastore state, etcd membership, and Kubernetes node objects on join failure
  • Integrated reverters into the onPostJoin hook with proper ordering after each setup step
  • Included quorum protection for etcd cleanup (only removes member if cluster has 3+ nodes)
  • Added comprehensive unit tests covering success and failure scenarios

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| pkg/k8sd/app/hooks_join.go | Implements three reverter registration functions and integrates them into the onPostJoin flow at appropriate stages |
| pkg/k8sd/app/hooks_join_reverter_test.go | Adds unit tests for all three reverter types, covering both success and failure scenarios |


Comment thread pkg/k8sd/app/hooks_join_test.go
Comment thread pkg/k8sd/app/hooks_join.go Outdated

Copilot AI commented Jan 27, 2026

Contributor

@louiseschmidtgen I've opened a new pull request, #7, to work on those changes. Once the pull request is ready, I'll request review from you.

Signed-off-by: Louise K. Schmidtgen <louise.schmidtgen@canonical.com>
Signed-off-by: Louise K. Schmidtgen <louise.schmidtgen@canonical.com>
Signed-off-by: Louise K. Schmidtgen <louise.schmidtgen@canonical.com>
Comment thread pkg/k8sd/app/hooks_join.go Outdated
Comment thread pkg/k8sd/app/hooks_join_test.go
Comment thread pkg/k8sd/app/hooks_join_reverter_test.go Outdated
Comment thread pkg/k8sd/app/hooks_join.go Outdated
Signed-off-by: Louise K. Schmidtgen <louise.schmidtgen@canonical.com>
@louiseschmidtgen louiseschmidtgen merged commit 24d02f5 into main Jan 29, 2026
4 of 5 checks passed
@louiseschmidtgen louiseschmidtgen deleted the KU-4853/join-revert branch January 29, 2026 06:03

4 participants