
fix(instance,scaleup): avoid using snapshots if they could be of an older majorVersion #8475

Merged
gbartolini merged 5 commits into main from dev/7705
Sep 12, 2025

Conversation

@armru
Member

@armru armru commented Aug 29, 2025

Fixes a bug during major upgrades where a volume snapshot backup from a previous major version could have been incorrectly used to optimise the recreation of replicas.

Depends On #8464
Closes #7705

@armru armru requested review from a team and jsilvela as code owners August 29, 2025 09:08
@dosubot dosubot bot added the size:XL This PR changes 500-999 lines, ignoring generated files. label Aug 29, 2025
@cnpg-bot cnpg-bot added backport-requested ◀️ This pull request should be backported to all supported releases release-1.25 release-1.26 release-1.27 labels Aug 29, 2025
@github-actions
Contributor

❗ By default, the pull request is configured to backport to all release branches.

  • To stop backporting this pr, remove the label: backport-requested ◀️ or add the label 'do not backport'
  • To stop backporting this pr to a certain release branch, remove the specific branch label: release-x.y

@dosubot dosubot bot added the bug 🐛 Something isn't working label Aug 29, 2025
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:XL This PR changes 500-999 lines, ignoring generated files. labels Aug 29, 2025
@armru armru changed the base branch from main to dev/backup-add-major August 29, 2025 10:42
@dosubot dosubot bot added size:XXL This PR changes 1000+ lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Aug 29, 2025
@armru armru force-pushed the dev/backup-add-major branch from 26c394e to 1633d59 Compare August 29, 2025 10:43
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:XXL This PR changes 1000+ lines, ignoring generated files. labels Aug 29, 2025
@armru armru force-pushed the dev/7705 branch 2 times, most recently from 3fe9c49 to 39c3bc1 Compare August 29, 2025 10:47
@armru armru changed the title wip: do not use snapshot of older major verison fix(instance,scaleup): avoid using snapshots if they could be of an older majorVersion Aug 29, 2025
@armru armru force-pushed the dev/7705 branch 2 times, most recently from c7262b6 to 19b92bc Compare August 29, 2025 14:27
@mnencia mnencia force-pushed the dev/backup-add-major branch 2 times, most recently from a053ec0 to 8ded6ac Compare September 3, 2025 15:45
@mnencia mnencia force-pushed the dev/backup-add-major branch from bdb555f to 0cacf2d Compare September 4, 2025 11:21
@mnencia mnencia force-pushed the dev/7705 branch 2 times, most recently from 39cf270 to 014c550 Compare September 9, 2025 14:50
@mnencia mnencia force-pushed the dev/backup-add-major branch from 0cacf2d to 5650cbd Compare September 9, 2025 14:51
@mnencia mnencia force-pushed the dev/7705 branch 2 times, most recently from 80a26c0 to 6415086 Compare September 9, 2025 15:16
@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Sep 9, 2025
Base automatically changed from dev/backup-add-major to main September 12, 2025 11:44
@mnencia
Member

mnencia commented Sep 12, 2025

/test

@github-actions
Contributor

@mnencia, here's the link to the E2E on CNPG workflow run: https://github.com/cloudnative-pg/cloudnative-pg/actions/runs/17673850865

@gbartolini gbartolini added the ok to merge 👌 This PR can be merged label Sep 12, 2025
armru and others added 5 commits September 12, 2025 19:13
…lder majorVersion

Signed-off-by: Armando Ruocco <armando.ruocco@enterprisedb.com>

wip

Signed-off-by: Armando Ruocco <armando.ruocco@enterprisedb.com>

test: add tests

Signed-off-by: Armando Ruocco <armando.ruocco@enterprisedb.com>
Signed-off-by: Armando Ruocco <armando.ruocco@enterprisedb.com>
Signed-off-by: Armando Ruocco <armando.ruocco@enterprisedb.com>
Signed-off-by: Armando Ruocco <armando.ruocco@enterprisedb.com>
Signed-off-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
@gbartolini gbartolini merged commit 62c6bd0 into main Sep 12, 2025
23 of 26 checks passed
@gbartolini gbartolini deleted the dev/7705 branch September 12, 2025 17:15
cnpg-bot pushed a commit that referenced this pull request Sep 12, 2025
… upgrades (#8475)

Avoids a bug where volume snapshot backups from a previous minor version
could be incorrectly reused when scaling up, leading to issues during
major PostgreSQL upgrades.

Depends On #8464
Closes #7705

Signed-off-by: Armando Ruocco <armando.ruocco@enterprisedb.com>
Signed-off-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
Co-authored-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
(cherry picked from commit 62c6bd0)
mnencia added a commit that referenced this pull request Sep 12, 2025
… upgrades (#8475)

Avoids a bug where volume snapshot backups from a previous minor version
could be incorrectly reused when scaling up, leading to issues during
major PostgreSQL upgrades.

Depends On #8464
Closes #7705

Signed-off-by: Armando Ruocco <armando.ruocco@enterprisedb.com>
Signed-off-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
Co-authored-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
(cherry picked from commit 62c6bd0)
rossigee pushed a commit to rossigee/cloudnative-pg that referenced this pull request Oct 2, 2025
… upgrades (cloudnative-pg#8475)

Avoids a bug where volume snapshot backups from a previous minor version
could be incorrectly reused when scaling up, leading to issues during
major PostgreSQL upgrades.

Depends On cloudnative-pg#8464  
Closes cloudnative-pg#7705

Signed-off-by: Armando Ruocco <armando.ruocco@enterprisedb.com>
Signed-off-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
Co-authored-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
@mofirouz

Apologies for waking this up. We've just begun testing and hit the issue that this PR attempts to fix. However, I don't think the fix here is sufficient, especially for large databases.

When creating a replica, the operator decides how to do it in the following code:

	job := specs.JoinReplicaInstance(*cluster, nodeSerial)

	// If we can bootstrap this replica from a pre-existing source, we do it
	storageSource := persistentvolumeclaim.GetCandidateStorageSourceForReplica(ctx, cluster, backupList)
	if storageSource != nil {
		job = specs.RestoreReplicaInstance(*cluster, nodeSerial)
	}

Here, GetCandidateStorageSourceForReplica looks for snapshot backups or other sources from which to initialise the replica. If it finds none (and this commit ensures that it finds no pre-upgrade snapshots, because of the PostgreSQL version mismatch), the default JoinReplicaInstance init is used. The default init is a bare-bones PostgreSQL replication setup: the replica is given the primary's credentials, connects to it, and downloads the whole database with pg_basebackup.
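
To make the fallback concrete, here is a minimal, self-contained sketch of the decision as I understand it. It is not the CNPG implementation: the Snapshot type, pickSnapshot and replicaJobKind are hypothetical names invented for illustration, and the real selection logic may use other criteria as well (for example, picking the most recent backup).

```go
package main

import "fmt"

// Snapshot is a hypothetical, simplified stand-in for a completed volume
// snapshot backup. It is NOT the CNPG Backup type.
type Snapshot struct {
	Name         string
	MajorVersion int // PostgreSQL major version recorded at backup time; 0 if unknown
}

// pickSnapshot returns the first snapshot whose recorded major version
// matches the cluster's current major version, or nil if none matches.
// A snapshot with an unknown version (0) never matches, so it is skipped too.
func pickSnapshot(clusterMajor int, snapshots []Snapshot) *Snapshot {
	for i := range snapshots {
		if snapshots[i].MajorVersion == clusterMajor {
			return &snapshots[i]
		}
	}
	return nil
}

// replicaJobKind mirrors the branch quoted above: restore from a snapshot
// when a usable one exists, otherwise fall back to a pg_basebackup join.
func replicaJobKind(clusterMajor int, snapshots []Snapshot) string {
	if pickSnapshot(clusterMajor, snapshots) != nil {
		return "restore"
	}
	return "join"
}

func main() {
	// Only a pre-upgrade snapshot exists: the replica falls back to a full join.
	old := []Snapshot{{Name: "before-upgrade", MajorVersion: 16}}
	fmt.Println(replicaJobKind(17, old))

	// Once a post-upgrade snapshot exists, the replica can be restored from it.
	fresh := append(old, Snapshot{Name: "after-upgrade", MajorVersion: 17})
	fmt.Println(replicaJobKind(17, fresh))
}
```

Under these assumptions, a cluster that has just been upgraded and has only pre-upgrade snapshots always ends up on the "join" path until a new snapshot backup has been taken.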

This default method won't work for us because of the size of our databases.

The proper way would be to take a backup of the primary after the upgrade using the snapshot method and only then create the replica, but the operator doesn't seem to do that. So for big enough instances we'll need a custom upgrade procedure regardless of the CNPG version:

  • scale cluster to 1
  • PG upgrade
  • take backup with snapshot method
  • scale cluster back to original size

I believe the custom upgrade procedure described above should be the proper fix. Does this make sense?

THE-BRAHMA pushed a commit to THE-BRAHMA/cloudnative-pg that referenced this pull request Oct 30, 2025
… upgrades (cloudnative-pg#8475)

Avoids a bug where volume snapshot backups from a previous minor version
could be incorrectly reused when scaling up, leading to issues during
major PostgreSQL upgrades.

Depends On cloudnative-pg#8464
Closes cloudnative-pg#7705

Signed-off-by: Armando Ruocco <armando.ruocco@enterprisedb.com>
Signed-off-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
Co-authored-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
Signed-off-by: theBrahma <office.utpal.brahma@gmail.com>

Labels

backport-requested ◀️ This pull request should be backported to all supported releases bug 🐛 Something isn't working lgtm This PR has been approved by a maintainer ok to merge 👌 This PR can be merged release-1.26 release-1.27 size:L This PR changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Errors scaling up after an in-place major upgrade when VolumeSnapshots backup are available

5 participants