Description
Version
trunk (main)
What happened?
i'm reviewing the code in `reconcileTargetPrimaryForNonReplicaCluster()` and it might be missing some logic?
i'm thinking automatic promotion as part of self-healing is only safe (i.e. no data loss) in conjunction with synchronous replication. i'm not sure we should fully trust any methodology based on trying to flush the WAL.
Edit (4-May-2025): you can skip the initial design proposal below this line and go straight to the design proposal in the first comments.
- Quorum-Based: assuming that `standbyNamesPre`/`standbyNamesPost` and `maxStandbyNamesFromCluster` are unset, only consider promoting the first N entries in `instancesStatus[]`, where N is `synchronous.number - unreachable_replicas`.
  - (if `synchronous.number == 0` then we should not auto-promote anyone. possibly we should default `synchronous.number` to `1` on all cnpg clusters with at least one replica? and should `dataDurability` default to `preferred` with 1 replica and `required` with 2+ replicas?)
  - still need to work out the formula when `standbyNamesPre`/`standbyNamesPost` or `maxStandbyNamesFromCluster` are present
  - still need to think through `preferred` durability a bit more
  - with `synchronous.number = 1` even a single unreachable replica would mean no promotion, but i think this is likely the behavior we want? would it make sense to have the default be up to `2` if there are 3+ replicas? i think we always want at least one async replica available for pod disruption budgets during maintenance.
- Priority-Based: only consider promoting entries in `instancesStatus[]` that are known sync replicas?
  - there might be a race condition where an unavailable higher-priority replica has just come back online, and postgres has just demoted the final `sync` replica back to `potential`, but cnpg is not yet aware of this fact. if cnpg promotes the replica that it thought was `sync`, then there could be data loss.
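The priority-based race can be sketched as a stale-cache problem: the operator's cached view of `pg_stat_replication.sync_state` lags behind what postgres currently reports. A minimal illustration (the types and the `safeToPromote` helper are hypothetical; `sync` and `potential` are the real `sync_state` values under priority-based replication):

```go
package main

import "fmt"

// syncState mirrors the values postgres reports in
// pg_stat_replication.sync_state for priority-based replication.
type syncState string

const (
	stateSync      syncState = "sync"
	statePotential syncState = "potential"
)

// safeToPromote applies the naive rule "promote only known sync
// replicas" against a possibly stale cached view of sync_state.
func safeToPromote(cached syncState) bool {
	return cached == stateSync
}

func main() {
	// t0: the operator caches the replica's state as sync.
	cached := stateSync

	// t1: a higher-priority replica reconnects, and postgres demotes
	// this replica back to potential. The operator has not yet
	// re-read pg_stat_replication, so its cache is stale.
	actual := statePotential

	// t2: the primary fails. The operator consults its stale cache
	// and concludes promotion is safe, even though the replica may
	// be missing transactions confirmed on the returning standby.
	fmt.Printf("cached=%s actual=%s promote=%v\n", cached, actual, safeToPromote(cached))
}
```

The sketch is only meant to show why the cached `sync_state` alone is not a sufficient promotion guard in this window.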
this is fairly complex to reason through, so it would be good to discuss further.