Add support for recovery of async/semisync replicas of failed replication group members by ejortegau · Pull Request #1254 · openark/orchestrator

ejortegau · 2020-10-16T21:25:53Z

Related issue: #1253

Description

This PR addresses the issue above by adding failure detection and recovery for replication group members that have traditional async/semi-sync replicas.

cc @sjmudd, @dveeden, @luisyonaldo.

shlomi-noach

Please see inline comments.

go/inst/analysis.go

go/inst/instance_dao.go

shlomi-noach · 2020-10-18T06:05:57Z

go/logic/topology_recovery.go

so we re-use the configuration

but in analysis_dao.go it seems like you've changed that: intermediate master recovery only takes place under

if !a.IsReplicationGroupMember {

What I mean here is that we re-use the analysisEntry.ClusterDetails.HasAutomatedIntermediateMasterRecovery configuration to decide whether we want to fail over group members, as opposed to having a separate configuration. As mentioned in the method's doc comment, we are operating under the assumption that group secondaries with replicas are akin to intermediate masters, in the sense that they perform a very similar function in the replication chain: they get and apply changes from the primary (except via GR instead of the binlog), and distribute them to replicas (via the binlog). I hope this clarifies my intent.
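To make the intent above concrete, here is a minimal sketch of the gating logic, not the code in this PR. The types are simplified stand-ins rather than orchestrator's real definitions, and `CountReplicas` and `shouldRecoverGroupMemberReplicas` are names assumed for illustration only.

```go
package main

import "fmt"

// ClusterDetails is a simplified stand-in for the per-cluster settings
// mentioned in the discussion (not orchestrator's actual struct).
type ClusterDetails struct {
	HasAutomatedIntermediateMasterRecovery bool
}

// AnalysisEntry is a simplified stand-in for an analysis result.
// CountReplicas is an assumed field used only for this illustration.
type AnalysisEntry struct {
	IsReplicationGroupMember bool
	CountReplicas            int
	ClusterDetails           ClusterDetails
}

// shouldRecoverGroupMemberReplicas (hypothetical helper) re-uses the
// intermediate master recovery setting for failed group members that
// have async/semi-sync replicas, instead of a separate setting.
func shouldRecoverGroupMemberReplicas(a AnalysisEntry) bool {
	if !a.IsReplicationGroupMember {
		// Not a group member: regular intermediate master handling applies.
		return false
	}
	if a.CountReplicas == 0 {
		// No replicas to relocate, so nothing to recover.
		return false
	}
	return a.ClusterDetails.HasAutomatedIntermediateMasterRecovery
}

func main() {
	entry := AnalysisEntry{
		IsReplicationGroupMember: true,
		CountReplicas:            2,
		ClusterDetails:           ClusterDetails{HasAutomatedIntermediateMasterRecovery: true},
	}
	fmt.Println("recover replicas of failed group member:", shouldRecoverGroupMemberReplicas(entry))
}
```

The point of the sketch is only that a single existing setting drives both kinds of recovery.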

shlomi-noach · 2020-10-18T06:11:56Z

go/logic/topology_recovery.go

I don't run Group Replication myself, but I think it is debatable whether it is correct to run PostIntermediateMasterFailoverProcesses. For now, let's keep it at that, but I predict that someone in the future will argue against this.

For now, our use case does not seem to require different hooks for these. If the need arises (or someone comes knocking at your door asking for it) I'd be happy to change this to have different GR and intermediate source hooks.
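If separate hooks are ever needed, one possible shape is sketched below. This is only an illustration of the idea, not something this PR implements: `PostGroupMemberFailoverProcesses`, `HookConfig`, and `postFailoverHooks` are hypothetical names, and the fallback keeps today's behavior of running the intermediate master hooks.

```go
package main

import "fmt"

// HookConfig is a simplified stand-in for the relevant configuration.
// PostGroupMemberFailoverProcesses is hypothetical and does not exist
// in orchestrator; it only illustrates a possible future split.
type HookConfig struct {
	PostIntermediateMasterFailoverProcesses []string
	PostGroupMemberFailoverProcesses        []string
}

// postFailoverHooks (hypothetical helper) picks the hooks to run after
// a recovery. Group member recoveries fall back to the intermediate
// master hooks when no dedicated hooks are configured.
func postFailoverHooks(cfg HookConfig, isGroupMember bool) []string {
	if isGroupMember && len(cfg.PostGroupMemberFailoverProcesses) > 0 {
		return cfg.PostGroupMemberFailoverProcesses
	}
	return cfg.PostIntermediateMasterFailoverProcesses
}

func main() {
	cfg := HookConfig{
		PostIntermediateMasterFailoverProcesses: []string{"echo 'intermediate master failover' >> /tmp/recovery.log"},
	}
	// With no dedicated hooks set, a group member recovery re-uses the
	// intermediate master hooks.
	fmt.Println(postFailoverHooks(cfg, true))
}
```

With the fallback, existing configurations would keep working unchanged if such a split were introduced.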


shlomi-noach reviewed Oct 18, 2020


Add support for recovery of async/semisync replicas of failed replication group members. (efb23b8)

ejortegau force-pushed the master branch from 77f3a3e to efb23b8 Compare October 18, 2020 18:25

shlomi-noach approved these changes Oct 19, 2020


shlomi-noach merged commit 37c255e into openark:master Oct 19, 2020

shlomi-noach merged 1 commit into openark:master from ejortegau:master
