This repository was archived by the owner on Jul 24, 2024. It is now read-only.

BR should retry RegionError on BatchSplitRegions #219

@overvenus



The integration test fails, see #214 (comment).

Pretty-printed backtrace:

[2020-03-30T09:59:53.436Z] [2020/03/30 17:59:53.217 +08:00] [ERROR] [restore.go:238] ["split regions failed"] [error="split region failed: region=id:3828 start_key:\"t\\200\\000\\000\\000\\000\\000\\003\\377\\275_r\\000\\000\\000\\000\\000\\372\" end_key:\"t\\200\\000\\000\\000\\000\\000\\003\\377\\277\\000\\000\\000\\000\\000\\000\\000\\370\" region_epoch:<conf_ver:20 version:1049 > peers:<id:3829 store_id:6 > peers:<id:3830 store_id:1 > peers:<id:3831 store_id:5 > , err=message:\"peer is not leader for region 3828, leader may Some(id: 3831 store_id: 5)\" not_leader:<region_id:3828 leader:<id:3831 store_id:5 > > "] [errorVerbose="split region failed: region=id:3828 start_key:\"t\\200\\000\\000\\000\\000\\000\\003\\377\\275_r\\000\\000\\000\\000\\000\\372\" end_key:\"t\\200\\000\\000\\000\\000\\000\\003\\377\\277\\000\\000\\000\\000\\000\\000\\000\\370\" region_epoch:<conf_ver:20 version:1049 > peers:<id:3829 store_id:6 > peers:<id:3830 store_id:1 > peers:<id:3831 store_id:5 > , err=message:\"peer is not leader for region 3828, leader may Some(id: 3831 store_id: 5)\" not_leader:<region_id:3828 leader:<id:3831 store_id:5 > > 
github.com/pingcap/br/pkg/restore.(*pdClient).BatchSplitRegions
	/home/jenkins/agent/workspace/br_ghpr_unit_and_integration_test/go/src/github.com/pingcap/br/pkg/restore/split_client.go:230
github.com/pingcap/br/pkg/restore.(*RegionSplitter).splitAndScatterRegions
	/home/jenkins/agent/workspace/br_ghpr_unit_and_integration_test/go/src/github.com/pingcap/br/pkg/restore/split.go:316
github.com/pingcap/br/pkg/restore.(*RegionSplitter).Split
	/home/jenkins/agent/workspace/br_ghpr_unit_and_integration_test/go/src/github.com/pingcap/br/pkg/restore/split.go:118
github.com/pingcap/br/pkg/restore.SplitRanges
	/home/jenkins/agent/workspace/br_ghpr_unit_and_integration_test/go/src/github.com/pingcap/br/pkg/restore/util.go:344
github.com/pingcap/br/pkg/task.RunRestore
	/home/jenkins/agent/workspace/br_ghpr_unit_and_integration_test/go/src/github.com/pingcap/br/pkg/task/restore.go:236
github.com/pingcap/br/cmd.runRestoreCommand
	/home/jenkins/agent/workspace/br_ghpr_unit_and_integration_test/go/src/github.com/pingcap/br/cmd/restore.go:21
github.com/pingcap/br/cmd.newDbRestoreCommand.func1
	/home/jenkins/agent/workspace/br_ghpr_unit_and_integration_test/go/src/github.com/pingcap/br/cmd/restore.go:93
github.com/spf13/cobra.(*Command).execute
	/go/pkg/mod/github.com/spf13/cobra@v0.0.5/command.go:826
github.com/spf13/cobra.(*Command).ExecuteC
	/go/pkg/mod/github.com/spf13/cobra@v0.0.5/command.go:914
github.com/spf13/cobra.(*Command).Execute
	/go/pkg/mod/github.com/spf13/cobra@v0.0.5/command.go:864
github.com/pingcap/br.main
	/home/jenkins/agent/workspace/br_ghpr_unit_and_integration_test/go/src/github.com/pingcap/br/main.go:54
github.com/pingcap/br.TestRunMain.func1
	/home/jenkins/agent/workspace/br_ghpr_unit_and_integration_test/go/src/github.com/pingcap/br/main_test.go:39
runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:1357"] [stack="github.com/pingcap/log.Error
	/go/pkg/mod/github.com/pingcap/log@v0.0.0-20200117041106-d28c14d3b1cd/global.go:42
github.com/pingcap/br/pkg/task.RunRestore
	/home/jenkins/agent/workspace/br_ghpr_unit_and_integration_test/go/src/github.com/pingcap/br/pkg/task/restore.go:238
github.com/pingcap/br/cmd.runRestoreCommand
	/home/jenkins/agent/workspace/br_ghpr_unit_and_integration_test/go/src/github.com/pingcap/br/cmd/restore.go:21
github.com/pingcap/br/cmd.newDbRestoreCommand.func1
	/home/jenkins/agent/workspace/br_ghpr_unit_and_integration_test/go/src/github.com/pingcap/br/cmd/restore.go:93
github.com/spf13/cobra.(*Command).execute
	/go/pkg/mod/github.com/spf13/cobra@v0.0.5/command.go:826
github.com/spf13/cobra.(*Command).ExecuteC
	/go/pkg/mod/github.com/spf13/cobra@v0.0.5/command.go:914
github.com/spf13/cobra.(*Command).Execute
	/go/pkg/mod/github.com/spf13/cobra@v0.0.5/command.go:864
github.com/pingcap/br.main
	/home/jenkins/agent/workspace/br_ghpr_unit_and_integration_test/go/src/github.com/pingcap/br/main.go:54
github.com/pingcap/br.TestRunMain.func1
	/home/jenkins/agent/workspace/br_ghpr_unit_and_integration_test/go/src/github.com/pingcap/br/main_test.go:39"]

	resp, err := client.SplitRegion(ctx, &kvrpcpb.SplitRegionRequest{
		Context: &kvrpcpb.Context{
			RegionId:    regionInfo.Region.Id,
			RegionEpoch: regionInfo.Region.RegionEpoch,
			Peer:        peer,
		},
		SplitKeys: keys,
	})
	if err != nil {
		return nil, err
	}
	if resp.RegionError != nil {
		return nil, errors.Errorf("split region failed: region=%v, err=%v", regionInfo.Region, resp.RegionError)
	}

Instead of failing immediately, BR should retry when the response carries one of the following RegionErrors:

  • NotLeader
  • RegionNotFound
  • EpochNotMatch
  • ServerIsBusy
  • StaleCommand

Labels: type/bug (Something isn't working)