This repository was archived by the owner on Jul 24, 2024. It is now read-only.

Update GC safePoint with TTL failed due to DeadlineExceeded #324

@overvenus

Please answer these questions before submitting your issue. Thanks!

  1. What did you do?

Ran a full backup with BR, which failed with:

[2020-05-28T06:31:24.127Z] [2020/05/28 14:31:23.948 +08:00] [ERROR] [client.go:408] ["update GC safePoint with TTL failed"] [error="rpc error: code = DeadlineExceeded desc = context deadline exceeded"] [errorVerbose="rpc error: code = DeadlineExceeded desc = context deadline exceeded\ngithub.com/pingcap/pd/v4/client.(*client).UpdateServiceGCSafePoint\n\t/go/pkg/mod/github.com/pingcap/pd/v4@v4.0.0-rc.2.0.20200520083007-2c251bd8f181/client/client.go:662\ngithub.com/pingcap/br/pkg/backup.UpdateServiceSafePoint\n\t/home/jenkins/agent/workspace/tikv_ghpr_integration_br_test/go/src/github.com/pingcap/br/pkg/backup/safe_point.go:51\ngithub.com/pingcap/br/pkg/backup.(*Client).BackupRanges\n\t/home/jenkins/agent/workspace/tikv_ghpr_integration_br_test/go/src/github.com/pingcap/br/pkg/backup/client.go:406\ngithub.com/pingcap/br/pkg/task.RunBackup\n\t/home/jenkins/agent/workspace/tikv_ghpr_integration_br_test/go/src/github.com/pingcap/br/pkg/task/backup.go:188\ngithub.com/pingcap/br/cmd.runBackupCommand\n\t/home/jenkins/agent/workspace/tikv_ghpr_integration_br_test/go/src/github.com/pingcap/br/cmd/backup.go:22\ngithub.com/pingcap/br/cmd.newTableBackupCommand.func1\n\t/home/jenkins/agent/workspace/tikv_ghpr_integration_br_test/go/src/github.com/pingcap/br/cmd/backup.go:99\ngithub.com/spf13/cobra.(*Command).execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:842\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:950\ngithub.com/spf13/cobra.(*Command).Execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:887\ngithub.com/pingcap/br.main\n\t/home/jenkins/agent/workspace/tikv_ghpr_integration_br_test/go/src/github.com/pingcap/br/main.go:54\ngithub.com/pingcap/br.TestRunMain.func1\n\t/home/jenkins/agent/workspace/tikv_ghpr_integration_br_test/go/src/github.com/pingcap/br/main_test.go:39\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1357"] 
[stack="github.com/pingcap/log.Error\n\t/go/pkg/mod/github.com/pingcap/log@v0.0.0-20200511115504-543df19646ad/global.go:42\ngithub.com/pingcap/br/pkg/backup.(*Client).BackupRanges\n\t/home/jenkins/agent/workspace/tikv_ghpr_integration_br_test/go/src/github.com/pingcap/br/pkg/backup/client.go:408\ngithub.com/pingcap/br/pkg/task.RunBackup\n\t/home/jenkins/agent/workspace/tikv_ghpr_integration_br_test/go/src/github.com/pingcap/br/pkg/task/backup.go:188\ngithub.com/pingcap/br/cmd.runBackupCommand\n\t/home/jenkins/agent/workspace/tikv_ghpr_integration_br_test/go/src/github.com/pingcap/br/cmd/backup.go:22\ngithub.com/pingcap/br/cmd.newTableBackupCommand.func1\n\t/home/jenkins/agent/workspace/tikv_ghpr_integration_br_test/go/src/github.com/pingcap/br/cmd/backup.go:99\ngithub.com/spf13/cobra.(*Command).execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:842\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:950\ngithub.com/spf13/cobra.(*Command).Execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:887\ngithub.com/pingcap/br.main\n\t/home/jenkins/agent/workspace/tikv_ghpr_integration_br_test/go/src/github.com/pingcap/br/main.go:54\ngithub.com/pingcap/br.TestRunMain.func1\n\t/home/jenkins/agent/workspace/tikv_ghpr_integration_br_test/go/src/github.com/pingcap/br/main_test.go:39"]
[2020-05-28T06:31:24.403Z] [2020/05/28 14:31:24.213 +08:00] [INFO] [collector.go:172] ["Table backup Failed summary : total backup ranges: 1, total success: 0, total failed: 1"] ["backup total regions"=2] [unitName="range start:74800000000000002f5f720000000000000000 end:74800000000000002f5f72ffffffffffffffff00"] [error="rpc error: code = Canceled desc = context canceled"] [errorVerbose="rpc error: code = Canceled desc = context canceled\ngithub.com/pingcap/errors.AddStack\n\t/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20190809092503-95897b64e011/errors.go:174\ngithub.com/pingcap/errors.Trace\n\t/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20190809092503-95897b64e011/juju_adaptor.go:15\ngithub.com/pingcap/br/pkg/backup.SendBackup\n\t/home/jenkins/agent/workspace/tikv_ghpr_integration_br_test/go/src/github.com/pingcap/br/pkg/backup/client.go:792\ngithub.com/pingcap/br/pkg/backup.(*pushDown).pushBackup.func1\n\t/home/jenkins/agent/workspace/tikv_ghpr_integration_br_test/go/src/github.com/pingcap/br/pkg/backup/push.go:61\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1357"]

[2020-05-28T06:31:24.403Z] Error: rpc error: code = DeadlineExceeded desc = context deadline exceeded

The PD client sets an internal context timeout on every RPC call. The default value is 3 seconds, which is too short in some cases.

Detail log: https://internal.pingcap.net/idc-jenkins/blue/organizations/jenkins/tikv_ghpr_integration_br_test/detail/tikv_ghpr_integration_br_test/47/pipeline/

We should extend the timeout in the PD client.

  2. What version of BR and TiDB/TiKV/PD are you using?

All components are on the master version.
