This repository was archived by the owner on Jul 24, 2024. It is now read-only.

BR won't clean up the environment when exiting on SIGTERM #557

@YuJuncen


Please answer these questions before submitting your issue. Thanks!

  1. What did you do?
    If possible, provide a recipe for reproducing the error.
  • Start BR (restore or backup with --remove-schedulers).
  • Wait for the progress bar to appear, then press Ctrl+C.

  2. What did you expect to see?
    The cluster config changed by BR should be undone, since SIGTERM allows us to stop gracefully.

  3. What did you see instead?
    The cluster is stuck with the config that BR set. (On current master, PD schedulers could be reset thanks to "scheduler: use pause instead of remove schedulers" #551.)

  4. What version of BR and TiDB/TiKV/PD are you using?

v4.0.7

Note:

We listen to signals here:

br/main.go

Lines 34 to 39 in d2d5bba

case syscall.SIGTERM:
	cancel()
	os.Exit(0)
default:
	cancel()
	os.Exit(1)

Canceling the context lets other goroutines eventually exit and clean up, but we leave them no time to do so: os.Exit terminates the process immediately, without running deferred functions.

Adding a time.Sleep(30 * time.Second) or removing those os.Exit calls could help. But there are still some problems:

br/pkg/task/backup.go

Lines 222 to 227 in d2d5bba

restore, e := mgr.RemoveSchedulers(ctx)
defer func() {
	if restoreE := restore(ctx); restoreE != nil {
		log.Warn("failed to restore removed schedulers, you may need to restore them manually", zap.Error(restoreE))
	}
}()

We use the global context for the cleanup tasks, so they will always fail once the outer context is canceled. We should run the cleanup with a new context that has a timeout; the timeout could be the same as the sleep time before stopping.


Labels

Priority/P0: Top priority issue. Must have an associated milestone
difficulty/1-easy: Easy issue
type/bug: Something isn't working
