Skip to content

lightning stuck in writeToTiKV after inject tikv-failure for 60 minutes #46321

@lance6716

Description

@lance6716

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

inject chaos error

cases:
  - name: tikv_failure
    faultType: failure
    selector: tikv(random)
    warmUpTime: 1m
    faultDuration: 60m
    recoveryTime: 5m

2. What did you expect to see? (Required)

3. What did you see instead (Required)

goroutine 11915 [select, 421 minutes]:
google.golang.org/grpc/internal/transport.(*http2Client).NewStream(0xc004e638c0, {0x59c7d10, 0xc03ca11f20}, 0xc03e701aa0)
	/go/pkg/mod/google.golang.org/grpc@v1.54.0/internal/transport/http2_client.go:829 +0x5a5
google.golang.org/grpc.(*csAttempt).newStream(0xc03cf4fee0)
	/go/pkg/mod/google.golang.org/grpc@v1.54.0/stream.go:486 +0xb2
google.golang.org/grpc.newClientStreamWithParams.func2(0xc03cf4fee0)
	/go/pkg/mod/google.golang.org/grpc@v1.54.0/stream.go:336 +0x3a
google.golang.org/grpc.(*clientStream).withRetry(0xc03e46bb00, 0xc03e7183e0, 0xc03bb32998)
	/go/pkg/mod/google.golang.org/grpc@v1.54.0/stream.go:760 +0x144
google.golang.org/grpc.newClientStreamWithParams({0x59c7c68, 0xc0135e2460}, 0x7bf00e0, 0xc001451180, {0x519c087, 0x1d}, {0x0, 0x0, 0x0, 0x0, ...}, ...)
	/go/pkg/mod/google.golang.org/grpc@v1.54.0/stream.go:345 +0xc76
google.golang.org/grpc.newClientStream.func2({0x59c7c68?, 0xc0135e2460?}, 0xc03baaace0?)
	/go/pkg/mod/google.golang.org/grpc@v1.54.0/stream.go:202 +0x9f
google.golang.org/grpc.newClientStream({0x59c7c68, 0xc0135e2460}, 0x7bf00e0, 0xc001451180, {0x519c087, 0x1d}, {0x0, 0x0, 0x0})
	/go/pkg/mod/google.golang.org/grpc@v1.54.0/stream.go:237 +0x674
google.golang.org/grpc.(*ClientConn).NewStream(0x18ae6c7?, {0x59c7c68?, 0xc0135e2460?}, 0xc03bb32c80?, {0x519c087?, 0x3998efd?}, {0x0?, 0x59c7c68?, 0xc0135e2460?})
	/go/pkg/mod/google.golang.org/grpc@v1.54.0/stream.go:162 +0x1ae
github.com/pingcap/kvproto/pkg/import_sstpb.(*importSSTClient).Write(0xf3457203a36cfba0?, {0x59c7c68?, 0xc0135e2460?}, {0x0?, 0x31?, 0x40?})
	/go/pkg/mod/github.com/pingcap/kvproto@v0.0.0-20230523065550-8b641fa69bf3/pkg/import_sstpb/import_sstpb.pb.go:2714 +0x6d
github.com/pingcap/tidb/br/pkg/lightning/backend/local.(*Backend).writeToTiKV(0xc0014a6180, {0x59c7c68, 0xc0135e2460}, 0xc0310ccc00)
	/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/lightning/backend/local/region_job.go:231 +0xb5c
github.com/pingcap/tidb/br/pkg/lightning/backend/local.(*Backend).executeJob(0xc0014a6180, {0x59c7c68, 0xc0135e2460}, 0xc0310ccc00)
	/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/lightning/backend/local/local.go:1357 +0xb4
github.com/pingcap/tidb/br/pkg/lightning/backend/local.(*Backend).startWorker(0x4b1b6c0?, {0x59c7c68, 0xc0135e2460}, 0xc0006bc7e0, 0x0?, 0x0?)
	/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/lightning/backend/local/local.go:1271 +0x125
github.com/pingcap/tidb/br/pkg/lightning/backend/local.(*Backend).doImport.func3()
	/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/lightning/backend/local/local.go:1547 +0x31
golang.org/x/sync/errgroup.(*Group).Go.func1()
	/go/pkg/mod/golang.org/x/sync@v0.2.0/errgroup/errgroup.go:75 +0x64
created by golang.org/x/sync/errgroup.(*Group).Go
	/go/pkg/mod/golang.org/x/sync@v0.2.0/errgroup/errgroup.go:72 +0xa5

but in fact the TiKV has joined the cluster after injected 60 minutes failure.

lightning-0-goroutine.txt

4. What is your TiDB version? (Required)

Metadata

Metadata

Assignees

No one assigned

    Labels

    affects-5.3This bug affects 5.3.x versions.affects-5.4This bug affects the 5.4.x(LTS) versions.affects-6.1This bug affects the 6.1.x(LTS) versions.affects-6.5This bug affects the 6.5.x(LTS) versions.affects-7.1This bug affects the 7.1.x(LTS) versions.affects-7.5This bug affects the 7.5.x(LTS) versions.component/lightningThis issue is related to Lightning of TiDB.severity/majortype/bugThe issue is confirmed as a bug.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions