kvserver: deadlock stopping server because of stopper<->lease acquisition cycle #63761
Closed
Labels: C-bug (Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.), GA-blocker
Description
We appear to have the following deadlock:
The stopper wants to stop; it runs all the closers under s.mu. A recently added closer wants to visit all the replicas, and visiting a replica briefly rlocks its r.mu to check whether it has been destroyed.
The lease acquisition, which appears to hold r.mu at this point (see the requestLeaseLocked frame in the second stack), tries to start a task, and starting a task wants an rlock on s.mu just to figure out whether the task should be refused.
@irfansharif you've added the closer in #61279, so I'll let you figure out who has to give. It seems to me that rejecting new tasks in the stopper can probably be done in a lock-free way. I also wonder whether the closers actually need to run under s.mu, particularly under a write lock. Perhaps we can copy them out from under the lock.
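To make the two ideas concrete, here is a rough sketch of what they might look like together. It is not the real stop.Stopper: `sketchStopper`, its fields, and `demo` are hypothetical names, and the real stopper's task accounting and draining are left out entirely.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Closer and CloserFn mirror the shape seen in the stacks; redefined
// here so the sketch is self-contained.
type Closer interface{ Close() }

type CloserFn func()

func (f CloserFn) Close() { f() }

// sketchStopper is a hypothetical, heavily simplified stopper.
type sketchStopper struct {
	stopping int32 // accessed atomically; 1 once Stop has begun
	mu       struct {
		sync.Mutex
		closers []Closer
	}
}

func (s *sketchStopper) AddCloser(c Closer) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.mu.closers = append(s.mu.closers, c)
}

// RunTask decides whether to refuse the task without touching s.mu.
// (A real implementation would also need to count in-flight tasks.)
func (s *sketchStopper) RunTask(f func()) bool {
	if atomic.LoadInt32(&s.stopping) == 1 {
		return false
	}
	f()
	return true
}

// Stop copies the closers out under the lock and runs them after
// releasing it, so a closer may block on other locks safely.
func (s *sketchStopper) Stop() {
	atomic.StoreInt32(&s.stopping, 1)
	s.mu.Lock()
	closers := s.mu.closers
	s.mu.closers = nil
	s.mu.Unlock()
	for _, c := range closers {
		c.Close()
	}
}

// demo runs a closer that itself tries to start a task during Stop.
func demo() (taskRan, closerRan bool) {
	s := &sketchStopper{}
	s.AddCloser(CloserFn(func() {
		taskRan = s.RunTask(func() {}) // refused, and no lock is needed
		closerRan = true
	}))
	s.Stop()
	return taskRan, closerRan
}

func main() {
	taskRan, closerRan := demo()
	fmt.Println("task ran:", taskRan, "closer ran:", closerRan)
}
```

With this shape, no closer runs while s.mu is held and task refusal never waits on s.mu, so the cycle with r.mu cannot form. Whether the real closers rely on running under s.mu is exactly the open question.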
Stacks
goroutine 11 [semacquire]:
sync.runtime_SemacquireMutex(0xc00206f994, 0xc002884500, 0x0)
/home/andrei/goroot/src/runtime/sema.go:71 +0x47
sync.(*RWMutex).RLock(...)
/home/andrei/goroot/src/sync/rwmutex.go:50
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*storeReplicaVisitor).Visit(0xc002884540, 0xc0028650b0)
/home/andrei/src/github.com/cockroachdb/cockroach-2/pkg/kv/kvserver/store.go:372 +0x1db
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).VisitReplicas(...)
/home/andrei/src/github.com/cockroachdb/cockroach-2/pkg/kv/kvserver/store.go:2070
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).processRaft.func2()
/home/andrei/src/github.com/cockroachdb/cockroach-2/pkg/kv/kvserver/store_raft.go:624 +0x9b
github.com/cockroachdb/cockroach/pkg/util/stop.CloserFn.Close(0xc00218d8c0)
/home/andrei/src/github.com/cockroachdb/cockroach-2/pkg/util/stop/stopper.go:110 +0x25
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).Stop(0xc00073ea80, 0x55e39e0, 0xc00021e018)
/home/andrei/src/github.com/cockroachdb/cockroach-2/pkg/util/stop/stopper.go:478 +0x268
github.com/cockroachdb/cockroach/pkg/testutils/testcluster.(*TestCluster).stopServerLocked(0xc0007bb200, 0x0)
/home/andrei/src/github.com/cockroachdb/cockroach-2/pkg/testutils/testcluster/testcluster.go:165 +0x70
github.com/cockroachdb/cockroach/pkg/testutils/testcluster.(*TestCluster).stopServers(0xc0007bb200, 0x55e39e0, 0xc00021e018)
/home/andrei/src/github.com/cockroachdb/cockroach-2/pkg/testutils/testcluster/testcluster.go:116 +0x1cf
github.com/cockroachdb/cockroach/pkg/testutils/testcluster.(*TestCluster).Start.func2()
/home/andrei/src/github.com/cockroachdb/cockroach-2/pkg/testutils/testcluster/testcluster.go:335 +0x45
github.com/cockroachdb/cockroach/pkg/util/stop.CloserFn.Close(0xc0006d5e40)
/home/andrei/src/github.com/cockroachdb/cockroach-2/pkg/util/stop/stopper.go:110 +0x25
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).Stop(0xc00073e680, 0x55e39e0, 0xc00021e010)
/home/andrei/src/github.com/cockroachdb/cockroach-2/pkg/util/stop/stopper.go:478 +0x268
github.com/cockroachdb/cockroach/pkg/kv/kvserver_test.TestRejectedLeaseDoesntDictateClosedTimestamp(0xc000483080)
/home/andrei/src/github.com/cockroachdb/cockroach-2/pkg/kv/kvserver/replica_closedts_test.go:680 +0x1b46
testing.tRunner(0xc000483080, 0x4f13230)
/home/andrei/goroot/src/testing/testing.go:1123 +0xef
created by testing.(*T).Run
/home/andrei/goroot/src/testing/testing.go:1168 +0x2b3
goroutine 2083 [semacquire]:
sync.runtime_SemacquireMutex(0xc00073eaa4, 0xc001e62200, 0x0)
/home/andrei/goroot/src/runtime/sema.go:71 +0x47
sync.(*RWMutex).RLock(...)
/home/andrei/goroot/src/sync/rwmutex.go:50
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).runPrelude(0xc00073ea80, 0x4853100)
/home/andrei/src/github.com/cockroachdb/cockroach-2/pkg/util/stop/stopper.go:416 +0xdd
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask(0xc00073ea80, 0x55e39a0, 0xc00297ea00, 0xc0004d8540, 0x35, 0xc001ed66e0, 0x1, 0x1)
/home/andrei/src/github.com/cockroachdb/cockroach-2/pkg/util/stop/stopper.go:339 +0x79
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*pendingLeaseRequest).requestLeaseAsync(0xc00206fa90, 0x55e3a60, 0xc001ebf260, 0x100000001, 0x1676230a00000001, 0x0, 0x1676230ac22faf2a, 0x212, 0x0, 0x100000001, ...)
/home/andrei/src/github.com/cockroachdb/cockroach-2/pkg/kv/kvserver/replica_range_lease.go:344 +0x3f6
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*pendingLeaseRequest).InitOrJoinRequest(0xc00206fa90, 0x55e3a60, 0xc001ebf260, 0x100000001, 0x1, 0x0, 0x0, 0x0, 0x0, 0x100000001, ...)
/home/andrei/src/github.com/cockroachdb/cockroach-2/pkg/kv/kvserver/replica_range_lease.go:279 +0x988
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).requestLeaseLocked(0xc00206f600, 0x55e3a60, 0xc001ebf260, 0x0, 0x0, 0x0, 0x100000001, 0x1, 0x0, 0x0, ...)
/home/andrei/src/github.com/cockroachdb/cockroach-2/pkg/kv/kvserver/replica_range_lease.go:738 +0x3f8
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).redirectOnOrAcquireLeaseForRequest.func1(0xc00206f600, 0x55e3a60, 0xc001ebf260, 0xc0030e53b0, 0x0, 0x0, 0xc0030e53d0, 0xc0021870a0, 0x0, 0x0, ...)
/home/andrei/src/github.com/cockroachdb/cockroach-2/pkg/kv/kvserver/replica_range_lease.go:1125 +0x305
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).redirectOnOrAcquireLeaseForRequest(0xc00206f600, 0x55e3a60, 0xc001ebf260, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
/home/andrei/src/github.com/cockroachdb/cockroach-2/pkg/kv/kvserver/replica_range_lease.go:1151 +0x365
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).redirectOnOrAcquireLease(...)
/home/andrei/src/github.com/cockroachdb/cockroach-2/pkg/kv/kvserver/replica_range_lease.go:1056
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).ensureClosedTimestampStarted(0xc00206f600, 0x55e3a60, 0xc001ebf260, 0xc002187098)
/home/andrei/src/github.com/cockroachdb/cockroach-2/pkg/kv/kvserver/replica_rangefeed.go:671 +0x73
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).RangeFeed(0xc00206f600, 0xc000dd00c0, 0x56371a0, 0xc0006defb0, 0x0)
/home/andrei/src/github.com/cockroachdb/cockroach-2/pkg/kv/kvserver/replica_rangefeed.go:153 +0x194
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).RangeFeed(0xc002042000, 0xc000dd00c0, 0x56371a0, 0xc0006defb0, 0x0)
/home/andrei/src/github.com/cockroachdb/cockroach-2/pkg/kv/kvserver/store.go:2502 +0x11e
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Stores).RangeFeed(0xc000848070, 0xc000dd00c0, 0x56371a0, 0xc0006defb0, 0x10)
/home/andrei/src/github.com/cockroachdb/cockroach-2/pkg/kv/kvserver/stores.go:216 +0xce
github.com/cockroachdb/cockroach/pkg/server.(*Node).RangeFeed(0xc000e88c00, 0xc000dd00c0, 0x56371a0, 0xc0006defb0, 0xc000e88c00, 0xc0020efbf0)
/home/andrei/src/github.com/cockroachdb/cockroach-2/pkg/server/node.go:1011 +0x54
github.com/cockroachdb/cockroach/pkg/roachpb._Internal_RangeFeed_Handler(0x47d4860, 0xc000e88c00, 0x562bfe0, 0xc000dd0000, 0x0, 0x0)
/home/andrei/src/github.com/cockroachdb/cockroach-2/pkg/roachpb/api.pb.go:8055 +0x10b
github.com/cockroachdb/cockroach/pkg/util/tracing.StreamServerInterceptor.func1(0x47d4860, 0xc000e88c00, 0x562bfe0, 0xc000dd0000, 0xc0010378e0, 0x4f13a98, 0x0, 0x0)
/home/andrei/src/github.com/cockroachdb/cockroach-2/pkg/util/tracing/grpc_interceptor.go:169 +0x5a9
google.golang.org/grpc.getChainStreamHandler.func1(0x47d4860, 0xc000e88c00, 0x562bfe0, 0xc000dd0000, 0x455d1c0, 0xc00297e9c0)
/home/andrei/src/github.com/cockroachdb/cockroach-2/vendor/google.golang.org/grpc/server.go:1302 +0xdd
github.com/cockroachdb/cockroach/pkg/rpc.NewServer.func2(0x47d4860, 0xc000e88c00, 0x562bfe0, 0xc000dd0000, 0xc0010378e0, 0xc00297e9c0, 0xc00297e9c0, 0x2)
/home/andrei/src/github.com/cockroachdb/cockroach-2/pkg/rpc/context.go:182 +0x96
google.golang.org/grpc.getChainStreamHandler.func1(0x47d4860, 0xc000e88c00, 0x562bfe0, 0xc000dd0000, 0x0, 0x0)
/home/andrei/src/github.com/cockroachdb/cockroach-2/vendor/google.golang.org/grpc/server.go:1302 +0xdd
github.com/cockroachdb/cockroach/pkg/rpc.kvAuth.streamInterceptor(0x47d4860, 0xc000e88c00, 0x562bfe0, 0xc000dd0000, 0xc0010378e0, 0xc00297e980, 0x455d1c0, 0xc00297e980)
/home/andrei/src/github.com/cockroachdb/cockroach-2/pkg/rpc/auth.go:86 +0xa8
google.golang.org/grpc.chainStreamServerInterceptors.func1(0x47d4860, 0xc000e88c00, 0x562bfe0, 0xc000dd0000, 0xc0010378e0, 0x4f13a98, 0x55e3a60, 0xc001ebf110)
/home/andrei/src/github.com/cockroachdb/cockroach-2/vendor/google.golang.org/grpc/server.go:1288 +0xbd
google.golang.org/grpc.(*Server).processStreamingRPC(0xc000b648c0, 0x564d7c0, 0xc001c88d80, 0xc001bb0b00, 0xc000ead050, 0x74d89c0, 0x0, 0x0, 0x0)
/home/andrei/src/github.com/cockroachdb/cockroach-2/vendor/google.golang.org/grpc/server.go:1434 +0x522
google.golang.org/grpc.(*Server).handleStream(0xc000b648c0, 0x564d7c0, 0xc001c88d80, 0xc001bb0b00, 0x0)
/home/andrei/src/github.com/cockroachdb/cockroach-2/vendor/google.golang.org/grpc/server.go:1507 +0xc9c
google.golang.org/grpc.(*Server).serveStreams.func1.2(0xc0037e5130, 0xc000b648c0, 0x564d7c0, 0xc001c88d80, 0xc001bb0b00)
/home/andrei/src/github.com/cockroachdb/cockroach-2/vendor/google.golang.org/grpc/server.go:843 +0xa5
created by google.golang.org/grpc.(*Server).serveStreams.func1
/home/andrei/src/github.com/cockroachdb/cockroach-2/vendor/google.golang.org/grpc/server.go:841 +0x1fd