storage: splitPostApply can see tombstone for RHS #40470
Closed
Labels
A-disaster-recovery
S-2-temp-unavailability: Temp crashes or other availability problems. Can be worked around or resolved by restarting.
Description
Describe the problem
The node fails with the following fatal error:
F190904 14:33:35.276986 138 storage/store.go:2148 [n1,s1,r313/4:/Table/58/1/{2525/9…-3750/1…}] split trigger found right-hand side with tombstone {NextReplicaID:5}: [n1,s1,r316/?:{-}]
To Reproduce
roachprod create knz-gce -u knz -c gce --geo -n 10
roachprod start $CLUSTER:1,4,7-8
roachprod run $CLUSTER:1 -- "./cockroach workload fixtures import tpcc --warehouses=5000 --db=tpcc --experimental-direct-ingestion"
This fails within 1-2 minutes.
Relevant log lines:
I190904 14:33:33.261425 180 server/status/runtime.go:498 [n1] runtime stats: 4.7 GiB RSS, 407 goroutines, 2.6 GiB/718 MiB/3.4 GiB GO alloc/idle/total, 1.1 GiB/1.3 GiB CGO alloc/total, 210805.4 CGO/sec, 381.5/16.5 %(u/s)time, 0.0 %gc (3x), 223 MiB/85 MiB (r/w)net
I190904 14:33:33.964062 11702 storage/replica_raft.go:291 [n1,s1,r261/1:/Table/56{-/1}] proposing REMOVE_REPLICA[(n4,s4):3]: after=[(n1,s1):1 (n3,s3):2 (n2,s2):5] next=6
W190904 14:33:34.137913 11847 storage/replica_raft.go:105 [n1,s1,r323/1:/Table/60/1/2{633/2/…-766/4/…}] context canceled before proposing: 1 HeartbeatTxn
I190904 14:33:34.489669 12200 storage/replica_command.go:1521 [n1,replicate,s1,r247/1:/{Table/61/3-Max}] change replicas (add [] remove [(n3,s3):2]): existing descriptor r247:/{Table/61/3-Max} [(n1,s1):1, (n3,s3):2, (n2,s2):3, (n4,s4):5, next=6, gen=20]
I190904 14:33:34.669612 11610 storage/replica_raftstorage.go:793 [n1,s1,r313/4:{-}] applying LEARNER snapshot [id=faf20096 index=15]
I190904 14:33:34.944105 11610 storage/replica_raftstorage.go:814 [n1,s1,r313/4:/Table/58/1/{2525/9…-3750/1…}] applied LEARNER snapshot [total=274ms ingestion=4@217ms id=faf20096 index=15]
I190904 14:33:35.053208 12088 storage/split_queue.go:149 [n1,split,s1,r307/1:/Table/54/1/125{3/119…-5/522…}] split saw concurrent descriptor modification; maybe retrying
W190904 14:33:35.202730 12090 storage/replica_raft.go:105 [n1,s1,r304/1:/Table/53/1/250{3/7/-…-5/3/-…}] context canceled before proposing: 1 HeartbeatTxn
I190904 14:33:35.232699 12190 storage/replica_command.go:395 [n1,split,s1,r307/1:/Table/54/1/125{3/119…-5/522…}] initiating a split of this range at key /Table/54/1/1254/11837 [r329] (77 MiB above threshold size 64 MiB)
F190904 14:33:35.276986 138 storage/store.go:2148 [n1,s1,r313/4:/Table/58/1/{2525/9…-3750/1…}] split trigger found right-hand side with tombstone {NextReplicaID:5}: [n1,s1,r316/?:{-}]
goroutine 138 [running]:
github.com/cockroachdb/cockroach/pkg/util/log.getStacks(0xc000448301, 0xc000448360, 0x0, 0x7c662a)
/go/src/github.com/cockroachdb/cockroach/pkg/util/log/clog.go:1016 +0xb1
github.com/cockroachdb/cockroach/pkg/util/log.(*loggingT).outputLogEntry(0x7c04a40, 0xc000000004, 0x73d2595, 0x10, 0x864, 0xc0006e43c0, 0x89)
/go/src/github.com/cockroachdb/cockroach/pkg/util/log/clog.go:874 +0x93e
github.com/cockroachdb/cockroach/pkg/util/log.addStructured(0x4e5e460, 0xc03ccc99e0, 0x4, 0x2, 0x4563081, 0x3a, 0xc003394730, 0x2, 0x2)
/go/src/github.com/cockroachdb/cockroach/pkg/util/log/structured.go:66 +0x2cc
github.com/cockroachdb/cockroach/pkg/util/log.logDepth(0x4e5e460, 0xc03ccc99e0, 0x1, 0xc000000004, 0x4563081, 0x3a, 0xc003394730, 0x2, 0x2)
/go/src/github.com/cockroachdb/cockroach/pkg/util/log/log.go:69 +0x8c
github.com/cockroachdb/cockroach/pkg/util/log.Fatalf(...)
/go/src/github.com/cockroachdb/cockroach/pkg/util/log/log.go:180
github.com/cockroachdb/cockroach/pkg/storage.splitPostApply(0x4e5e460, 0xc03ccc99e0, 0x0, 0x15c142ce810fb293, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
/go/src/github.com/cockroachdb/cockroach/pkg/storage/store.go:2148 +0xb3e
github.com/cockroachdb/cockroach/pkg/storage.(*Replica).handleSplitResult(...)
/go/src/github.com/cockroachdb/cockroach/pkg/storage/replica_application_result.go:233
github.com/cockroachdb/cockroach/pkg/storage.(*replicaStateMachine).handleNonTrivialReplicatedEvalResult(0xc003e298c0, 0x4e5e460, 0xc03ccc99e0, 0x0, 0x0, 0xc0003e1e40, 0x0, 0x0, 0x0, 0x0, ...)
/go/src/github.com/cockroachdb/cockroach/pkg/storage/replica_application_state_machine.go:943 +0x852
github.com/cockroachdb/cockroach/pkg/storage.(*replicaStateMachine).ApplySideEffects(0xc003e298c0, 0x4eacee0, 0xc063842008, 0x0, 0x0, 0x0, 0x0)
/go/src/github.com/cockroachdb/cockroach/pkg/storage/replica_application_state_machine.go:856 +0x72d
github.com/cockroachdb/cockroach/pkg/storage/apply.mapCheckedCmdIter(0x7f378dc4b0c8, 0xc003e29ad8, 0xc0033953d8, 0x0, 0x0, 0x0, 0x0)
/go/src/github.com/cockroachdb/cockroach/pkg/storage/apply/cmd.go:182 +0x11b
github.com/cockroachdb/cockroach/pkg/storage/apply.(*Task).applyOneBatch(0xc003395800, 0x4e5e460, 0xc03ccc99e0, 0x4eacfa0, 0xc003e29a78, 0x0, 0x0)
/go/src/github.com/cockroachdb/cockroach/pkg/storage/apply/task.go:276 +0x228
github.com/cockroachdb/cockroach/pkg/storage/apply.(*Task).ApplyCommittedEntries(0xc003395800, 0x4e5e460, 0xc03ccc99e0, 0x0, 0x0)
/go/src/github.com/cockroachdb/cockroach/pkg/storage/apply/task.go:242 +0xcf
github.com/cockroachdb/cockroach/pkg/storage.(*Replica).handleRaftReadyRaftMuLocked(0xc003e29800, 0x4e5e460, 0xc03ccc99e0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
/go/src/github.com/cockroachdb/cockroach/pkg/storage/replica_raft.go:759 +0xd87
github.com/cockroachdb/cockroach/pkg/storage.(*Store).processRequestQueue.func1(0x4e5e460, 0xc03ccc99e0, 0xc003e29800, 0x4e5e460)
/go/src/github.com/cockroachdb/cockroach/pkg/storage/store.go:3599 +0x131
github.com/cockroachdb/cockroach/pkg/storage.(*Store).withReplicaForRequest(0xc000adc000, 0x4e5e460, 0xc03ccc99e0, 0xc009c02200, 0xc074b31e98, 0x0)
/go/src/github.com/cockroachdb/cockroach/pkg/storage/store.go:3352 +0x150
Expected behavior
Import succeeds
Context
kena@knz-gce-0001:~$ ./cockroach version
Build Tag: v19.2.0-alpha.20190606-2012-g58d0fc3
Build Time: 2019/09/04 11:31:21
Distribution: CCL
Platform: linux amd64 (x86_64-unknown-linux-gnu)
Go Version: go1.12.5
C Compiler: gcc 6.3.0
Build SHA-1: 58d0fc3676726c7fa3ebaf41e99f54f305f25fa0
Build Type: release