stability: panic on version discrepancy #9347

@mberhault

Upgrading gamma from d8213c2 to d262ae3 using a rolling restart.

Nodes crashed right away with:

panic: interface conversion: interface is nil, not roachpb.Request [recovered]
        panic: interface conversion: interface is nil, not roachpb.Request

goroutine 45 [running]:
panic(0x15a5780, 0xc421a466c0)
        /usr/local/go/src/runtime/panic.go:500 +0x1a1
github.com/cockroachdb/cockroach/cli.initBacktrace.func2(0x15a5780, 0xc421a466c0)
        /go/src/github.com/cockroachdb/cockroach/cli/backtrace.go:98 +0xca
github.com/cockroachdb/cockroach/util/stop.(*Stopper).Recover(0xc42040d560)
        /go/src/github.com/cockroachdb/cockroach/util/stop/stopper.go:173 +0x56
panic(0x15a5780, 0xc421a466c0)
        /usr/local/go/src/runtime/panic.go:458 +0x243
github.com/cockroachdb/cockroach/roachpb.RequestUnion.GetInner(0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        /go/src/github.com/cockroachdb/cockroach/roachpb/api.go:325 +0x6f
github.com/cockroachdb/cockroach/roachpb.(*BatchRequest).GetArg(0xc4217a33b8, 0x16, 0xc421dfe540, 0x1, 0x1)
        /go/src/github.com/cockroachdb/cockroach/roachpb/batch.go:139 +0xdd
github.com/cockroachdb/cockroach/roachpb.(*BatchRequest).IsLease(0xc4217a33b8, 0xc4203b9380)
        /go/src/github.com/cockroachdb/cockroach/roachpb/batch.go:78 +0x3e
github.com/cockroachdb/cockroach/storage.(*Replica).processRaftCommand(0xc4203a26c0, 0xc421152970, 0x8, 0x20831, 0x15804, 0x100000001, 0x1, 0x1473ff8ab84d635e, 0x0, 0x0, ...)
        /go/src/github.com/cockroachdb/cockroach/storage/replica.go:2357 +0x92a
github.com/cockroachdb/cockroach/storage.(*Replica).handleRaftReady(0xc4203a26c0, 0x0, 0x0)
        /go/src/github.com/cockroachdb/cockroach/storage/replica.go:1894 +0x8b8
github.com/cockroachdb/cockroach/storage.(*Store).processReady(0xc4201c6280, 0x15804)
        /go/src/github.com/cockroachdb/cockroach/storage/store.go:2571 +0xfd
github.com/cockroachdb/cockroach/storage.(*raftScheduler).worker(0xc42031c620, 0xc42040d560)
        /go/src/github.com/cockroachdb/cockroach/storage/scheduler.go:197 +0x308
github.com/cockroachdb/cockroach/storage.(*raftScheduler).Start.func1()
        /go/src/github.com/cockroachdb/cockroach/storage/scheduler.go:160 +0x33
github.com/cockroachdb/cockroach/util/stop.(*Stopper).RunWorker.func1(0xc42040d560, 0xc42074aee0)
        /go/src/github.com/cockroachdb/cockroach/util/stop/stopper.go:187 +0x7d
created by github.com/cockroachdb/cockroach/util/stop.(*Stopper).RunWorker
        /go/src/github.com/cockroachdb/cockroach/util/stop/stopper.go:188 +0x66

This is most likely due to 5e1ce5a.

After the first node (104.196.169.188) started crash-looping, I stopped all of the nodes and restarted them at the new version. However, other nodes then started crashing, so I reverted back to d8213c2.
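
For reference, the panic message at the top of the trace is what Go produces when an unchecked type assertion is applied to a nil interface value. Below is a minimal sketch of that mechanism. It does not use the real roachpb code: `request`, `getRequest`, and `requestUnion` are hypothetical stand-ins, and it assumes that a union written by a different version can arrive with no field that this binary recognizes.

```go
package main

import (
	"errors"
	"fmt"
)

// request is a hypothetical stand-in for roachpb.Request.
type request interface {
	Method() string
}

// getRequest is a hypothetical request type known to this binary.
type getRequest struct{}

func (getRequest) Method() string { return "Get" }

// requestUnion is a hypothetical stand-in for a union message:
// at most one field is expected to be set.
type requestUnion struct {
	Get *getRequest
}

// value returns the set field, or an untyped nil if no known field is set.
func (u requestUnion) value() interface{} {
	if u.Get != nil {
		return u.Get
	}
	return nil
}

// getInnerUnchecked uses a single-value type assertion, which panics
// when value() returns nil.
func (u requestUnion) getInnerUnchecked() request {
	return u.value().(request)
}

// getInnerChecked uses the comma-ok form and reports an error instead.
func (u requestUnion) getInnerChecked() (request, error) {
	r, ok := u.value().(request)
	if !ok {
		return nil, errors.New("request union has no known request set")
	}
	return r, nil
}

// panicMessage runs f and returns the recovered panic message,
// or the empty string if f did not panic.
func panicMessage(f func()) (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	f()
	return ""
}

func main() {
	empty := requestUnion{} // no known field set

	fmt.Println("unchecked:", panicMessage(func() { empty.getInnerUnchecked() }))

	_, err := empty.getInnerChecked()
	fmt.Println("checked:", err)
}
```

Whether returning an error is the right behavior for the real code path is a separate question. The sketch only shows why an empty union turns into a crash rather than a rejected command.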
