Deadlock in addrbook #2955

@ebuchman

Seeds halted on the gaia-9002 testnet: they stopped responding to RPC and logged no new messages. A SIGABRT produced a large dump showing many goroutines blocked on accessing the AddrBook, which suggests a deadlock.

It also included this gem:

Nov 30 22:21:27 ip-10-0-0-195 gaiad[3406]: goroutine 2056 [runnable]:
Nov 30 22:21:27 ip-10-0-0-195 gaiad[3406]: math/rand.(*Rand).Int31(0xc422610690, 0x6ce96401)
Nov 30 22:21:27 ip-10-0-0-195 gaiad[3406]: /usr/local/go/src/math/rand/rand.go:96 +0x47
Nov 30 22:21:27 ip-10-0-0-195 gaiad[3406]: math/rand.(*Rand).Int31n(0xc422610690, 0x40, 0xc400000001)
Nov 30 22:21:27 ip-10-0-0-195 gaiad[3406]: /usr/local/go/src/math/rand/rand.go:128 +0xb4
Nov 30 22:21:27 ip-10-0-0-195 gaiad[3406]: math/rand.(*Rand).Intn(0xc422610690, 0x40, 0x1)
Nov 30 22:21:27 ip-10-0-0-195 gaiad[3406]: /usr/local/go/src/math/rand/rand.go:169 +0x45
Nov 30 22:21:27 ip-10-0-0-195 gaiad[3406]: github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tendermint/libs/common.(*Rand).Intn(0xc420b20370, 0x40, 0x1)
Nov 30 22:21:27 ip-10-0-0-195 gaiad[3406]: /home/greg/go/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tendermint/libs/common/random.go:276 +0x49
Nov 30 22:21:27 ip-10-0-0-195 gaiad[3406]: github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tendermint/p2p/pex.(*addrBook).GetSelectionWithBias(0xc4200ca700, 0x1e, 0x0, 0x0, 0x0)
Nov 30 22:21:27 ip-10-0-0-195 gaiad[3406]: /home/greg/go/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tendermint/p2p/pex/addrbook.go:412 +0x2b9

Specifically, /usr/local/go/src/math/rand/rand.go:169 +0x45 is a panic! It seems to be getting caught (the process kept running instead of crashing), but because we’re not using defer, the mtx in our libs/common/rand isn’t being released, locking up the addrbook.
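To illustrate the failure mode described above, here is a minimal sketch in Go. It is not the actual libs/common code: `lockedRand`, `intnNoDefer`, `intnDefer` and `stillLockedAfterPanic` are hypothetical names made up for this example. It assumes the wrapper locks a mutex, calls math/rand's Intn, and unlocks manually, and that some caller further up the stack recovers the panic. It uses sync.Mutex.TryLock, so it needs Go 1.18 or newer.

```go
package main

import (
	"fmt"
	"math/rand"
	"sync"
)

// lockedRand is a hypothetical stand-in for a mutex-guarded RNG wrapper.
type lockedRand struct {
	mtx sync.Mutex
	rnd *rand.Rand
}

func newLockedRand(seed int64) *lockedRand {
	return &lockedRand{rnd: rand.New(rand.NewSource(seed))}
}

// intnNoDefer unlocks manually, so a panic inside Intn skips the Unlock.
func (r *lockedRand) intnNoDefer(n int) int {
	r.mtx.Lock()
	i := r.rnd.Intn(n) // math/rand panics when n <= 0
	r.mtx.Unlock()
	return i
}

// intnDefer releases the mutex even when Intn panics.
func (r *lockedRand) intnDefer(n int) int {
	r.mtx.Lock()
	defer r.mtx.Unlock()
	return r.rnd.Intn(n)
}

// stillLockedAfterPanic calls f(0), recovers the resulting panic the way a
// caller higher up the stack might, and reports whether the mutex stayed locked.
func stillLockedAfterPanic(r *lockedRand, f func(int) int) bool {
	func() {
		defer func() { _ = recover() }()
		f(0)
	}()
	if r.mtx.TryLock() {
		r.mtx.Unlock()
		return false
	}
	return true
}

func main() {
	a := newLockedRand(1)
	fmt.Println("no defer, still locked:", stillLockedAfterPanic(a, a.intnNoDefer))

	b := newLockedRand(1)
	fmt.Println("defer, still locked:", stillLockedAfterPanic(b, b.intnDefer))
}
```

In the first case the mutex is never released, so every later caller blocks forever on Lock, which would match the symptom in the dump. With defer, the panic still happens but the lock is released on the way out.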

The core of the problem is sdk/vendor/github.com/tendermint/tendermint/p2p/pex/addrbook.go:412 +0x2b9, where we must be passing a 0, which is invalid for the rand methods. For some reason len(a.bucketsOld) is 0 there, and we’ll need to prevent that.

Labels: C:p2p (Component: P2P pkg), T:bug (Type Bug (Confirmed))
