[R4R] fix deadlock when failed to verify state root #834

Merged
unclezoro merged 1 commit into bnb-chain:develop from unclezoro:deadlockOnPipe
Apr 2, 2022
@unclezoro
Contributor

Description

Fix an issue where the node gets stuck when pipecommit is enabled.

Rationale

The node gets stuck and stops importing blocks.

The goroutine profile (deadlock_routines.log) shows:

goroutine 56646460 [chan receive, 1207 minutes]:
github.com/ethereum/go-ethereum/core/state/snapshot.(*diffLayer).WaitAndGetVerifyRes(0xc04f35afd0)
	github.com/ethereum/go-ethereum/core/state/snapshot/difflayer.go:271 +0x34
github.com/ethereum/go-ethereum/core/state.(*StateDB).WaitPipeVerification(0x12d773a474e4ba1)
	github.com/ethereum/go-ethereum/core/state/statedb.go:907 +0x2d
github.com/ethereum/go-ethereum/core.(*BlockValidator).ValidateState.func3()
	github.com/ethereum/go-ethereum/core/block_validator.go:140 +0x38
github.com/ethereum/go-ethereum/core.(*BlockValidator).ValidateState.func5()
	github.com/ethereum/go-ethereum/core/block_validator.go:161 +0x29
created by github.com/ethereum/go-ethereum/core.(*BlockValidator).ValidateState
	github.com/ethereum/go-ethereum/core/block_validator.go:160 +0x40f
goroutine 56646352 [semacquire, 1207 minutes]:
sync.runtime_SemacquireMutex(0x4, 0x40, 0x2766e4c3a1e139fd)
	runtime/sema.go:71 +0x25
sync.(*Mutex).lockSlow(0xc018f0a2e8)
	sync/mutex.go:138 +0x165
sync.(*Mutex).Lock(...)
	sync/mutex.go:81
sync.(*RWMutex).Lock(0x1655b6a81cfca60)
	sync/rwmutex.go:111 +0x36
github.com/ethereum/go-ethereum/core.(*BlockChain).tryRewindBadBlocks(0xc018f0a000)
	github.com/ethereum/go-ethereum/core/blockchain.go:594 +0x5e
github.com/ethereum/go-ethereum/core/state.(*StateDB).Commit.func1()
	github.com/ethereum/go-ethereum/core/state/statedb.go:1421 +0x92
created by github.com/ethereum/go-ethereum/core/state.(*StateDB).Commit
	github.com/ethereum/go-ethereum/core/state/statedb.go:1494 +0x375

The two goroutines above deadlock each other:

  1. When the parent block fails state root verification, it tries to rewind the block, which requires acquiring the blockchain mutex.
  2. The child block has already acquired the blockchain mutex and is waiting for its parent to close the verify channel.

Example

No

Changes

No

unclezoro merged commit f5a1c07 into bnb-chain:develop on Apr 2, 2022