miner: fix deadlock and panic issues in block production #1639

Merged
pratikspatil024 merged 1 commit into v2.2.9-candidate from fix_block_miner on Jul 16, 2025

Conversation

@cffls (Contributor) commented Jul 16, 2025

Description

These fixes address production issues where mining nodes would deadlock for hours after milestone-triggered reorgs, unable to process new blocks or respond to chain updates.

  • Skip stale sealed blocks that are behind the current chain head, so resultLoop does not attempt to write outdated blocks after reorgs
  • Add a 1-second timeout to the chDeps channel send, so it cannot block indefinitely when the receiver is dead or the channel is full
  • Return an error when the transaction count exceeds the dependency list length, preventing an index-out-of-range panic

Changes

  • Bugfix (non-breaking change that solves an issue)
  • Hotfix (change that solves an urgent issue, and requires immediate attention)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (change that is not backwards-compatible and/or changes current functionality)
  • Changes only for a subset of nodes

Checklist

  • I have added at least 2 reviewers or the whole pos-v1 team
  • I have added sufficient documentation in code
  • I will be resolving comments - if any - by pushing each fix in a separate commit and linking the commit hash in the comment reply
  • Created a task in Jira and informed the team for implementation in Erigon client (if applicable)
  • Includes RPC methods changes, and the Notion documentation has been updated

Cross repository changes

  • This PR requires changes to heimdall
    • In case link the PR here:
  • This PR requires changes to matic-cli
    • In case link the PR here:

Testing

  • I have added unit tests
  • I have added tests to CI
  • I have tested this code manually on local environment
  • I have tested this code manually on remote devnet using express-cli
  • I have tested this code manually on amoy
  • I have created new e2e tests into express-cli

@cffls requested a review from a team on July 16, 2025 03:05
@pratikspatil024 merged commit 95d00c9 into v2.2.9-candidate on Jul 16, 2025 (7 of 8 checks passed)
@pratikspatil024 deleted the fix_block_miner branch on July 16, 2025 04:44