cmd/tomo: Add 'repair-snapshot' command to fix 'Masternodes lists are different in checkpoint header & snapshot' error #499
Conversation
Thanks for implementing this fix. However, I find the description a little confusing. From what I understand of the code in this PR and the point of the fix, I think this sentence should change a bit:
to
Running this command with a node built with go1.19+ can't fix the problem, hence
Thank you for the feedback! I've updated the description to:
… different in checkpoint header & snapshot' error (BuildOnViction#499)
* feat: add db repair-snapshot command
* fix: make datadir flag required
* fix: limit first 150 masternode
* refactor: use state to retrieve getCandidateCap instead of making a contract call
* chore: rename confusing variable names
Description
This pull request introduces a new `db` command with a `repair-snapshot` subcommand to resolve inconsistencies in the masternode list that can occur during updates. The issue arises from the use of `sort.Slice` to sort the masternode list when masternodes have identical stake amounts. `sort.Slice` does not guarantee a stable sort, so the relative order of equal-stake masternodes can differ between Go versions, leading to potential discrepancies.

When a new masternode list is updated (e.g., at gap 895), the list is initially stored in the cache but is not immediately saved to the database. If the node crashes during this period, the updated masternode list is lost. Although the `verifyCascadingFields` mechanism provides some level of recovery, it cannot fully resolve the issue because the root cause lies in the version-dependent behavior of `sort.Slice`. This can result in mismatched signer lists when sorting occurs.

Even in scenarios where the node does not crash, there remains a risk of mismatched signer lists between nodes running different Go versions due to inconsistencies in the sorting behavior.
This PR introduces a way to repair snapshots using a compatible Go version, without having to restore the datadir from a snapshot.
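To illustrate the tie-ordering problem, here is a minimal, self-contained sketch. It is not the code from this PR: the `Masternode` type, the addresses, and the `sortByStakeOnly` and `sortDeterministic` helpers are all hypothetical. It shows a comparator that looks only at stake, which leaves the order of equal-stake entries up to the `sort.Slice` implementation, next to a comparator with a secondary key (here the address), which gives the same order regardless of the sort algorithm.

```go
package main

import (
	"fmt"
	"sort"
)

// Masternode is a hypothetical, simplified record used only for this sketch.
type Masternode struct {
	Address string
	Stake   uint64
}

// sortByStakeOnly orders by stake, highest first. Entries with equal stake
// have no defined relative order, because sort.Slice is not guaranteed to
// be stable.
func sortByStakeOnly(ms []Masternode) {
	sort.Slice(ms, func(i, j int) bool { return ms[i].Stake > ms[j].Stake })
}

// sortDeterministic orders by stake, highest first, and breaks ties by
// address. The comparator defines a total order, so the result does not
// depend on the sort algorithm or on the input order.
func sortDeterministic(ms []Masternode) {
	sort.Slice(ms, func(i, j int) bool {
		if ms[i].Stake != ms[j].Stake {
			return ms[i].Stake > ms[j].Stake
		}
		return ms[i].Address < ms[j].Address
	})
}

// addresses extracts the addresses in list order, for easy comparison.
func addresses(ms []Masternode) []string {
	out := make([]string, len(ms))
	for i, m := range ms {
		out[i] = m.Address
	}
	return out
}

func main() {
	ms := []Masternode{
		{Address: "0xc", Stake: 50000},
		{Address: "0xa", Stake: 50000},
		{Address: "0xb", Stake: 70000},
		{Address: "0xd", Stake: 50000},
	}
	sortDeterministic(ms)
	fmt.Println(addresses(ms)) // prints [0xb 0xa 0xc 0xd]
}
```

A tie-breaker like this only helps for lists produced from now on. Lists already recorded under the old comparator depend on the Go version that produced them, which is presumably why the repair described here relies on running with a compatible Go version rather than on changing the comparator.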
Changes
* `db` command, setting the foundation for future subcommands (e.g., `inspect-db`, etc.).
* `repair-snapshot` subcommand to fix the sorting bug caused by `sort.Slice`. This subcommand recalculates and updates the masternode list to ensure it aligns across nodes.

Usage