Migrate successor field from sideband to dedicated table #5024

pwojcikdev merged 7 commits into nanocurrency:develop
Conversation
Pull request overview
This PR migrates the successor field from the block sideband data structure into a dedicated database table as phase 1 of the block table restructuring initiative (#4053). This architectural change makes block records immutable once written, eliminating read-modify-write cycles during block insertion, and paves the way for future optimizations that will enable 50% disk space savings through MDB_APPEND insertion mode.
Changes:
- Introduced a new successor database table and a corresponding successor_view API for managing block successor relationships
- Removed the successor field from the block_sideband structure and updated all constructor calls throughout the codebase
- Implemented a database migration (upgrade_v24_to_v25) that rewrites all blocks to extract successor data into the new table
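Conceptually, the new table is a plain block_hash → successor_hash map. A minimal standalone sketch of the successor_view idea (toy types and member names, not the real nano::store interfaces):

```cpp
#include <array>
#include <cassert>
#include <map>
#include <optional>

// Toy 32-byte hash standing in for nano::block_hash (illustrative only).
using block_hash = std::array<unsigned char, 32>;

// Sketch of the successor_view concept: a dedicated
// block_hash -> successor_hash mapping, so block records themselves
// never need to be rewritten after insertion.
class successor_view
{
public:
	void put (block_hash const & predecessor, block_hash const & successor)
	{
		table_[predecessor] = successor;
	}
	std::optional<block_hash> get (block_hash const & predecessor) const
	{
		auto it = table_.find (predecessor);
		return it == table_.end () ? std::nullopt : std::optional<block_hash>{ it->second };
	}
	void del (block_hash const & predecessor)
	{
		table_.erase (predecessor);
	}

private:
	std::map<block_hash, block_hash> table_;
};
```

With this split, inserting a new block B whose predecessor is A becomes one append to the blocks table plus one put (A, B) here — no read-modify-write of A's record.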
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| nano/store/tables.hpp | Added successor table enum entry |
| nano/store/ledger_store.hpp | Added successor_view member and incremented version to 25 |
| nano/store/ledger_store.cpp | Implemented v24 to v25 migration logic with batch processing and crash recovery |
| nano/store/ledger/successor.hpp | New successor_view API definition with standard CRUD operations |
| nano/store/ledger/successor.cpp | Implementation of successor_view operations delegating to backend |
| nano/store/ledger/block.hpp | Updated block_view constructor to accept successor_view reference |
| nano/store/ledger/block.cpp | Refactored successor operations to use dedicated successor table |
| nano/store/fwd.hpp | Added forward declaration for successor_view |
| nano/store/CMakeLists.txt | Added successor.hpp and successor.cpp to build |
| nano/secure/ledger_processor.cpp | Updated block_sideband constructor calls to remove successor parameter |
| nano/secure/common.cpp | Updated genesis block sideband initialization |
| nano/node/json_handler.cpp | Changed successor access from sideband to successor table lookup |
| nano/node/bootstrap/bootstrap_server.cpp | Updated successor iteration logic to use new API |
| nano/lib/block_sideband.hpp | Removed successor field from block_sideband structure |
| nano/lib/block_sideband.cpp | Removed successor serialization and updated size calculations |
| nano/core_test/migrations.cpp | Removed successor parameter from test sideband construction |
| nano/core_test/ledger_upgrades.cpp | Added comprehensive test for v24 to v25 migration |
| nano/core_test/block_store.cpp | Updated sideband serialization and successor_clear tests |
This looks good. The only sketchy part is the manual migration by copying buffers. A better approach would be to freeze the block sideband class as it was in the previous version and use that to deserialize, then reserialize with the newest version, similar to how it was done for:
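The freeze-and-reserialize pattern suggested here can be sketched with toy types (the field names, layouts, and the migrate_sideband helper are invented for illustration; the real block_sideband has more fields):

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

// Hypothetical frozen copy of the v24 sideband layout.
struct sideband_v24
{
	std::array<uint8_t, 32> successor;
	uint64_t height;
};

// v25 layout with the successor field removed.
struct sideband_v25
{
	uint64_t height;
};

// Deserialize with the frozen v24 layout, reserialize with the v25 layout,
// and hand back the extracted successor for insertion into the new table.
std::pair<std::vector<uint8_t>, std::array<uint8_t, 32>>
migrate_sideband (std::vector<uint8_t> const & v24_bytes)
{
	sideband_v24 old{};
	std::memcpy (old.successor.data (), v24_bytes.data (), 32);
	std::memcpy (&old.height, v24_bytes.data () + 32, sizeof (old.height));

	sideband_v25 fresh{ old.height };
	std::vector<uint8_t> v25_bytes (sizeof (fresh.height));
	std::memcpy (v25_bytes.data (), &fresh.height, sizeof (fresh.height));
	return { v25_bytes, old.successor };
}
```

Keeping the old layout as its own frozen type avoids reasoning about raw byte offsets inside the live class during upgrades.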
And added a progress indicator.
Pull request overview
Copilot reviewed 13 out of 13 changed files in this pull request and generated 6 comments.
Force-pushed from 30dc91b to 907b8b1
Pull request overview
Copilot reviewed 15 out of 15 changed files in this pull request and generated 3 comments.
```cpp
// Populate dedicated successor table from block sideband data
void ledger_store::upgrade_v24_to_v25 ()
{
	logger.info (nano::log::type::ledger_upgrade, "Upgrading database from v24 to v25...");

	// Open with schema_current so we have access to the successor table
	backend.open (schema_current, nano::store::open_mode::read_write);
	{
		release_assert (backend.get_version (backend.tx_begin_read ()) == 24, "unexpected version during upgrade", std::to_string (backend.get_version (backend.tx_begin_read ())));

		// Always clear successor table to ensure clean state before populating
		auto clear_result = backend.clear (nano::store::table::successor);
		release_assert (backend.success (clear_result), "failed to clear successor table during upgrade", backend.error_string (clear_result));

		auto transaction = backend.tx_begin_write ();

		// Smaller batch size for dev runs to potentially trigger edge cases
		const size_t batch_size = nano::is_dev_run () ? 2 : 250000;
		size_t processed = 0;
		auto const total_blocks = backend.count (backend.tx_begin_read (), nano::store::table::blocks);

		// Iterate all blocks using a separate read transaction
		auto iterate_blocks = [this] (auto && func) {
			auto read_txn = backend.tx_begin_read ();
			auto it = nano::store::typed_iterator<nano::block_hash, nano::store::block_w_sideband>{ backend.begin (read_txn, nano::store::table::blocks) };
			auto const end = nano::store::typed_iterator<nano::block_hash, nano::store::block_w_sideband>{ backend.end (read_txn, nano::store::table::blocks) };
			for (; it != end; ++it)
			{
				auto const & [hash, block_w_sideband] = *it;
				func (hash, block_w_sideband);
			}
		};

		iterate_blocks ([this, &transaction, &processed, batch_size, total_blocks] (nano::block_hash const & hash, nano::store::block_w_sideband const & block_w_sideband) {
			// If successor is non-zero, write to successor table
			if (!block_w_sideband.sideband.successor.is_zero ())
			{
				auto status = backend.put (transaction, nano::store::table::successor, hash, block_w_sideband.sideband.successor);
				backend.release_assert_success (status);
			}

			processed++;
			if (processed % batch_size == 0)
			{
				double const percentage = total_blocks > 0 ? (static_cast<double> (processed) / total_blocks * 100.0) : 0.0;
				logger.info (nano::log::type::ledger_upgrade, "Processed {} blocks ({:.1f}%)", processed, percentage);
				transaction.refresh ();
			}
		});

		logger.info (nano::log::type::ledger_upgrade, "Done processing {} blocks", processed);
		version.put (transaction, 25);
	}
```
upgrade_v24_to_v25 currently only copies successor hashes into the new table, but does not rewrite the block records to remove/zero the successor bytes in the sideband. This leaves block_sideband.successor populated for legacy blocks (and always zero for new blocks after this change), making the field's semantics inconsistent and diverging from the stated goal of migrating successor out of the block record. Either update the migration to also strip/clear the successor field in stored block values, or adjust the version bump/upgrade description to reflect that the on-disk block format is unchanged in v25.
```cpp
	pending,
	pruned,
	successor,
	vote,
```
Adding table::successor requires updating backend table handling that switches over nano::store::table (e.g. RocksDB count_is_exact currently doesn’t include this new enum value). If NANO_WARN_TO_ERR/-Wswitch is enabled this can break builds, and at runtime it may cause count() behavior for the successor table to differ from other tables. Please update the relevant switch statements to explicitly handle table::successor.
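A minimal illustration of the -Wswitch point (the enum values and the per-table count_is_exact behavior are invented here, not the real RocksDB backend):

```cpp
#include <cassert>

// Illustrative enum mirroring nano::store::table (entries invented here).
enum class table
{
	blocks,
	pending,
	pruned,
	successor,
	vote,
};

// With -Wswitch (or NANO_WARN_TO_ERR) enabled and no `default:` label,
// the compiler flags any enumerator missing from the switch, so a newly
// added table::successor cannot be silently left unhandled.
bool count_is_exact (table t)
{
	switch (t)
	{
		case table::blocks:
		case table::pending:
			return false; // large tables: estimated counts only (assumption)
		case table::pruned:
		case table::successor: // new enumerator handled explicitly
		case table::vote:
			return true;
	}
	return false; // unreachable once every enumerator is covered
}
```

Using a `default:` label would silence the warning and hide exactly the kind of omission this comment describes.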
```cpp
			{
				double const percentage = total_blocks > 0 ? (static_cast<double> (processed) / total_blocks * 100.0) : 0.0;
				logger.info (nano::log::type::ledger_upgrade, "Processed {} blocks ({:.1f}%)", processed, percentage);
				transaction.refresh ();
```
I previously ran into issues with transaction.refresh() and LMDB when inserting large amounts of data. When using refresh, the affected tables consistently ended up consuming significantly more disk space than when inserting the same data without refreshing the transaction.
Because of that, I wanted to ask: after the successor upgrade, does the resulting ledger size roughly match the expected size on your end, or did you also notice a few extra gigabytes of overhead?
I initially suspected LMDB page splits caused by inserting unsorted keys, since LMDB needs to split pages more often in that case. This should not apply here, because the successor inserts block hashes in sorted order. I just wanted to double-check that assumption and make sure no unexpected LMDB behavior is at play here.
I noticed the same issue when working on a separate block sideband upgrade (adding a topological index to speed up bootstrap). Because the write pattern was effectively random, it was both slow and caused the ledger to grow to ~200GB.
@RickiNano could you run it against a copy of the current ledger and let us know how much the ledger grows afterward?
Even if we run the upgrade with LMDB in append mode, I suspect the DB will still grow over time once normal operation resumes, since subsequent inserts may not be append-only.
If the numbers look reasonable, we should merge it.
@pwojcikdev
I just upgraded a recent LMDB production ledger now using this branch.
215,345,749 blocks in ledger.
Upgrade time: 15 minutes
Size before upgrade: 127,860,494,336 bytes
Size after upgrade: 141,541,904,384 bytes
The new successor table adds 32 + 32 bytes for each block (but only if it has a successor).
32 bytes can be removed from each block's sideband (even for blocks without a successor). That is about 7 GB.
When/if the block table split is introduced, the total size will be reduced by approximately 50% for LMDB.
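The reported figures line up: the observed on-disk growth is close to 64 bytes (32-byte key + 32-byte value) per block, consistent with nearly every block having a successor entry, before accounting for LMDB page overhead. A quick arithmetic check of the numbers above:

```cpp
#include <cassert>
#include <cstdint>

// Figures reported above for the production LMDB ledger upgrade.
constexpr std::uint64_t blocks = 215'345'749ULL;
constexpr std::uint64_t size_before = 127'860'494'336ULL;
constexpr std::uint64_t size_after = 141'541'904'384ULL;

// New successor table cost: 32-byte key + 32-byte value per entry,
// ignoring LMDB page and B-tree overhead.
constexpr std::uint64_t table_cost = blocks * 64; // ~13.8 GB upper bound

// Observed on-disk growth after the upgrade.
constexpr std::uint64_t observed_growth = size_after - size_before; // ~13.7 GB

// Future saving once the 32-byte successor is stripped from every sideband.
constexpr std::uint64_t sideband_saving = blocks * 32; // ~6.9 GB ("about 7 GB")
```

The observed growth sitting just under the raw 64-bytes-per-block cost suggests no unexpected LMDB blow-up from the sorted-key inserts.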
Force-pushed from 907b8b1 to 13d5fa3
This PR moves the successor field out of the sideband data and into a new database table named successors. It is a rewrite of #4967.
It extracts the successor field from block_sideband into a new successor database table (block_hash -> successor_hash)
Block records are now immutable once written, eliminating predecessor read-modify-write cycles on every block insertion
This is phase 1 of the block table restructuring described in #4053. It is a prerequisite for the full block table split, which enables MDB_APPEND insertion giving 50% disk space savings.
Migration (upgrade_v24_to_v25)
Rewrites every block entry in the database in a single pass:
I have tested with both LMDB and RocksDB
Claude Code has been assisting in writing the code.