Witness payloads are stored once per block (~every 2s) and can exceed 10MB. Storing them in Pebble increases write amplification, flush/compaction pressure, and contention on a resource shared by all chain data. Moving witness blobs to the filesystem reduces DB I/O while preserving existing behavior.
- Introduces a WitnessStore interface that abstracts witness blob storage, with two backends: database (existing behavior) and filesystem (via the --witness.filestore flag).
- The filesystem backend writes one file per block using atomic temp+rename and 2-level hex directory sharding, and keeps size metadata in Pebble for P2P queries.
- Reads fall back to Pebble transparently, enabling seamless migration without downtime. Old witnesses are naturally pruned away within 6400 blocks.
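The shape of the abstraction can be sketched as below. The method names and signatures are illustrative (based on the ReadWitness/HasWitness/DeleteWitness accessors mentioned in the review), and the in-memory backend exists only to show the interface contract; the real backends are Pebble and the filesystem.

```go
package main

import (
	"errors"
	"fmt"
)

// WitnessStore abstracts witness blob storage. Method names are
// illustrative; the actual interface in the PR may differ.
type WitnessStore interface {
	WriteWitness(blockHash [32]byte, data []byte) error
	ReadWitness(blockHash [32]byte) ([]byte, error)
	HasWitness(blockHash [32]byte) bool
	DeleteWitness(blockHash [32]byte) error
}

// memStore is a toy in-memory backend used only to demonstrate the
// contract; it is not one of the PR's backends.
type memStore struct {
	m map[[32]byte][]byte
}

func newMemStore() *memStore { return &memStore{m: make(map[[32]byte][]byte)} }

func (s *memStore) WriteWitness(h [32]byte, data []byte) error {
	s.m[h] = data
	return nil
}

func (s *memStore) ReadWitness(h [32]byte) ([]byte, error) {
	d, ok := s.m[h]
	if !ok {
		return nil, errors.New("witness not found")
	}
	return d, nil
}

func (s *memStore) HasWitness(h [32]byte) bool { _, ok := s.m[h]; return ok }

func (s *memStore) DeleteWitness(h [32]byte) error {
	delete(s.m, h)
	return nil
}

func main() {
	var store WitnessStore = newMemStore()
	h := [32]byte{0xab, 0x12}
	_ = store.WriteWitness(h, []byte("rlp-encoded witness"))
	data, _ := store.ReadWitness(h)
	fmt.Println(store.HasWitness(h), len(data)) // true 19
}
```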
Witness file path layout
<chaindata>/witnesses/<hex[0:2]>/<hex[2:4]>/0x<block-hash>.rlp
Witnesses are stored under a witnesses/ subdirectory within chaindata to keep them co-located with chain data while isolated from Pebble's internal files. A 2-level hex directory sharding (ab/cd/) distributes files across up to 65,536 buckets, avoiding filesystem performance degradation from large flat directories. With ~6400 files retained at any time, each bucket holds fewer than 1 file on average. The block hash is used as the filename since it's the primary key for all witness operations, and the .rlp extension indicates the encoding format.
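The sharded path above can be computed with a few lines of Go. The helper name witnessPath is hypothetical; only the layout it produces comes from the description.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// witnessPath builds the sharded layout described above: the first four
// hex characters of the block hash select a 2-level directory bucket
// (256 * 256 = 65,536 buckets), and the full hash names the file.
func witnessPath(chaindata, blockHashHex string) string {
	return filepath.Join(chaindata, "witnesses",
		blockHashHex[0:2], blockHashHex[2:4],
		"0x"+blockHashHex+".rlp")
}

func main() {
	fmt.Println(witnessPath("/data/chaindata", "abcd1234"))
	// /data/chaindata/witnesses/ab/cd/0xabcd1234.rlp
}
```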
Transition behaviour
Pebble to Filesystem (enable --witness.filestore): New witnesses go to the filesystem. Old witnesses in Pebble remain accessible via the read fallback and are naturally pruned within 6400 blocks. No manual intervention needed.
Filesystem to Pebble (disable --witness.filestore): New witnesses go to Pebble. Filesystem witnesses become orphaned, that is, the DB backend does not read or delete them. A warning is logged at startup: Witness directory exists but --witness.filestore is disabled. Orphaned files can be removed manually with rm -rf <chaindata>/witnesses/.
Filesystem to Pebble to Filesystem (round-trip): When switching back to filesystem mode, a small number of orphaned files from the first filesystem period may persist. The pruner's cursor advanced during the Pebble period and will not revisit those earlier block heights. These files are harmless and can be ignored or removed manually.
Benchmark results
Write (the main operation this optimizes)
| Size  | DB (Pebble on disk)  | FS                   | FS advantage                   |
|-------|----------------------|----------------------|--------------------------------|
| 100KB | ~1ms, 3.3KB alloc    | ~200μs, 2.5KB alloc  | ~5x faster, similar allocs     |
| 1MB   | ~12ms, 2.3MB alloc   | ~320μs, 2.5KB alloc  | ~38x faster, ~900x less memory |
| 5MB   | ~48ms, 10.9MB alloc  | ~1.6ms, 2.5KB alloc  | ~30x faster, ~4400x less memory |
The DB backend's allocations scale with payload size due to Pebble's internal memtable copies and compaction buffers, while latency is dominated by flush/compaction overhead. The FS backend maintains a constant ~2.5KB allocation regardless of payload size.
Read
| Size  | DB (Pebble on disk) | FS     |
|-------|---------------------|--------|
| 100KB | ~3.8μs              | ~29μs  |
| 1MB   | ~120μs              | ~122μs |
| 5MB   | ~200μs              | ~279μs |
At realistic witness sizes (1–10MB), read performance is nearly equivalent between the two backends. Hot reads are served by the in-memory witnessCache LRU and rarely hit disk.
Delete
| Backend             | Latency |
|---------------------|---------|
| DB (Pebble on disk) | ~2μs    |
| FS                  | ~100μs  |
FS deletes are slower (an unlink syscall vs. writing a tombstone marker), but pruning runs in the background every 120s on old blocks and is not latency-sensitive.
Changes
Bugfix (non-breaking change that solves an issue)
Hotfix (change that solves an urgent issue, and requires immediate attention)
New feature (non-breaking change that adds functionality)
Breaking change (change that is not backwards-compatible and/or changes current functionality)
Changes only for a subset of nodes
Breaking changes
Please complete this section if any breaking changes have been made, otherwise delete it
Nodes audience
In case this PR includes changes that must be applied only to a subset of nodes, please specify how you handled it (e.g. by adding a flag with a default value...)
Checklist
I have added at least 2 reviewers or the whole pos-v1 team
I have added sufficient documentation in code
I will be resolving comments - if any - by pushing each fix in a separate commit and linking the commit hash in the comment reply
Created a task in Jira and informed the team for implementation in Erigon client (if applicable)
Includes RPC methods changes, and the Notion documentation has been updated
Cross repository changes
This PR requires changes to heimdall
In case link the PR here:
This PR requires changes to matic-cli
In case link the PR here:
Testing
I have added unit tests
I have added tests to CI
I have tested this code manually on local environment
I have tested this code manually on remote devnet using express-cli
I have tested this code manually on amoy
I have created new e2e tests into express-cli
Manual tests
Please complete this section with the steps you performed if you ran manual tests for this functionality, otherwise delete it
Additional comments
Please post additional comments in this section if you have them, otherwise delete it
❌ Patch coverage is 97.95918% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 51.39%. Comparing base (46e2631) to head (79a3058). ⚠️ Report is 15 commits behind head on develop.
This PR introduces a well-designed abstraction for witness storage with two backends: the existing Pebble database and a new filesystem-based store. The implementation is clean, well-tested, and addresses a legitimate performance concern with storing large witness blobs in Pebble.
The fallback-to-Pebble logic in ReadWitness and HasWitness enables transparent migration without downtime. Old witnesses are naturally pruned within 6400 blocks.
3. Atomic Writes (witness_store_fs.go:45-69)
Using temp files with atomic rename prevents partial writes on crashes:
Good update to use real Pebble on disk instead of memory DB, making the benchmark results more realistic and validating the PR's claims.
Suggestions
1. Minor: Consider os.O_SYNC for Critical Data (optional)
At witness_store_fs.go:54, os.WriteFile doesn't guarantee durability until the next fsync. For critical blockchain data, you might consider using explicit os.OpenFile with os.O_SYNC or calling file.Sync() before close. However, since witnesses are re-fetchable from peers and the atomic rename already prevents corruption, this is likely acceptable.
2. Consider Documenting Size Key Inconsistency
The FS backend uses binary.BigEndian.PutUint64 for size encoding (witness_store_fs.go:64), while the DB backend uses encodeBlockNumber (accessors_state.go:327). Both produce the same result, but documenting this alignment would help future maintainers.
3. SonarQube Security Hotspots
SonarQube flagged 7 security hotspots. These are likely related to:
File operations with user-controlled paths (directory creation)
Permission settings (0755, 0644)
These appear to be false positives since:
The witness directory path is derived from the data directory, not user input
The file permissions are appropriate for blockchain data
Potential Issues
1. Coverage Gap in database.go
Codecov shows missing coverage at core/rawdb/database.go. Based on the report, this is likely the orphan warning path:
if opts.WitnessStoreDir != "" {
    if _, err := os.Stat(opts.WitnessStoreDir); err == nil {
        log.Warn("Witness directory exists but --witness.filestore is disabled...")
    }
}
This is a non-critical warning path.
2. Config Coverage Gap
The partial coverage at internal/cli/server/config.go:1608 appears to be the assignment n.WitnessFileStore = c.Witness.FileStore, which is difficult to test in isolation.
Questions for the Author
Delete during migration: When deleting a witness that exists only in Pebble (pre-migration), DeleteWitness calls os.Remove on a non-existent file, which logs nothing due to the os.IsNotExist check. Then it cleans Pebble. Is this the intended behavior? (Looks correct, just confirming.)
Enabled by default: The flag is now enabled by default (DefaultConfig().Witness.FileStore = true). Is there documentation or a changelog entry planned to inform operators about this change?
Verdict
Approve - This is a well-implemented feature that addresses a real performance concern. The code is clean, well-documented, and thoroughly tested. The migration path is thoughtful, and the benchmarks validate the claimed improvements.
Minor suggestions above are optional improvements that don't block merging.