Motivation
Users running with --prune.include-commitment-history want to bound the on-disk size of commitment-history snapshots without resyncing. The download-side half (skipping old commitment-history files at sync time) is tracked in a companion issue and is straightforward. This issue covers the harder half: physically deleting commitment-history .kv + inverted-index files that are already on disk.
Why this is hard
Commitment-history files are not block/receipt snapshots — they are managed by the state aggregator (db/state/aggregator.go) and have a reader-owned lifecycle:
- .kv files are mmapped and may be held open by live aggregator readers (RPC, txpool, sync stages).
- Files transition through dirtyFiles → frozen visible files; the canDelete flag is the only safe way to retire them.
- Physical os.Remove must happen on the reader-cleanup path once the last reference drops; otherwise live mmaps are invalidated and readers crash or produce garbage.
- Existing aggregator code opts most domains out of frozen-file deletion; commitment history needs an explicit opt-in (e.g. a deleteFrozen flag on the domain config).
- The prune cycle is driven by staged sync, so a new pruneCommitmentHistorySnapshots step has to coordinate with the aggregator rather than touch files directly.
- Restart behaviour matters: tightening the retention between runs has to clean up leftover files on the next prune cycle; widening must be rejected (or trigger a resync) because the pruned files no longer exist.
Proposed scope
- Add an aggregator path that marks commitment-history files older than the retention boundary as canDelete, removes them from dirtyFiles, and lets the reader-cleanup path physically remove them.
- New staged-sync step (e.g. pruneCommitmentHistorySnapshots) that runs alongside block/receipt pruning but defers to the aggregator's file lifecycle.
- Idempotent: repeated invocations on an already-pruned datadir must no-op.
- Tests: downloader delete notification, no-op on repeat prune, immediate cleanup of unpinned dirty files, deferred cleanup while frozen files have active readers.
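As a rough illustration of the selection and idempotency requirements, the sketch below picks files that lie entirely below a retention boundary and shows that a second pass is a no-op. Everything here is assumed for illustration: fileRange, prune, checkRetention, the half-open [from, to) ranges and the idea of persisting the previous boundary are hypothetical and not taken from the existing aggregator.

```go
package main

import (
	"errors"
	"fmt"
)

// fileRange is a hypothetical description of one commitment-history file
// covering the half-open range [from, to).
type fileRange struct {
	name     string
	from, to uint64
}

var errRetentionWidened = errors.New("retention widened: pruned files no longer exist")

// checkRetention compares the boundary stored by the previous run with the
// requested one. A lower boundary would need files that were already
// deleted, so it is rejected; an equal or higher boundary is accepted.
func checkRetention(prevBoundary, newBoundary uint64) error {
	if newBoundary < prevBoundary {
		return errRetentionWidened
	}
	return nil
}

// prune splits files into those kept and those that lie entirely below
// the boundary. Files straddling the boundary are kept.
func prune(files []fileRange, boundary uint64) (kept, removed []fileRange) {
	for _, f := range files {
		if f.to <= boundary {
			removed = append(removed, f)
		} else {
			kept = append(kept, f)
		}
	}
	return kept, removed
}

func main() {
	files := []fileRange{
		{name: "0-64", from: 0, to: 64},
		{name: "64-128", from: 64, to: 128},
		{name: "128-192", from: 128, to: 192},
	}

	kept, removed := prune(files, 128)
	fmt.Println("first pass removed:", len(removed))

	// Running again on the surviving set finds nothing left to remove.
	_, removed = prune(kept, 128)
	fmt.Println("second pass removed:", len(removed))

	fmt.Println("widening check:", checkRetention(128, 64))
}
```

In a real step the removed set would be handed to the aggregator to mark as canDelete rather than deleted directly, and a rejected boundary could alternatively trigger a resync.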
Dependencies / ordering
Implementing this without the download-side filter is fine in principle, but the user value is much higher once they're combined — the download filter prevents the disk hit on fresh syncs; this issue handles the existing-datadir case.
Prior art
PR #21021 attempted this together with the download filter. The pruning portion is the most complex and risk-sensitive part of that change. Reviving it should probably happen after the download-filter half lands, so the moving parts can be evaluated independently.
References
- db/state/aggregator.go: file lifecycle and canDelete mechanics
- execution/stagedsync/stage_snapshots.go: where the new prune step would live