fix(provider): purge keystore datastore after reset#11198
guillaumemichel merged 22 commits into `master`
Conversation
LGTM on the factory approach and tests.
One concern: the old keystore data at /provider/keystore/{0,1,active} inside the shared repo datastore is now orphaned.
Three ideas:

1. On first start with the new code, check if the shared repo datastore has keys under `/provider/keystore/` and, if so, purge them. This is a one-time migration that cleans up the old data automatically. Without this, people who reported increased disk usage will be stuck with the last queue, no?
2. Wire up `ipfs provide clear` (or similar) to do a hard reset: stop the provide system, purge both the old shared-datastore keys and the new `provider-keystore/` directory, and restart the provide schedule fresh. Useful for operators who want to reclaim disk space immediately without waiting for the next reprovide cycle.
3. IIUC `ipfs diag datastore` is blind to these new datastores; it would be good to have a means to inspect them, maybe by adding `--instance=default|provider0|1`.

(1) feels like a blocker for this PR; (2) and (3) could be follow-up issues if they are too involved to implement.
Thanks for the review @lidel! I implemented (1), so the orphaned keystore datastore prefix is now purged when migrating. IMO (2) doesn't really apply: because we don't have a better way to track when CIDs should stop being advertised (see #11150), we have to rely on the periodic keystore reset. The goal of (3) makes sense.
mount keystore datastores to `/provider/keystore/0` and `/1` so that they are included in the `ipfs diag datastore` command
`destroyDs` calls `os.RemoveAll` with a suffix from the upstream library. If the suffix were ever `".."` or empty, this could delete the wrong directories. Validate that the suffix is `"0"` or `"1"` in both `createDs` and `destroyDs`.
If opening datastore `"0"` succeeded but `"1"` failed, `MountKeystoreDatastores` returned an error without closing `"0"`, leaking the open handle.
Avoids allocating a datastore batch when no orphaned keys exist.
`findRootDatastoreSpec` silently returns wrapper specs it doesn't know about. If a plugin adds a wrapper with a `child` field, `openDatastoreAt` gets the wrapper instead of the leaf backend and fails confusingly. Log a warning so operators can spot the issue.
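The idea can be sketched like this. The `findLeafSpec` name and the `knownWrappers` set are hypothetical stand-ins; the real wrapper list lives in the fsrepo config code.

```go
package main

import (
	"fmt"
	"log"
)

// knownWrappers lists wrapper datastore types whose "child" field this
// code understands (hypothetical set for illustration).
var knownWrappers = map[string]bool{"measure": true, "log": true}

// findLeafSpec descends through known wrapper specs to the leaf backend
// spec. When it meets an unknown wrapper type that still has a "child"
// field (e.g. one added by a plugin), it returns the wrapper as before,
// but now logs a warning so operators can tell why opening the datastore
// later fails.
func findLeafSpec(spec map[string]interface{}) map[string]interface{} {
	for {
		t, _ := spec["type"].(string)
		child, ok := spec["child"].(map[string]interface{})
		if !ok {
			return spec // no child: this is the leaf backend spec
		}
		if !knownWrappers[t] {
			log.Printf("warning: unknown wrapper datastore type %q; using it as-is", t)
			return spec
		}
		spec = child
	}
}

func main() {
	spec := map[string]interface{}{
		"type": "measure",
		"child": map[string]interface{}{
			"type": "levelds",
			"path": "datastore",
		},
	}
	fmt.Println(findLeafSpec(spec)["type"]) // levelds
}
```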
- explain why `context.Background()` is used in the migration code
- add changelog note about the provide cycle restarting on upgrade
- add downgrade caveat about the orphaned `provider-keystore` directory
(Note to myself)
Reviewed up to this point: c22c16d 👍🏻
- chunk orphan purge into 4096-key batches to bound memory and match existing batching patterns in the same file
- cancel the purge context via `fx.Lifecycle` `OnStop` so SIGINT during startup does not block indefinitely
- deep-copy slices in `copySpec` (not just maps) so the function matches its documented "deep-copy" contract
- return `nil` from `findRootDatastoreSpec` when no `/` mount exists, so callers fall back to in-memory instead of passing a mount-type spec to `openDatastoreAt`
- rename local variable to avoid shadowing the `mount` package import
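The batching and deferred allocation can be sketched as follows. This is a self-contained sketch with a stand-in `batcher` interface; the real code uses the go-datastore `Batch` API and iterates query results rather than a key slice.

```go
package main

import "fmt"

// purgeBatchSize bounds how many deletes are buffered before a commit;
// 4096 is 4x a bufferedBatchSize of 1024 (per the PR notes).
const purgeBatchSize = 4096

// batcher is a stand-in for the subset of the go-datastore Batch API used here.
type batcher interface {
	Delete(key string) error
	Commit() error
}

// purgeKeys deletes keys in bounded batches. The batch is allocated
// lazily, so nothing is allocated when there are no orphaned keys.
func purgeKeys(keys []string, newBatch func() batcher) error {
	var b batcher
	pending := 0
	for _, k := range keys {
		if b == nil {
			b = newBatch() // deferred allocation: only when there is work
		}
		if err := b.Delete(k); err != nil {
			return err
		}
		pending++
		if pending == purgeBatchSize {
			if err := b.Commit(); err != nil {
				return err
			}
			b, pending = nil, 0
		}
	}
	if b != nil {
		return b.Commit() // flush the final partial batch
	}
	return nil
}

// countingBatch records calls, for demonstration.
type countingBatch struct{ deletes, commits *int }

func (c countingBatch) Delete(string) error { *c.deletes++; return nil }
func (c countingBatch) Commit() error       { *c.commits++; return nil }

func main() {
	deletes, commits := 0, 0
	_ = purgeKeys(make([]string, 10000), func() batcher { return countingBatch{&deletes, &commits} })
	fmt.Println(deletes, commits) // 10000 3: two full batches of 4096, one of 1808
}
```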
- add `ipfs diag datastore put` subcommand for writing arbitrary key-value pairs to the datastore (offline, experimental)
- add `DatastorePut` harness helper for CLI tests
- add `TestProviderKeystoreMigrationPurge`: seeds orphaned keystore keys via `put`, starts the daemon to trigger migration, verifies the orphaned keys are purged and the `provider-keystore/` dir is created
- add put/get roundtrip test for `diag datastore`
@guillaumemichel LGTM, manually tested and works as expected as well.
Pushed some final changes (high-level details below); lmk if any concerns, otherwise we can merge libp2p/go-libp2p-kad-dht#1233 and then switch this PR to go-libp2p-kad-dht@master.
Pushed changes
Hardening (against bit flips, or big repositories with huge old keystores):
- orphan purge: batched deletes (4096 keys), cancellable via SIGINT, deferred batch allocation
- I was unsure what a sane batch size is; 4096 is 4x `bufferedBatchSize`
- `destroyDs`: validate suffix is `"0"` or `"1"` before `os.RemoveAll`
- `findRootDatastoreSpec`: graceful fallback when no `/` mount exists, warn on unknown wrapper types
- `MountKeystoreDatastores`: close already-opened datastores on partial failure
- `copySpec`: deep-copy slices too, not just maps
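The `copySpec` fix boils down to recursing into `[]interface{}` values as well as maps. A sketch of the idea (`copyValue` is a hypothetical helper name, not the PR's code):

```go
package main

import "fmt"

// copyValue deep-copies the JSON-like map/slice structure of a datastore
// spec. Without the []interface{} case, a "deep copy" would still share
// slice backing arrays with the original, so mutating a mount list in the
// copy would corrupt the source spec.
func copyValue(v interface{}) interface{} {
	switch t := v.(type) {
	case map[string]interface{}:
		out := make(map[string]interface{}, len(t))
		for k, e := range t {
			out[k] = copyValue(e)
		}
		return out
	case []interface{}:
		out := make([]interface{}, len(t))
		for i, e := range t {
			out[i] = copyValue(e)
		}
		return out
	default:
		return t // scalars are immutable and safe to share
	}
}

func main() {
	orig := map[string]interface{}{
		"type":   "mount",
		"mounts": []interface{}{map[string]interface{}{"mountpoint": "/"}},
	}
	cp := copyValue(orig).(map[string]interface{})
	// Mutating the copy must not leak into the original.
	cp["mounts"].([]interface{})[0].(map[string]interface{})["mountpoint"] = "/blocks"
	fmt.Println(orig["mounts"].([]interface{})[0].(map[string]interface{})["mountpoint"]) // /
}
```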
Observability:
- log datastore open/remove with suffix and path in `createDs`/`destroyDs` callbacks
Testing:
- added `ipfs diag datastore put` (experimental, offline-only) to complement `get`/`count`
- added `TestProviderKeystoreMigrationPurge`: seeds orphaned keys, starts daemon, verifies purge
Docs:
- changelog entry tweaked to focus on what the problem was and how it was fixed
Manual test looks ok
Fresh node, SweepEnabled=true, Interval=30s, provider logging enabled. Pre-seeded 5 orphaned keys to simulate old Kubo data.
Observed across 3 reprovide cycles:
- orphaned keys purged on first start
- namespaces alternate correctly (0→1→0→1)
- only the active namespace directory exists on disk after each swap
- all orphaned keys gone after shutdown
Fixes #11096
Context
The `SweepingProvider`'s keystore periodically resets its contents (replacing all stored keys with the current set of CIDs to reprovide) because of the lack of a CID tracking database (#11150).

Previously, the keystore used `namespace.Wrap` to carve out a namespace inside the shared repo datastore. When a reset occurred, old keys were deleted one-by-one from that shared datastore. This meant the underlying storage engine (LevelDB, Pebble, etc.) could retain the deleted data as tombstones until a background compaction eventually reclaimed the space (if it ever did before shutdown). On large nodes, this caused persistent disk bloat from stale provider data that was logically deleted but physically still on disk.

Solution
Use the new `WithDatastoreFactory()` option introduced in libp2p/go-libp2p-kad-dht#1233. Instead of namespacing within the shared repo datastore, the keystore now creates physically separate datastores under `/provider-keystore/{0,1,meta}`. On each reset cycle, the retired datastore's directory is deleted with `os.RemoveAll()`. This guarantees immediate and complete disk reclamation.
The datastore type (LevelDB, Pebble, etc.) is determined from the repo's existing `Datastore.Spec` config via `fsrepo.AnyDatastoreConfig`, so it automatically matches whatever backend the user has configured.

Changes
- use `WithDatastoreFactory()` from feat(provider/keystore): alt datastore wipe libp2p/go-libp2p-kad-dht#1233 for the provider `ResettableKeystore`, enabling full disk reclamation on keystore reset