fix(blockchain/peer-registry): post-#832 follow-ups#988
Conversation
Six small follow-ups from Siggi's re-review (bsv-blockchain#832) and Copilot nits left explicitly deferred for a follow-up PR. 1. bytesToBlockHash no longer writes to os.Stderr (peer_registry_grpc.go). Signature is now (Hash, error); protoToPeerInfo accepts a ulogger.Logger and surfaces the error via the structured logger. PeerRegistryClient gains a SetLogger method mirroring the registry's, plumbed from daemon_stores so production decoder warnings route through ulogger. 2. StartPeriodicSave moves the savePeerRegistryPeriodically loop onto the registry's own WaitGroup (peer_registry.go). Close() now drains it alongside ban-decay and TTL cleanup; tickers after Stop are impossible. The Server.go wrapper is gone. 3. persistedRegistry.Version is enforced (peer_registry_persistence.go). A blob whose Version is newer than persistedRegistryVersion is rejected with a clear error rather than silently dropping unknown fields on downgrade. 4. AddBanScore renames the param points -> defaultPoints (peer_registry.go). The comment now states explicitly that banConfig.ReasonPoints overrides the caller's value for configured reasons. Interface and gRPC client follow. 5. SetLogger / log() use atomic.Value (peer_registry.go). Previously the field was written under r.mu and read without; production flow happens to be single-call-then-many- reads, but the asymmetry was a footgun if SetLogger ever becomes runtime-reconfigurable. Switched to atomic.Value via a loggerHolder wrapper so concurrent log() readers are race-free without taking r.mu. 6. Sidecar archive key uses UnixNano (peer_registry_persistence.go). Second granularity made two corruption events in the same second collide via WithAllowOverwrite=true. Tests: - TestPersistence_RejectsFutureVersion (Version gate) - TestRegistry_SetLoggerNoRace / _NilIsNoop (atomic logger) - TestPersistence_CorruptBlobArchivedToSidecar updated for nanosecond - TestBlockchain_SavePeerRegistryPeriodically rewritten around StartPeriodicSave + Close drain - TestGRPC_BlockHashBytes* now check the new (h, err) shape go vet clean. 1162 tests pass with -race across blockchain, p2p, daemon.
|
🤖 Claude Code Review Status: Complete SummaryThis PR successfully implements 6 post-#832 follow-ups with solid test coverage and careful attention to concurrency correctness. Key changes verified:
Test coverage: 1162 tests pass under -race, including new tests for all major changes. No issues found. The implementation is clean, well-documented, and follows project conventions. |
Benchmark Comparison ReportBaseline: Current: Summary
All benchmark results (sec/op)
Threshold: >10% with p < 0.05 | Generated: 2026-06-01 16:15 UTC |
…ockers Two blockers from the PR bsv-blockchain#988 review, plus the two non-blocking nits called out alongside. ## Blocker 1 — Sidecar test was platform-dependent The previous TestPersistence_CorruptBlobArchivedToSidecar probed the keyspace at 1µs steps to find the archived blob. macOS's UnixNano() is microsecond-aligned (mod 1000 == 0), so the loop happened to land on the real key. Linux returns true-ns timestamps, which means the loop misses ~99.9% of the time. The test would have failed in CI. Fix: surface the archive key from the registry. CentralizedPeerRegistry now records the most recent sidecar key in an atomic.Pointer[string], exposed via LastCorruptArchiveKey(). The test reads the key directly instead of guessing it from wall-clock bounds — no more clock-resolution dependency. Also useful for operator post-mortems (the key can be exposed via gRPC / diagnostics later). ## Blocker 2 — Version rejection was undone by the next save Load returned an error on future-version blobs but Server.go only logged Warnf and continued. StartPeriodicSave still launched. On the first tick (or the final shutdown Save), an empty Version=1 envelope overwrote the operator's future-version data via WithAllowOverwrite=true. The version check was therefore documentation, not protection. Fix: registry-level saveDisabled atomic.Bool. Load detects errFutureRegistryVersion and latches the flag before returning. Save becomes a no-op when saveDisabled is set; StartPeriodicSave refuses to start. saveDisabled is one-way — clearing it requires constructing a fresh registry, which is the correct contract (the operator must explicitly decide whether to roll forward or restore a backup). TestPersistence_FutureVersionDisablesPersistence verifies the end-to-end behaviour: the original bytes on disk are unchanged after Load, Save, and StartPeriodicSave + Close. ## Nit 1 — PeerRegistryClient.logger now atomic.Value Matches the hardening applied to CentralizedPeerRegistry in bsv-blockchain#988. log() runs on every GetPeer / ListPeers RPC, so a plain field + post- construction SetLogger was the same race shape that motivated the registry change. Same loggerHolder wrapper, same nil-no-op semantics, plus TestPeerRegistryClient_SetLoggerNoRace as a -race guardrail. ## Nit 2 — GetPeerRegistryClient guarded by sync.Once Pre-existing race in the singleton init: nil-check + assign without a mutex meant two concurrent first-callers could each create a client (and only one would be retained). Now serialized by Stores.peerRegistryClientOnce. Error path stored on Stores.peerRegistryClientErr so the Once contract is preserved (the function is called once; the error is replayed). Other singleton getters in this file have the same shape but are out of scope for this PR. Tests: - TestPersistence_CorruptBlobArchivedToSidecar rewritten around LastCorruptArchiveKey (Linux-safe). - TestPersistence_FutureVersionDisablesPersistence: end-to-end blob protection. - TestPeerRegistryClient_SetLoggerNoRace and _NilIsNoop. go vet clean. Tests pass with -race on services/blockchain, services/p2p, daemon.
ordishs
left a comment
There was a problem hiding this comment.
Approve with minor comments.
Disciplined, well-tested cleanup — minimal diffs, every change paired with a test, build is clean. The version-gating (saveDisabled one-way latch) closes a real data-loss hole, and the UnixNano + LastCorruptArchiveKey() test seam is the right fix for the macOS-vs-Linux clock flakiness.
A few minor things to consider (none blocking):
-
sync.Oncelatches the init error permanently (daemon/daemon_stores.go). The rewrite cachespeerRegistryClientErrfor the process lifetime; the old code retriedNewPeerRegistryClienton every call after a failure. Risk is low sinceGetGRPCClientbuilds a lazy gRPC conn (fails only on permanent conditions like a malformed address), but if it can ever fail transiently during a startup race, that's a changed recovery semantic. Worth a one-line comment noting the intentional permanent caching, or confirmingGetGRPCClientnever fails transiently. -
protoToPeerInfonil-logger tolerance (peer_registry_grpc.go). Theerr != nil && logger != nilguard means an invalidBlockHasharriving via a nil-logger path is silently dropped — the same stderr-bypass problem this PR fixes, just relocated. Not a regression (prod callers pass real loggers); consider passingulogger.TestLogger{}in tests and dropping the nil tolerance so it's impossible by construction. -
Nit:
TestPersistence_RejectsFutureVersionis a subset ofTestPersistence_FutureVersionDisablesPersistence. Both are cheap and the narrower one localizes failures, so keeping both is fine — just noting the overlap.
|


Summary
Six small follow-ups deferred from #832, all touching the same files while context is fresh:
bytesToBlockHashreturns(Hash, error);protoToPeerInfotakes aulogger.Logger;PeerRegistryClient.SetLoggerplumbed viadaemon_storesos.Stderrbypasses logger)StartPeriodicSavemoves the save loop onto the registry's WaitGroup soClose()drains itpersistedRegistry.Versionis enforced — rejectVersion > persistedRegistryVersionAddBanScoreparam renamedpoints→defaultPoints; comment clarifies "config overrides caller"SetLogger/log()useatomic.Valueinstead of mismatched lock disciplineUnixNanoinstead ofUnix#5 (cleanup goroutine driver from the original plan) was already shipped in #832 —
StartCleanupis wired inservices/blockchain/Server.go.Test plan
go vet ./services/blockchain/... ./services/p2p/... ./daemon/...— cleango test -race ./services/blockchain/... ./services/p2p/... ./daemon/...— 1162 passTestPersistence_RejectsFutureVersionTestRegistry_SetLoggerNoRaceTestRegistry_SetLoggerNilIsNoopTestPersistence_CorruptBlobArchivedToSidecar(updated for nanosecond)TestBlockchain_SavePeerRegistryPeriodically(rewritten aroundStartPeriodicSave+Closedrain)TestGRPC_BlockHashBytes*(updated for new error return)Refs #832.