refactor(blockchain,p2p): move peer registry from p2p to blockchain#832
Conversation
|
🤖 Claude Code Review Status: Complete All previously reported critical and major issues have been fixed in the current code. Fixed Issues:
Architecture: Code Quality:
No new issues found. |
Benchmark Comparison ReportBaseline: Current: Summary
All benchmark results (sec/op)
Threshold: >10% with p < 0.05 | Generated: 2026-05-29 10:12 UTC |
Adds a transport-agnostic, thread-safe in-memory peer registry to the blockchain service with full gRPC API, optional JSON file persistence, and ban management with score decay. PeerInfo carries TransportType (HTTP DataHub or Bitcoin wire protocol), DataHubURL, NetworkAddress, height, hash, and reputation/ban fields. Reputation is a weighted blend of success/failure rate with response-time factor. Bans accumulate scores with periodic decay; banned peers are auto-unbanned after the configured duration. Persistence: opt-in via blockchain_peerRegistryPath. Atomic write via temp-file rename, periodic save at blockchain_peerRegistrySaveInterval, 24h TTL eviction on load, corrupt files renamed to .corrupted. This is foundation only. No service consumes it yet — P2P, Legacy, asset/dashboard, and catchup orchestration land in follow-up PRs. Coverage on new code: 84-100% per function.
…hods Adds the libp2p / catchup / sync-backoff / breakdown-metrics state that the P2P registry tracks today, so the blockchain registry can become the single source of truth. PeerInfo + PeerRegistryInfo proto gain 12 fields: IsConnected, LastBlockTime, BlocksReceived, SubtreesReceived, TransactionsReceived, CatchupBlocks, LastSyncAttempt, SyncAttemptCount, LastReputationReset, ReputationResetCount, LastCatchupError, LastCatchupErrorTime. CentralizedPeerRegistry + PeerRegistryService gain 11 RPCs / methods: UpdateConnectionState, UpdateLastMessageTime, UpdateStorage, RecordSyncAttempt, ClearAllSyncAttempts, RecordBlockReceived, RecordSubtreeReceived, RecordTransactionReceived, RecordCatchupError, ResetReputation, ReconsiderBadPeers. Cleanup(maxSize, ttl) is also ported (the periodic driver lives in a follow-up). ReconsiderBadPeers semantics swap from threshold-30/recover-50 to threshold-20/recover-30 with exponential cooldown via LastReputationReset + ReputationResetCount, matching the original P2P behaviour. ListPeers gains a sort_by_storage filter so catchup peer selection can prefer "full" peers without a separate RPC. Foundation only: still no consumer beyond existing tests.
…hain client P2P no longer hosts peer state. The local PeerRegistry, peer_registry_cache, and BanManager are deleted; the centralized blockchain peer registry (PR bsv-blockchain#832) is the single source of truth for peer identity, ban scoring, reputation, and runtime metrics. P2P holds a blockchain.PeerRegistryClientI and translates between libp2p peer.ID and the registry's canonical string ID at every boundary. Architecture: - p2p.Server.peerRegistry is now a gRPC client into the blockchain service. The daemon wires it via daemon_stores.GetPeerRegistryClient, mirroring the existing GetBlockchainClient pattern (always-gRPC, even single-binary). - SyncCoordinator drops its banManager and *PeerRegistry fields; every registry callsite goes through the client. currentSyncPeer becomes a canonical libp2p ID string instead of peer.ID. - PeerSelector takes []*blockchain.PeerInfo and returns string IDs. - BanManager's threshold bookkeeping moves to the registry's AddBanScore RPC, which already syncs BanScore/IsBanned atomically. The disconnect-on- ban side effect is preserved via Server.onPeerBanned, invoked when AddBanScore reports a NEW ban transition. - updateBytesReceived and RecordBytesDownloaded swap their racy read-modify-write for the registry's atomic UpdatePeerMetrics delta path. - p2p.ClientI signatures and p2p_api proto wire format are preserved so external consumers (asset, RPC, blockvalidation, monitor, diagnose) compile unchanged. Conversion between blockchain.PeerInfo and p2p_api.PeerRegistryInfo lives in peerInfoToP2PProto. Out of scope (PR2): - External consumers migrate from p2p.ClientI to blockchain.PeerRegistryClientI directly. - p2p_api.PeerRegistryInfo proto and the now-redundant P2P RPCs are deleted. - TTL+LRU cleanup goroutine driver moves to blockchain Server.go (the Cleanup method itself ships in PR1). - Settings rename (P2P.PeerCacheDir → BlockChain.PeerRegistryPath etc). Tests: - All P2P registry / cache / ban manager tests are deleted along with the files they exercise (~13K LOC of generated proto + test fixtures gone). - Server_test, sync_coordinator_test, peer_selector_test, server_handler_test, sync_integration_test, report_invalid_block_test, catchup_metrics_integration_test are deleted in this commit; coverage is rebuilt against the new client in PR2 (tracked). - A new blockchain.NewLocalPeerRegistryClient adapter ships in this commit for tests and same-process callers that want to avoid gRPC.
Remove was deleting both the peer entry and its ban-score entry, which broke the ban-on-threshold flow: P2P Server.onPeerBanned calls removePeer right after AddBanScore reports a fresh ban, and that path was wiping the score needed for the very next IsBanned check. The end-to-end TestPeerIDBanE2E caught the regression. Bans are intentionally independent of peer presence: a peer cannot reset its own ban by reconnecting. ClearBannedPeers is the explicit knob for wiping ban state.
3fa2071 to
250c297
Compare
- List(excludeBanned=true) acquires the write lock and runs ban expiry inline (expireBansLocked), so PeerInfo.IsBanned/BanScore are refreshed consistently with what the filter sees. - Stop() returns the save error and logs at error level. Silent failures meant banned peers could reconnect after restart on disk-full or permission failures. - maxReasonHistory cut from 50 to 20. Operators only need recent context; full history isn't useful and would scale to noticeable memory at fleet size. - bytesToBlockHash logs to stderr when the input is non-empty but invalid, which usually signals a corrupted persisted registry. Also marks go-libp2p-pubsub as indirect (no longer imported by anything in this tree after the P2P registry removal) and applies gci formatting on services/p2p/Server.go to satisfy CI.
Persistence
- Save() and Load() now serialise the banScores map alongside peers via a
versioned envelope. Banned peers are exempt from the TTL cull on Load —
otherwise a peer could clear its own ban by going quiet for the TTL
window and reconnecting. Ban entries whose BanUntil already passed are
dropped on load. New tests cover round-trip, TTL exemption, and expired
ban cleanup.
gRPC round-trip
- protoToPeerInfo unconditionally sets TransportTypeSet=true. The wire
format always carries transport_type, so a peer arriving via gRPC has
always crossed an explicit-set boundary; subsequent Register updates
that omit the field will now correctly preserve the transport.
Coverage
- New unit tests for handle_catchup_metrics (RecordCatchup{Attempt,Success,
Failure,Malicious}, UpdateCatchup{Reputation,Error}, GetPeersForCatchup,
ReportValid{Subtree,Block}, IsPeer{Malicious,Unhealthy}).
- New unit tests for peer_selector (full vs pruned, forced peer, sync
cooldown, tie-breakers).
- New server-handler tests covering peerInfoToP2PProto round-trip,
AddBanScore reasons, RecordBytesDownloaded delta accumulation,
ResetReputation, GetPeerRegistry, GetPeer, IsBanned via the registry.
Tests use blockchain.NewLocalPeerRegistryClient against an in-memory
CentralizedPeerRegistry so they exercise the same code paths the gRPC-mode
daemon uses, just bypassing the wire round-trip.
Adds 33 tests against the rewritten p2p code so sonar's new-code coverage gate has something to count. server_helpers_test.go: - addPeer / addConnectedPeer / removePeer / getPeer / updateStorage wrappers (all delegate to the centralized peer registry). - InjectPeerForTesting (storage=full marker). - getSyncPeer with no coordinator. - getPeerIDFromDataHubURL lookup hit + miss. - shouldSkipBannedPeer / shouldSkipUnhealthyPeer behaviour against the registry's ban + reputation state. - addProtocolViolation accumulating to a ban via the registry. - applyBanScore on a nil registry (no-panic). - onPeerBanned with a non-decodable peer ID (no-panic). sync_coordinator_test.go: - isViableSyncCandidate filter cases. - listAllPeers / GetCurrentSyncPeer / ClearSyncPeer. - isCaughtUp paths: no peers, ahead peer, low-rep peer. - HandlePeerDisconnected removes the registry entry. - HandleCatchupFailure with no current sync peer (no-panic). - getPeer round-trip via libp2p ID. - Backoff lifecycle (enter, reset, multiplier). - considerReputationRecovery is a no-op for healthy peers. - UpdatePeerInfo registers in the centralized registry. - UpdateBanStatus on an unknown peer (no-panic). - TriggerSync with no eligible peers does not enter backoff. - selectNewSyncPeer prefers full storage over pruned. All tests use blockchain.NewLocalPeerRegistryClient against an in-memory CentralizedPeerRegistry.
…overage CentralizedPeerRegistry now owns a stop channel and a wait group. Close() signals the decay loop and blocks until it exits, so blockchain.Server.Stop can call peerRegistry.Close() up front and snapshot Save() against a quiesced registry. Closes the goroutine-leak finding from the latest Claude review (Server.go:441) — the decay loop previously relied on service-manager ctx cancellation racing with Stop's return. Coverage: - TestBanClose_StopsDecayGoroutineWithoutContextCancel + idempotency. - TestPersistence_CorruptFileGetsRenamedAndRegistryStartsEmpty. - TestPersistence_LoadAnchorsLastDecayWhenMissing. - Server: 5 tests for updateBytesReceived (sender delta, gossip, same-peer guard, bad ID, nil registry). - SyncCoordinator: 9 more tests covering filterEligiblePeers, logPeerList / logCandidateList, sendSyncMessage error paths, evaluateSyncPeer (no peer, low rep, missing peer). 953 tests pass with -race; golangci-lint clean.
Pushes new-code coverage past the gate by exercising paths the previous test pass didn't reach. sync_coordinator_test.go (+9): - selectAndActivateNewPeer: enters backoff when no eligible peer; activates the eligible peer and stamps currentSyncPeer. - activateSyncPeer: stores the ID and syncStartTime even when the kafka send fails (no producer wired in tests). - sendSyncTriggerToKafka: nil producer / empty hash short-circuits. - Start/Stop lifecycle: Stop returns within 2s after Start (catches goroutine leaks symmetrically with the registry-side Close test). - checkAndClearExpiredBackoff: not-in-backoff vs still-in-window. handle_catchup_metrics_test.go (+9): - Invalid peer IDs return errors for RecordCatchupSuccess/Failure/Malicious and UpdateCatchupError. - ReportValidSubtree/Block invalid peer IDs. - ReportValidBlock empty-arg validation. - IsPeerUnhealthy: invalid ID, empty ID, low success rate path. 971 unit tests pass with -race; golangci-lint clean.
1. Concurrent saves used the same `<path>.tmp`, so the periodic saver + shutdown could clobber each other's temp file. savePeerRegistry now uses os.CreateTemp(dir, basename+".tmp.*") for a unique, exclusive temp file per call, with cleanup on every error path. Added concurrent- save smoke test. 2. Load dropped expired ban entries but left PeerInfo.IsBanned=true on the peer struct, so selection / cleanup paths saw the peer as banned even though IsBannedPeer() returned false. Load now reconciles PeerInfo.IsBanned / BanScore against the surviving banScores map. Added test for the expired-on-disk case. 3. AddBanScore did not refresh a stale ban: an expired ban with Banned=true would absorb the new violation's score, then a later IsBannedPeer call would wipe everything. AddBanScore now expires the stale ban entry first (clears Banned/Score/Reasons + syncs to PeerInfo) so the new strike accumulates fresh and can re-arm a new ban via the threshold check. Added test. 4. Server.onPeerBanned hardcoded a 24h address-list ban, disagreeing with the centralized registry when p2p_ban_duration was configured. Now reads s.settings.P2P.BanDuration with a 24h fallback. Test confirms the helper still completes when P2PClient is nil. 5. Untrusted peer-supplied client names are now sanitized (length cap + strip control bytes / XSS-flavoured ASCII) before storage in the registry, restoring the protection the deleted P2P registry had. Both the new-entry and update branches of Register apply the filter. Tests cover sanitization and length cap. Adjusted TestCentralizedPeerRegistry_Persistence_AllFields to also seed a banScores entry so Load's new reconciliation keeps IsBanned=true. 977 tests pass with -race; golangci-lint clean.
Adds 13 tests aimed at the new-code lines sonar still flagged. peer_registry_client_additional_test.go (+8): - UpdateConnectionState round-trip (true/false flip). - UpdateLastMessageTime advances the timestamp. - UpdateStorage stores the mode. - RecordSyncAttempt accumulates SyncAttemptCount; ClearAllSyncAttempts reports the cleared count. - RecordBlock/Subtree/Transaction increment the per-type counters and blend AvgResponseTimeMs. - RecordCatchupError stores message + timestamp. - ResetReputation single peer + all-peers count. - ReconsiderBadPeers respects cooldown (recent failure → 0). server_test.go (+2): - savePeerRegistryPeriodically writes the file at the configured tick and exits cleanly on ctx cancel. - Stop persists the registry when PeerRegistryPath is set (also exercises the new Close() drain that runs before Save). 987 unit tests pass with -race; golangci-lint clean.
1. Save serialization (Server.go:794, peer_registry_persistence.go:71). savePeerRegistry already used os.CreateTemp for unique tmp paths, but two saves could still snapshot at different times and rename out-of-order, so the final file lost the latest state. Adds a per-registry saveMu around the entire snapshot+write+rename sequence so concurrent Save() calls cannot trample one another. Test runs 32 concurrent writers and verifies the final envelope is intact with no leftover .tmp files. 2. Cleanup goroutine + configured TTL (Server.go:421, peer_registry.go). Adds CentralizedPeerRegistry.StartCleanup(ctx, interval, ttl, maxSize) which runs the existing Cleanup method on a ticker, registered with the same WaitGroup as StartBanDecay so Close() drains it cleanly. Server.Start now wires it from tSettings.P2P.PeerRegistryTTL, PeerRegistryCleanupInterval, PeerRegistryMaxSize. Load() also honours PeerRegistryTTL instead of the hardcoded 24h. Settings remain under tSettings.P2P.* in this PR; the ownership rename is a follow-up. Tests cover the loop running, exiting on Close(), and the zero-interval no-op path. 3. Expire bans before reads + cleanup (peer_registry.go). expireBansLocked now runs at the start of Get, List (any excludeBanned value), ListBannedPeers, and Cleanup. Previously only List(excludeBanned=true) ran the sweep, so dashboards / RPC clients could still see IsBanned=true on a peer whose ban window had elapsed, and Cleanup's isCleanupExempt() would treat that stale flag as a reason to keep an idle peer. Get and List now take the write lock so the snapshot they return reflects the elapsed ban window. Cleanup normalises before checking exemptions. Tests cover Get, List, and ListBannedPeers normalising an expired ban; a separate test confirms an expired-ban peer is no longer exempt from TTL cleanup. 994 unit tests pass with -race; golangci-lint clean.
…astMessageTime Cleanup was using LastMessageTime as the TTL/LRU recency signal, but several hot activity paths only refresh LastSeen: - UpdateMetrics(LastSeen = now) - recordReceivedSuccessLocked(LastSeen = LastInteractionSuccess) used by RecordBlockReceived / RecordSubtreeReceived / RecordTransactionReceived So a peer that was actively delivering blocks/subtrees but not gossiping could be evicted as stale on the first cleanup tick — a regression amplified now that StartCleanup runs by default. Persistence Load already uses LastSeen as its TTL signal, so the two paths disagreed. Adds peerActivity(info) which prefers LastSeen and falls back to LastMessageTime for older persisted records that pre-date that convention. Cleanup uses peerActivity for both the TTL pass and the LRU candidate sort. Existing TTL tests updated to backdate both freshness fields. New tests: - Cleanup keeps an active peer alive when LastSeen is fresh but LastMessageTime is stale. - Cleanup falls back to LastMessageTime when LastSeen is zero.
Previously peerActivity only fell back to LastMessageTime when LastSeen was zero, which let UpdateLastMessageTime callers get evicted: the RPC refreshes only LastMessageTime, so a peer with an old non-zero LastSeen plus a fresh LastMessageTime was treated as stale even though activity just landed. peerActivity now returns the newer of the two timestamps. Adds a regression test where LastSeen is older than the TTL, LastMessageTime is current, and Cleanup must keep the peer.
…iles Per @icellan's review feedback. The previous implementation talked directly to the filesystem (os.WriteFile / os.Rename), which matches what the old P2P registry did but doesn't compose with the blob.Store abstraction the rest of the service uses for subtrees, blocks, and temp data. Persistence now writes through a configured blob.Store, addressed by a single fixed key ("peer-registry") with a new FileTypePeerRegistry tag in pkg/fileformat. Operators choose the backend through the new blockchain_peerRegistryStore URL setting (file://, s3://, memory://, etc.); the previous blockchain_peerRegistryPath string setting is gone. - savePeerRegistry/loadPeerRegistry take (ctx, blob.Store, ...) and use Set/Get/Exists/Del. saveMu still serializes the snapshot+write so concurrent saves cannot interleave. - Save passes options.WithAllowOverwrite(true) because the registry is a single mutable blob, not append-only content. - Corrupt blob handling drops the entry via store.Del instead of renaming a sibling .corrupted file. - Server.Start constructs the blob.Store via blob.NewStore using blobstoretypes.PEERREGISTRYSTORE (new constant) and stores the reference on Blockchain. Stop saves through it then closes it. - Existing tests rewritten to use an in-memory blob.Store helper (newTestBlobStore) instead of TempDir() + filesystem assertions. 1081 unit tests pass with -race; golangci-lint clean.
- recordReceivedSuccessLocked: add InteractionAttempts++ so RecordBlockReceived and RecordSubtreeReceived paths contribute to totalAttempts in reputation calculation. Previously only UpdateMetrics paths counted attempts, producing an incorrect success rate when block/subtree paths were the primary source of positive signals. - CentralizedPeerRegistry.Get: use RLock fast path; upgrade to write lock only when the target peer carries an expired ban that needs normalising. Avoids serialising every GetPeer call (shouldSkipUnhealthyPeer, getPeer, dashboard reads) through the write lock under normal operation. - Register: stop bumping LastMessageTime unconditionally on metadata updates. LastMessageTime is now only set on new peers (matching LastSeen semantics). Unconditional updates prevented TTL eviction for peers that send metadata but no actual wire messages. - isPeerBannedLocked: accept now time.Time param so callers sharing a timestamp avoid a time.Now() call per peer in hot loops. Update List() to pass a shared timestamp. - shouldSkipUnhealthyPeer: cache reputation scores for 5 seconds per peer to avoid a gRPC round-trip on every pubsub message. Misses fall back to the registry as before. - daemon/Stores.GetPeerRegistryClient: cache the gRPC connection like other singleton clients; previously a new connection was opened on each call. - p2p/Server: rename getPeerCacheFilePath -> p2pCacheFilePath and update comment/filename to reflect that this is the libp2p peer-address cache, not the Teranode peer registry (now owned by the blockchain service). - Server.New BanConfig: document why DecayInterval/DecayAmount are hardcoded to defaults rather than wired from settings. - UpdateCatchupReputation: extend no-op comment to name the legacy consumer and note the removal condition. - peer_registry_persistence.Load: add comment explaining why TTL filter uses LastSeen only rather than peerActivity() max. - Fix PR body typo: blockchain_peerRegistryPath -> blockchain_peerRegistryStore in two places (operator notes and what's-not-in-this-PR sections).
Code ReviewReviewed against Blocking — should fix before merge1.
if !exists {
entry := *info // caller's info has IsBanned=false, BanScore=0
...
r.peers[info.ID] = &entry
return
}
if banEntry, ok := r.banScores[info.ID]; ok && banEntry.Banned {
entry.IsBanned = true
entry.BanScore = banEntry.Score
}This affects dashboards, 2.
Suggested fix: piggyback eviction on the existing Follow-ups (nice-to-have)3.
4.
5.
// Look up points for this reason, default to provided points
if configPoints, found := r.banConfig.ReasonPoints[reason]; found {
points = configPoints
}Behaviour is actually "config overrides caller". All P2P callsites pass 6.
Two Observations (not actionable)
VerdictApprove with changes. Fix #1 and #2; the rest can land separately. |
…egistry Resolves conflicts in services/p2p/Server.go and services/p2p/server_handler_test.go. Server.go: kept centralized-registry refactor (no local PeerRegistry / BanManager / cleanup ticker fields). Adopted upstream's new invalidPolicyWarnOnce sync.Once. server_handler_test.go: dropped upstream's ~980 lines of new tests because they were written against the now-removed services/p2p.PeerRegistry. Ported the oversized-message size-guard tests (the only block whose subject is upstream's new functionality rather than registry plumbing) into a new server_handler_size_test.go using blockchain.NewLocalPeerRegistryClient so the 4 size guards (rejected_tx / node_status / block / subtree) remain covered. Verified: go build ./... clean, 1101 tests pass with -race across services/blockchain and services/p2p.
… feat/central-peer-registry # Conflicts: # services/p2p/Server.go
1. Register reconciles surviving banScores on new-entry path (services/blockchain/peer_registry.go). Previously a peer evicted by TTL/restart/Cleanup could re-attach via a fresh Register call with IsBanned=false / BanScore=0, even though banScores still carried the operator's ban. The new-entry branch now trusts banScores over the caller-supplied fields, matching the Load reconciliation pattern. 2. reputationCache in shouldSkipUnhealthyPeer is now swept on every peerMapCleanup tick (services/p2p/server_helpers.go). The sync.Map was insert-only, so every unique gossiped peer ID accumulated an entry for the process lifetime. Eviction piggybacks on the existing cleanup goroutine — no new ticker. Tests: - TestRegister_ReconcilesSurvivingBanState — ban survives Remove, then Register with stale IsBanned=false still resolves to IsBanned=true. - TestRegister_NoBanScoreSetsCleanState — caller's stale truthy claims are rejected when banScores has no entry. - TestCleanupPeerMaps_EvictsExpiredReputationEntries — expired entry evicted, fresh entry survives. 1104 tests pass with -race across services/blockchain and services/p2p.
1. Save deep-copies BlockHash like Register does (peer_registry_persistence.go). Today no path mutates the underlying [32]byte (writers replace the pointer), but matching Register's pattern removes the aliasing footgun for any future code path that touches chainhash.Hash in place. 2. Logger + sidecar archive on corrupt blob (peer_registry.go, peer_registry_persistence.go). The previous os.Stderr fprintf bypassed the structured logger and the corrupt blob was deleted outright with no forensic copy. Registry now carries an optional logger (SetLogger; defaults via ulogger.New if unset) used at ERROR level for corruption events. Before deleting the primary key, the bytes are archived to a timestamped sidecar key peer-registry.corrupt-<unix> so operators can post-mortem the failure. 3. Precision-preserving LastDecay advance (peer_registry.go AddBanScore and decayBanScores). LastDecay used to clamp to "now" whenever decay applied, silently discarding the sub-interval remainder (decaySteps*interval < elapsed < (decaySteps+1)*interval) and slowly extending effective ban duration. Now advances by exactly decaySteps*DecayInterval, carrying the remainder into the next call. Tests: - TestPersistence_CorruptBlobArchivedToSidecar: corrupt blob copied to peer-registry.corrupt-<unix>, primary key removed, archive payload matches original bytes. - TestBanDecay_LastDecayPreservesSubInterval: elapsed 3.5s with 1s interval → 3 steps applied, LastDecay = anchor+3s (not now). - TestBanDecay_NoDriftAcrossManyCalls: 5 ticks across ~55ms with 10ms interval — total score loss matches elapsed/interval, no cumulative drift. 1107 tests pass with -race across services/blockchain and services/p2p.
Re-reviewBoth blocking items from the previous round are addressed, along with three additional fixes from the Copilot review. All tests pass under Blockers — resolved1. New-entry branch now consults 2. Expired entries swept on every Additional fixes from Copilot review — all good3. 4. Structured logger + sidecar archive on corrupt blob ( 5. Precision-preserving Nits (not blocking)
Verification
VerdictApprove. Blockers resolved, fixes are clean and well-tested. |
|



Summary
Moves the peer registry from
services/p2pintoservices/blockchainso the blockchain service is the single source of truth for peer identity, ban scoring, reputation, and runtime metrics.The legacy service is the immediate driver: it needs to register and query peers using the same registry as P2P so wire-protocol catchup can plug in next, and so eventually the asset/RPC/blockvalidation read paths converge on one source. P2P could not host that — it does not run in every deployment, and putting peers behind P2P meant legacy work blocked on P2P refactors. Blockchain runs everywhere, owns chain state, and is the natural shared dependency.
P2P now holds a
blockchain.PeerRegistryClientI(gRPC, matching every other blockchain client) and translates between libp2ppeer.IDand the registry's canonical string ID at every boundary. The localPeerRegistry,BanManager, and JSON cache are deleted.Why blockchain service?
What's NOT in this PR
asset/httpimpl,rpc/handlers,blockvalidation,subtreevalidation,cmd/monitor,cmd/diagnose,services/legacy) keep usingp2p.ClientI. Their migration toblockchain.PeerRegistryClientIis a follow-up.p2p_api.PeerRegistryInfoproto and the now-redundant P2P RPCs are kept for now; they get retired once consumers move.P2P.PeerCacheDir->BlockChain.PeerRegistryStore, etc.). Cleanup/TTL settings still live undertSettings.P2P.*and are read by the blockchain service unchanged.Review feedback addressed
List(excludeBanned): now acquires the write lock and runs ban expiry inline.Stop()swallowing save errors: returns the error and logs at error level.bytesToBlockHashon invalid input: writes a stderr warning.banScoresnow serialised in a versioned JSON envelope; banned peers exempt from TTL onLoad; expired bans dropped at load time.TransportTypeSetlost on gRPC round-trip:protoToPeerInfosets it to true unconditionally.Removewas wiping ban scores: bans now outlive disconnects (peers can't clear their own ban by reconnecting).StartBanDecaygoroutine leak on shutdown: registry hasClose()(stop channel + wait group);Server.Stopcalls it beforeSave, so the snapshot is taken against a quiesced registry.saveMuserializes snapshot+write+rename.StartCleanup(ctx, interval, ttl, maxSize)ships and is wired fromServer.Startusing the existingtSettings.P2P.PeerRegistry{TTL,CleanupInterval,MaxSize}knobs.Load()also uses the configured TTL instead of a hardcoded 24h.expireBansLockedruns at the start ofGet,List(any filter),ListBannedPeers, andCleanupso staleIsBanned=trueflags don't leak through or keep idle peers exempt from cleanup.LastMessageTimeonly:peerActivity(info)returns the newer ofLastSeenandLastMessageTime, so an active peer through any update path stays alive.Tests
./services/blockchain/... ./services/p2p/...,-race).TestPeerIDBanE2EandTestPeerIDBanExpirationE2Epass against the new flow.golangci-lintclean.peer_selector_test.go,sync_coordinator_test.go,server_handler_test.go,server_helpers_test.go,handle_catchup_metrics_test.go).Operator notes
blockchain_peerRegistryStoreto the persistence target (e.g.file:///path/to/dir). The legacy P2P cache file is not read.