Skip to content

refactor(blockchain,p2p): move peer registry from p2p to blockchain#832

Merged
oskarszoon merged 26 commits into
bsv-blockchain:mainfrom
oskarszoon:feat/central-peer-registry
May 29, 2026
Merged

refactor(blockchain,p2p): move peer registry from p2p to blockchain#832
oskarszoon merged 26 commits into
bsv-blockchain:mainfrom
oskarszoon:feat/central-peer-registry

Conversation

@oskarszoon

@oskarszoon oskarszoon commented May 8, 2026

Copy link
Copy Markdown
Contributor

Summary

Moves the peer registry from services/p2p into services/blockchain so the blockchain service is the single source of truth for peer identity, ban scoring, reputation, and runtime metrics.

The legacy service is the immediate driver: it needs to register and query peers using the same registry as P2P so wire-protocol catchup can plug in next, and so eventually the asset/RPC/blockvalidation read paths converge on one source. P2P could not host that — it does not run in every deployment, and putting peers behind P2P meant legacy work blocked on P2P refactors. Blockchain runs everywhere, owns chain state, and is the natural shared dependency.

P2P now holds a blockchain.PeerRegistryClientI (gRPC, matching every other blockchain client) and translates between libp2p peer.ID and the registry's canonical string ID at every boundary. The local PeerRegistry, BanManager, and JSON cache are deleted.

Why blockchain service?

  • It is the only service guaranteed to run in every deployment.
  • Legacy needs a registry consumer in PR2 — putting it under P2P would force legacy to depend on P2P.
  • Bans and reputation are FSM-adjacent state; the FSM authority is the right owner.

What's NOT in this PR

  • External consumers (asset/httpimpl, rpc/handlers, blockvalidation, subtreevalidation, cmd/monitor, cmd/diagnose, services/legacy) keep using p2p.ClientI. Their migration to blockchain.PeerRegistryClientI is a follow-up.
  • p2p_api.PeerRegistryInfo proto and the now-redundant P2P RPCs are kept for now; they get retired once consumers move.
  • Settings rename (P2P.PeerCacheDir -> BlockChain.PeerRegistryStore, etc.). Cleanup/TTL settings still live under tSettings.P2P.* and are read by the blockchain service unchanged.
  • No legacy JSON cache compat. Operators lose reputation history once on upgrade.

Review feedback addressed

  • Race in List(excludeBanned): now acquires the write lock and runs ban expiry inline.
  • Stop() swallowing save errors: returns the error and logs at error level.
  • Reason history scaling: capped at 20 (down from 50).
  • Silent bytesToBlockHash on invalid input: writes a stderr warning.
  • Ban state lost on restart: banScores now serialised in a versioned JSON envelope; banned peers exempt from TTL on Load; expired bans dropped at load time.
  • TransportTypeSet lost on gRPC round-trip: protoToPeerInfo sets it to true unconditionally.
  • Remove was wiping ban scores: bans now outlive disconnects (peers can't clear their own ban by reconnecting).
  • StartBanDecay goroutine leak on shutdown: registry has Close() (stop channel + wait group); Server.Stop calls it before Save, so the snapshot is taken against a quiesced registry.
  • Concurrent saves could rename out-of-order: per-registry saveMu serializes snapshot+write+rename.
  • Cleanup goroutine + configured TTL: StartCleanup(ctx, interval, ttl, maxSize) ships and is wired from Server.Start using the existing tSettings.P2P.PeerRegistry{TTL,CleanupInterval,MaxSize} knobs. Load() also uses the configured TTL instead of a hardcoded 24h.
  • Expired bans visible to readers: expireBansLocked runs at the start of Get, List (any filter), ListBannedPeers, and Cleanup so stale IsBanned=true flags don't leak through or keep idle peers exempt from cleanup.
  • Cleanup recency on LastMessageTime only: peerActivity(info) returns the newer of LastSeen and LastMessageTime, so an active peer through any update path stays alive.

Tests

  • ~1080 unit tests pass (./services/blockchain/... ./services/p2p/..., -race).
  • E2E TestPeerIDBanE2E and TestPeerIDBanExpirationE2E pass against the new flow.
  • golangci-lint clean.
  • Sonar gate green: ≥80% new-code coverage, 0 security hotspots, 0% duplication.
  • P2P-side legacy server/sync_coordinator/peer_selector tests are deleted in this commit; coverage is rebuilt against the new client (peer_selector_test.go, sync_coordinator_test.go, server_handler_test.go, server_helpers_test.go, handle_catchup_metrics_test.go).

Operator notes

  • Set blockchain_peerRegistryStore to the persistence target (e.g. file:///path/to/dir). The legacy P2P cache file is not read.
  • Reputation history rebuilds from gossip on first restart after upgrade; expect a few hours before scores stabilise.
  • Bans and ban-score history persist across restarts via the same registry file.

@github-actions

github-actions Bot commented May 8, 2026

Copy link
Copy Markdown
Contributor

🤖 Claude Code Review

Status: Complete

All previously reported critical and major issues have been fixed in the current code.

Fixed Issues:

  • ✅ BlockHash pointer aliasing (data corruption risk) - Deep copy now implemented
  • ✅ Silent data loss on corruption - Error logging and corrupt blob archival added
  • ✅ Ban score decay drift - LastDecay now preserves fractional time
  • ✅ Goroutine leak on shutdown - Close() method properly drains background tasks

Architecture:
The peer registry migration from P2P to Blockchain service is well-executed. The centralized registry now serves as the single source of truth for peer state across all transport types (HTTP DataHub, wire protocol), with proper gRPC client interfaces and blob store persistence.

Code Quality:

  • Comprehensive test coverage (~1080 tests passing)
  • Thread-safe with proper mutex usage
  • Clean separation of concerns (persistence, gRPC, core registry logic)
  • Good error handling with structured logging
  • Settings properly documented in blockchain_settings.go

No new issues found.

@github-actions

github-actions Bot commented May 8, 2026

Copy link
Copy Markdown
Contributor

Benchmark Comparison Report

Baseline: main (unknown)

Current: PR-832 (5adc054)

Summary

  • Regressions: 0
  • Improvements: 0
  • Unchanged: 144
  • Significance level: p < 0.05
All benchmark results (sec/op)
Benchmark Baseline Current Change p-value
_NewBlockFromBytes-4 1.629µ 1.628µ ~ 0.400
SplitSyncedParentMap_SetIfNotExists/256_buckets-4 71.21n 71.83n ~ 0.100
SplitSyncedParentMap_SetIfNotExists/16_buckets-4 71.07n 71.33n ~ 0.700
SplitSyncedParentMap_SetIfNotExists/1_bucket-4 70.92n 71.44n ~ 0.100
SplitSyncedParentMap_ConcurrentSetIfNotExists/256_buckets... 32.60n 33.07n ~ 0.100
SplitSyncedParentMap_ConcurrentSetIfNotExists/16_buckets_... 56.87n 58.38n ~ 1.000
SplitSyncedParentMap_ConcurrentSetIfNotExists/1_bucket_pa... 127.5n 133.7n ~ 0.100
MiningCandidate_Stringify_Short-4 220.9n 222.8n ~ 0.200
MiningCandidate_Stringify_Long-4 1.647µ 1.658µ ~ 0.200
MiningSolution_Stringify-4 862.7n 849.1n ~ 0.700
BlockInfo_MarshalJSON-4 1.796µ 1.742µ ~ 0.100
NewFromBytes-4 124.1n 149.2n ~ 0.100
AddTxBatchColumnar_Validation-4 2.621µ 2.508µ ~ 0.400
OffsetValidationLoop-4 544.1n 545.2n ~ 1.000
Mine_EasyDifficulty-4 60.68µ 61.44µ ~ 0.100
Mine_WithAddress-4 6.960µ 7.169µ ~ 0.400
BlockAssembler_AddTx-4 0.02655n 0.02609n ~ 0.700
AddNode-4 11.02 10.90 ~ 0.700
AddNodeWithMap-4 11.74 12.13 ~ 0.100
DirectSubtreeAdd/4_per_subtree-4 57.41n 58.37n ~ 0.700
DirectSubtreeAdd/64_per_subtree-4 29.87n 29.08n ~ 0.100
DirectSubtreeAdd/256_per_subtree-4 29.25n 28.28n ~ 0.100
DirectSubtreeAdd/1024_per_subtree-4 27.90n 27.04n ~ 0.100
DirectSubtreeAdd/2048_per_subtree-4 27.54n 26.74n ~ 0.100
SubtreeProcessorAdd/4_per_subtree-4 280.8n 279.7n ~ 1.000
SubtreeProcessorAdd/64_per_subtree-4 278.1n 277.8n ~ 0.800
SubtreeProcessorAdd/256_per_subtree-4 280.0n 279.0n ~ 0.400
SubtreeProcessorAdd/1024_per_subtree-4 268.7n 272.7n ~ 0.100
SubtreeProcessorAdd/2048_per_subtree-4 268.8n 272.4n ~ 0.100
SubtreeProcessorRotate/4_per_subtree-4 274.0n 274.4n ~ 0.500
SubtreeProcessorRotate/64_per_subtree-4 272.7n 273.0n ~ 1.000
SubtreeProcessorRotate/256_per_subtree-4 271.8n 273.3n ~ 0.600
SubtreeProcessorRotate/1024_per_subtree-4 272.0n 273.8n ~ 0.100
SubtreeNodeAddOnly/4_per_subtree-4 54.30n 54.43n ~ 1.000
SubtreeNodeAddOnly/64_per_subtree-4 34.24n 34.14n ~ 0.100
SubtreeNodeAddOnly/256_per_subtree-4 33.28n 33.13n ~ 0.600
SubtreeNodeAddOnly/1024_per_subtree-4 32.48n 32.50n ~ 1.000
SubtreeCreationOnly/4_per_subtree-4 113.4n 113.0n ~ 0.700
SubtreeCreationOnly/64_per_subtree-4 398.5n 396.0n ~ 0.400
SubtreeCreationOnly/256_per_subtree-4 1.362µ 1.343µ ~ 0.200
SubtreeCreationOnly/1024_per_subtree-4 4.406µ 4.485µ ~ 0.100
SubtreeCreationOnly/2048_per_subtree-4 8.086µ 7.992µ ~ 0.100
SubtreeProcessorOverheadBreakdown/64_per_subtree-4 272.4n 272.4n ~ 0.800
SubtreeProcessorOverheadBreakdown/1024_per_subtree-4 273.4n 271.6n ~ 0.400
ParallelGetAndSetIfNotExists/1k_nodes-4 2.195m 2.190m ~ 1.000
ParallelGetAndSetIfNotExists/10k_nodes-4 5.439m 5.512m ~ 0.100
ParallelGetAndSetIfNotExists/50k_nodes-4 7.130m 7.518m ~ 0.100
ParallelGetAndSetIfNotExists/100k_nodes-4 9.671m 9.656m ~ 1.000
SequentialGetAndSetIfNotExists/1k_nodes-4 1.929m 1.937m ~ 0.400
SequentialGetAndSetIfNotExists/10k_nodes-4 4.296m 4.324m ~ 0.400
SequentialGetAndSetIfNotExists/50k_nodes-4 12.21m 12.08m ~ 0.100
SequentialGetAndSetIfNotExists/100k_nodes-4 21.99m 21.88m ~ 0.100
ProcessOwnBlockSubtreeNodesParallel/1k_nodes-4 2.236m 2.225m ~ 0.700
ProcessOwnBlockSubtreeNodesParallel/10k_nodes-4 8.180m 8.149m ~ 0.200
ProcessOwnBlockSubtreeNodesParallel/100k_nodes-4 12.81m 12.82m ~ 1.000
ProcessOwnBlockSubtreeNodesSequential/1k_nodes-4 1.966m 1.959m ~ 0.400
ProcessOwnBlockSubtreeNodesSequential/10k_nodes-4 7.335m 7.333m ~ 1.000
ProcessOwnBlockSubtreeNodesSequential/100k_nodes-4 39.28m 39.39m ~ 0.700
DiskTxMap_SetIfNotExists-4 3.985µ 3.933µ ~ 0.400
DiskTxMap_SetIfNotExists_Parallel-4 3.551µ 3.679µ ~ 0.100
DiskTxMap_ExistenceOnly-4 345.5n 470.9n ~ 0.400
Queue-4 189.0n 193.1n ~ 0.100
AtomicPointer-4 3.752n 3.632n ~ 0.100
ReorgOptimizations/DedupFilterPipeline/Old/10K-4 826.6µ 856.1µ ~ 0.100
ReorgOptimizations/DedupFilterPipeline/New/10K-4 764.8µ 766.0µ ~ 1.000
ReorgOptimizations/AllMarkFalse/Old/10K-4 106.5µ 108.1µ ~ 0.200
ReorgOptimizations/AllMarkFalse/New/10K-4 64.47µ 64.13µ ~ 0.200
ReorgOptimizations/HashSlicePool/Old/10K-4 51.62µ 55.97µ ~ 0.200
ReorgOptimizations/HashSlicePool/New/10K-4 11.29µ 11.19µ ~ 0.700
ReorgOptimizations/NodeFlags/Old/10K-4 4.656µ 4.527µ ~ 0.100
ReorgOptimizations/NodeFlags/New/10K-4 1.592µ 1.552µ ~ 0.100
ReorgOptimizations/DedupFilterPipeline/Old/100K-4 10.029m 9.729m ~ 0.100
ReorgOptimizations/DedupFilterPipeline/New/100K-4 11.16m 10.66m ~ 0.100
ReorgOptimizations/AllMarkFalse/Old/100K-4 1.128m 1.145m ~ 0.400
ReorgOptimizations/AllMarkFalse/New/100K-4 707.5µ 706.2µ ~ 0.100
ReorgOptimizations/HashSlicePool/Old/100K-4 579.6µ 546.1µ ~ 0.100
ReorgOptimizations/HashSlicePool/New/100K-4 207.1µ 208.2µ ~ 0.700
ReorgOptimizations/NodeFlags/Old/100K-4 47.47µ 49.15µ ~ 0.100
ReorgOptimizations/NodeFlags/New/100K-4 17.06µ 17.23µ ~ 0.200
TxMapSetIfNotExists-4 49.56n 49.62n ~ 1.000
TxMapSetIfNotExistsDuplicate-4 41.23n 41.24n ~ 1.000
ChannelSendReceive-4 613.0n 592.0n ~ 0.100
CalcBlockWork-4 476.1n 472.3n ~ 0.700
CalculateWork-4 641.6n 651.5n ~ 0.700
BuildBlockLocatorString_Helpers/Size_10-4 1.372µ 1.360µ ~ 0.100
BuildBlockLocatorString_Helpers/Size_100-4 15.10µ 14.78µ ~ 1.000
BuildBlockLocatorString_Helpers/Size_1000-4 128.8µ 128.8µ ~ 0.700
CatchupWithHeaderCache-4 104.6m 104.4m ~ 0.100
_BufferPoolAllocation/16KB-4 5.492µ 4.082µ ~ 0.100
_BufferPoolAllocation/32KB-4 9.590µ 11.037µ ~ 0.700
_BufferPoolAllocation/64KB-4 19.21µ 16.56µ ~ 0.100
_BufferPoolAllocation/128KB-4 37.45µ 33.89µ ~ 0.400
_BufferPoolAllocation/512KB-4 129.6µ 117.6µ ~ 0.100
_BufferPoolConcurrent/32KB-4 23.44µ 20.55µ ~ 0.400
_BufferPoolConcurrent/64KB-4 30.99µ 32.33µ ~ 0.200
_BufferPoolConcurrent/512KB-4 144.6µ 152.4µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/16KB-4 646.8µ 682.6µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/32KB-4 646.0µ 707.8µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/64KB-4 640.8µ 701.7µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/128KB-4 632.3µ 704.3µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/512KB-4 612.1µ 620.2µ ~ 0.400
_SubtreeDataDeserializationWithBufferSizes/16KB-4 36.62m 36.48m ~ 0.700
_SubtreeDataDeserializationWithBufferSizes/32KB-4 36.27m 36.61m ~ 0.400
_SubtreeDataDeserializationWithBufferSizes/64KB-4 36.47m 36.41m ~ 0.700
_SubtreeDataDeserializationWithBufferSizes/128KB-4 36.28m 36.13m ~ 0.200
_SubtreeDataDeserializationWithBufferSizes/512KB-4 36.14m 36.03m ~ 0.700
_PooledVsNonPooled/Pooled-4 739.4n 739.0n ~ 1.000
_PooledVsNonPooled/NonPooled-4 8.186µ 8.601µ ~ 1.000
_MemoryFootprint/Current_512KB_32concurrent-4 6.419µ 6.529µ ~ 0.100
_MemoryFootprint/Proposed_32KB_32concurrent-4 9.518µ 9.696µ ~ 0.100
_MemoryFootprint/Alternative_64KB_32concurrent-4 9.231µ 9.176µ ~ 0.400
SubtreeSizes/10k_tx_4_per_subtree-4 1.364m 1.415m ~ 0.400
SubtreeSizes/10k_tx_16_per_subtree-4 323.8µ 332.7µ ~ 0.100
SubtreeSizes/10k_tx_64_per_subtree-4 77.78µ 78.46µ ~ 0.100
SubtreeSizes/10k_tx_256_per_subtree-4 19.21µ 19.54µ ~ 0.100
SubtreeSizes/10k_tx_512_per_subtree-4 9.652µ 9.810µ ~ 0.200
SubtreeSizes/10k_tx_1024_per_subtree-4 4.843µ 4.842µ ~ 1.000
SubtreeSizes/10k_tx_2k_per_subtree-4 2.436µ 2.405µ ~ 0.300
BlockSizeScaling/10k_tx_64_per_subtree-4 76.49µ 76.75µ ~ 1.000
BlockSizeScaling/10k_tx_256_per_subtree-4 19.47µ 19.45µ ~ 1.000
BlockSizeScaling/10k_tx_1024_per_subtree-4 4.814µ 4.823µ ~ 0.400
BlockSizeScaling/50k_tx_64_per_subtree-4 403.7µ 400.5µ ~ 0.400
BlockSizeScaling/50k_tx_256_per_subtree-4 97.19µ 96.74µ ~ 0.700
BlockSizeScaling/50k_tx_1024_per_subtree-4 24.17µ 23.99µ ~ 0.400
SubtreeAllocations/small_subtrees_exists_check-4 165.6µ 162.5µ ~ 0.400
SubtreeAllocations/small_subtrees_data_fetch-4 167.9µ 170.3µ ~ 0.200
SubtreeAllocations/small_subtrees_full_validation-4 339.2µ 336.4µ ~ 0.400
SubtreeAllocations/medium_subtrees_exists_check-4 9.698µ 9.653µ ~ 0.400
SubtreeAllocations/medium_subtrees_data_fetch-4 9.802µ 10.107µ ~ 0.100
SubtreeAllocations/medium_subtrees_full_validation-4 19.55µ 19.56µ ~ 1.000
SubtreeAllocations/large_subtrees_exists_check-4 2.322µ 2.305µ ~ 0.400
SubtreeAllocations/large_subtrees_data_fetch-4 2.387µ 2.449µ ~ 0.100
SubtreeAllocations/large_subtrees_full_validation-4 4.854µ 4.915µ ~ 0.700
_prepareTxsPerLevel-4 386.5m 384.8m ~ 1.000
_prepareTxsPerLevelOrdered-4 4.454m 4.462m ~ 1.000
_prepareTxsPerLevel_Comparison/Original-4 382.3m 390.4m ~ 0.700
_prepareTxsPerLevel_Comparison/Optimized-4 4.558m 4.163m ~ 0.100
StoreBlock_Sequential/BelowCSVHeight-4 326.7µ 329.8µ ~ 0.100
StoreBlock_Sequential/AboveCSVHeight-4 329.7µ 330.2µ ~ 0.700
GetUtxoHashes-4 260.3n 255.3n ~ 0.200
GetUtxoHashes_ManyOutputs-4 45.25µ 43.69µ ~ 0.100
_NewMetaDataFromBytes-4 228.6n 231.7n ~ 0.100
_Bytes-4 411.2n 411.3n ~ 1.000
_MetaBytes-4 138.5n 139.7n ~ 0.600

Threshold: >10% with p < 0.05 | Generated: 2026-05-29 10:12 UTC

@oskarszoon oskarszoon marked this pull request as draft May 8, 2026 12:05
oskarszoon added 4 commits May 8, 2026 14:25
Adds a transport-agnostic, thread-safe in-memory peer registry to the
blockchain service with full gRPC API, optional JSON file persistence,
and ban management with score decay.

PeerInfo carries TransportType (HTTP DataHub or Bitcoin wire protocol),
DataHubURL, NetworkAddress, height, hash, and reputation/ban fields.
Reputation is a weighted blend of success/failure rate with response-time
factor. Bans accumulate scores with periodic decay; banned peers are
auto-unbanned after the configured duration.

Persistence: opt-in via blockchain_peerRegistryPath. Atomic write via
temp-file rename, periodic save at blockchain_peerRegistrySaveInterval,
24h TTL eviction on load, corrupt files renamed to .corrupted.

This is foundation only. No service consumes it yet — P2P, Legacy,
asset/dashboard, and catchup orchestration land in follow-up PRs.

Coverage on new code: 84-100% per function.
…hods

Adds the libp2p / catchup / sync-backoff / breakdown-metrics state that the
P2P registry tracks today, so the blockchain registry can become the single
source of truth.

PeerInfo + PeerRegistryInfo proto gain 12 fields:
IsConnected, LastBlockTime, BlocksReceived, SubtreesReceived,
TransactionsReceived, CatchupBlocks, LastSyncAttempt, SyncAttemptCount,
LastReputationReset, ReputationResetCount, LastCatchupError,
LastCatchupErrorTime.

CentralizedPeerRegistry + PeerRegistryService gain 11 RPCs / methods:
UpdateConnectionState, UpdateLastMessageTime, UpdateStorage,
RecordSyncAttempt, ClearAllSyncAttempts, RecordBlockReceived,
RecordSubtreeReceived, RecordTransactionReceived, RecordCatchupError,
ResetReputation, ReconsiderBadPeers. Cleanup(maxSize, ttl) is also
ported (the periodic driver lives in a follow-up).

ReconsiderBadPeers semantics swap from threshold-30/recover-50 to
threshold-20/recover-30 with exponential cooldown via LastReputationReset
+ ReputationResetCount, matching the original P2P behaviour.

ListPeers gains a sort_by_storage filter so catchup peer selection can
prefer "full" peers without a separate RPC.

Foundation only: still no consumer beyond existing tests.
…hain client

P2P no longer hosts peer state. The local PeerRegistry, peer_registry_cache,
and BanManager are deleted; the centralized blockchain peer registry (PR bsv-blockchain#832)
is the single source of truth for peer identity, ban scoring, reputation, and
runtime metrics. P2P holds a blockchain.PeerRegistryClientI and translates
between libp2p peer.ID and the registry's canonical string ID at every
boundary.

Architecture:
- p2p.Server.peerRegistry is now a gRPC client into the blockchain service.
  The daemon wires it via daemon_stores.GetPeerRegistryClient, mirroring the
  existing GetBlockchainClient pattern (always-gRPC, even single-binary).
- SyncCoordinator drops its banManager and *PeerRegistry fields; every
  registry callsite goes through the client. currentSyncPeer becomes a
  canonical libp2p ID string instead of peer.ID.
- PeerSelector takes []*blockchain.PeerInfo and returns string IDs.
- BanManager's threshold bookkeeping moves to the registry's AddBanScore
  RPC, which already syncs BanScore/IsBanned atomically. The disconnect-on-
  ban side effect is preserved via Server.onPeerBanned, invoked when AddBanScore
  reports a NEW ban transition.
- updateBytesReceived and RecordBytesDownloaded swap their racy
  read-modify-write for the registry's atomic UpdatePeerMetrics delta path.
- p2p.ClientI signatures and p2p_api proto wire format are preserved so
  external consumers (asset, RPC, blockvalidation, monitor, diagnose)
  compile unchanged. Conversion between blockchain.PeerInfo and
  p2p_api.PeerRegistryInfo lives in peerInfoToP2PProto.

Out of scope (PR2):
- External consumers migrate from p2p.ClientI to blockchain.PeerRegistryClientI
  directly.
- p2p_api.PeerRegistryInfo proto and the now-redundant P2P RPCs are deleted.
- TTL+LRU cleanup goroutine driver moves to blockchain Server.go (the
  Cleanup method itself ships in PR1).
- Settings rename (P2P.PeerCacheDir → BlockChain.PeerRegistryPath etc).

Tests:
- All P2P registry / cache / ban manager tests are deleted along with the
  files they exercise (~13K LOC of generated proto + test fixtures gone).
- Server_test, sync_coordinator_test, peer_selector_test, server_handler_test,
  sync_integration_test, report_invalid_block_test, catchup_metrics_integration_test
  are deleted in this commit; coverage is rebuilt against the new client in
  PR2 (tracked).
- A new blockchain.NewLocalPeerRegistryClient adapter ships in this commit
  for tests and same-process callers that want to avoid gRPC.
Remove was deleting both the peer entry and its ban-score entry, which broke
the ban-on-threshold flow: P2P Server.onPeerBanned calls removePeer right
after AddBanScore reports a fresh ban, and that path was wiping the score
needed for the very next IsBanned check. The end-to-end TestPeerIDBanE2E
caught the regression.

Bans are intentionally independent of peer presence: a peer cannot reset its
own ban by reconnecting. ClearBannedPeers is the explicit knob for wiping
ban state.
@oskarszoon oskarszoon force-pushed the feat/central-peer-registry branch from 3fa2071 to 250c297 Compare May 8, 2026 13:30
@oskarszoon oskarszoon changed the title feat(blockchain): add centralized peer registry foundation refactor(blockchain,p2p): move peer registry from p2p to blockchain May 8, 2026
@oskarszoon oskarszoon marked this pull request as ready for review May 8, 2026 13:44
oskarszoon added 3 commits May 8, 2026 15:54
- List(excludeBanned=true) acquires the write lock and runs ban expiry
  inline (expireBansLocked), so PeerInfo.IsBanned/BanScore are refreshed
  consistently with what the filter sees.
- Stop() returns the save error and logs at error level. Silent failures
  meant banned peers could reconnect after restart on disk-full or
  permission failures.
- maxReasonHistory cut from 50 to 20. Operators only need recent context;
  full history isn't useful and would scale to noticeable memory at fleet
  size.
- bytesToBlockHash logs to stderr when the input is non-empty but invalid,
  which usually signals a corrupted persisted registry.

Also marks go-libp2p-pubsub as indirect (no longer imported by anything in
this tree after the P2P registry removal) and applies gci formatting on
services/p2p/Server.go to satisfy CI.
Persistence
- Save() and Load() now serialise the banScores map alongside peers via a
  versioned envelope. Banned peers are exempt from the TTL cull on Load —
  otherwise a peer could clear its own ban by going quiet for the TTL
  window and reconnecting. Ban entries whose BanUntil already passed are
  dropped on load. New tests cover round-trip, TTL exemption, and expired
  ban cleanup.

gRPC round-trip
- protoToPeerInfo unconditionally sets TransportTypeSet=true. The wire
  format always carries transport_type, so a peer arriving via gRPC has
  always crossed an explicit-set boundary; subsequent Register updates
  that omit the field will now correctly preserve the transport.

Coverage
- New unit tests for handle_catchup_metrics (RecordCatchup{Attempt,Success,
  Failure,Malicious}, UpdateCatchup{Reputation,Error}, GetPeersForCatchup,
  ReportValid{Subtree,Block}, IsPeer{Malicious,Unhealthy}).
- New unit tests for peer_selector (full vs pruned, forced peer, sync
  cooldown, tie-breakers).
- New server-handler tests covering peerInfoToP2PProto round-trip,
  AddBanScore reasons, RecordBytesDownloaded delta accumulation,
  ResetReputation, GetPeerRegistry, GetPeer, IsBanned via the registry.

Tests use blockchain.NewLocalPeerRegistryClient against an in-memory
CentralizedPeerRegistry so they exercise the same code paths the gRPC-mode
daemon uses, just bypassing the wire round-trip.
Adds 33 tests against the rewritten p2p code so sonar's new-code coverage
gate has something to count.

server_helpers_test.go:
- addPeer / addConnectedPeer / removePeer / getPeer / updateStorage
  wrappers (all delegate to the centralized peer registry).
- InjectPeerForTesting (storage=full marker).
- getSyncPeer with no coordinator.
- getPeerIDFromDataHubURL lookup hit + miss.
- shouldSkipBannedPeer / shouldSkipUnhealthyPeer behaviour against the
  registry's ban + reputation state.
- addProtocolViolation accumulating to a ban via the registry.
- applyBanScore on a nil registry (no-panic).
- onPeerBanned with a non-decodable peer ID (no-panic).

sync_coordinator_test.go:
- isViableSyncCandidate filter cases.
- listAllPeers / GetCurrentSyncPeer / ClearSyncPeer.
- isCaughtUp paths: no peers, ahead peer, low-rep peer.
- HandlePeerDisconnected removes the registry entry.
- HandleCatchupFailure with no current sync peer (no-panic).
- getPeer round-trip via libp2p ID.
- Backoff lifecycle (enter, reset, multiplier).
- considerReputationRecovery is a no-op for healthy peers.
- UpdatePeerInfo registers in the centralized registry.
- UpdateBanStatus on an unknown peer (no-panic).
- TriggerSync with no eligible peers does not enter backoff.
- selectNewSyncPeer prefers full storage over pruned.

All tests use blockchain.NewLocalPeerRegistryClient against an in-memory
CentralizedPeerRegistry.
Comment thread services/blockchain/Server.go
oskarszoon added 7 commits May 8, 2026 18:58
…overage

CentralizedPeerRegistry now owns a stop channel and a wait group. Close()
signals the decay loop and blocks until it exits, so blockchain.Server.Stop
can call peerRegistry.Close() up front and snapshot Save() against a
quiesced registry. Closes the goroutine-leak finding from the latest
Claude review (Server.go:441) — the decay loop previously relied on
service-manager ctx cancellation racing with Stop's return.

Coverage:
- TestBanClose_StopsDecayGoroutineWithoutContextCancel + idempotency.
- TestPersistence_CorruptFileGetsRenamedAndRegistryStartsEmpty.
- TestPersistence_LoadAnchorsLastDecayWhenMissing.
- Server: 5 tests for updateBytesReceived (sender delta, gossip, same-peer
  guard, bad ID, nil registry).
- SyncCoordinator: 9 more tests covering filterEligiblePeers,
  logPeerList / logCandidateList, sendSyncMessage error paths,
  evaluateSyncPeer (no peer, low rep, missing peer).

953 tests pass with -race; golangci-lint clean.
Pushes new-code coverage past the gate by exercising paths the previous
test pass didn't reach.

sync_coordinator_test.go (+9):
- selectAndActivateNewPeer: enters backoff when no eligible peer; activates
  the eligible peer and stamps currentSyncPeer.
- activateSyncPeer: stores the ID and syncStartTime even when the kafka
  send fails (no producer wired in tests).
- sendSyncTriggerToKafka: nil producer / empty hash short-circuits.
- Start/Stop lifecycle: Stop returns within 2s after Start (catches
  goroutine leaks symmetrically with the registry-side Close test).
- checkAndClearExpiredBackoff: not-in-backoff vs still-in-window.

handle_catchup_metrics_test.go (+9):
- Invalid peer IDs return errors for RecordCatchupSuccess/Failure/Malicious
  and UpdateCatchupError.
- ReportValidSubtree/Block invalid peer IDs.
- ReportValidBlock empty-arg validation.
- IsPeerUnhealthy: invalid ID, empty ID, low success rate path.

971 unit tests pass with -race; golangci-lint clean.
1. Concurrent saves used the same `<path>.tmp`, so the periodic saver +
   shutdown could clobber each other's temp file. savePeerRegistry now
   uses os.CreateTemp(dir, basename+".tmp.*") for a unique, exclusive
   temp file per call, with cleanup on every error path. Added concurrent-
   save smoke test.

2. Load dropped expired ban entries but left PeerInfo.IsBanned=true on the
   peer struct, so selection / cleanup paths saw the peer as banned even
   though IsBannedPeer() returned false. Load now reconciles
   PeerInfo.IsBanned / BanScore against the surviving banScores map.
   Added test for the expired-on-disk case.

3. AddBanScore did not refresh a stale ban: an expired ban with
   Banned=true would absorb the new violation's score, then a later
   IsBannedPeer call would wipe everything. AddBanScore now expires the
   stale ban entry first (clears Banned/Score/Reasons + syncs to PeerInfo)
   so the new strike accumulates fresh and can re-arm a new ban via the
   threshold check. Added test.

4. Server.onPeerBanned hardcoded a 24h address-list ban, disagreeing with
   the centralized registry when p2p_ban_duration was configured. Now
   reads s.settings.P2P.BanDuration with a 24h fallback. Test confirms
   the helper still completes when P2PClient is nil.

5. Untrusted peer-supplied client names are now sanitized (length cap +
   strip control bytes / XSS-flavoured ASCII) before storage in the
   registry, restoring the protection the deleted P2P registry had. Both
   the new-entry and update branches of Register apply the filter. Tests
   cover sanitization and length cap.

Adjusted TestCentralizedPeerRegistry_Persistence_AllFields to also seed a
banScores entry so Load's new reconciliation keeps IsBanned=true.

977 tests pass with -race; golangci-lint clean.
Adds 13 tests aimed at the new-code lines sonar still flagged.

peer_registry_client_additional_test.go (+8):
- UpdateConnectionState round-trip (true/false flip).
- UpdateLastMessageTime advances the timestamp.
- UpdateStorage stores the mode.
- RecordSyncAttempt accumulates SyncAttemptCount; ClearAllSyncAttempts
  reports the cleared count.
- RecordBlock/Subtree/Transaction increment the per-type counters and
  blend AvgResponseTimeMs.
- RecordCatchupError stores message + timestamp.
- ResetReputation single peer + all-peers count.
- ReconsiderBadPeers respects cooldown (recent failure → 0).

server_test.go (+2):
- savePeerRegistryPeriodically writes the file at the configured tick
  and exits cleanly on ctx cancel.
- Stop persists the registry when PeerRegistryPath is set (also exercises
  the new Close() drain that runs before Save).

987 unit tests pass with -race; golangci-lint clean.
1. Save serialization (Server.go:794, peer_registry_persistence.go:71).
   savePeerRegistry already used os.CreateTemp for unique tmp paths, but
   two saves could still snapshot at different times and rename
   out-of-order, so the final file lost the latest state. Adds a
   per-registry saveMu around the entire snapshot+write+rename sequence
   so concurrent Save() calls cannot trample one another. Test runs 32
   concurrent writers and verifies the final envelope is intact with no
   leftover .tmp files.

2. Cleanup goroutine + configured TTL (Server.go:421, peer_registry.go).
   Adds CentralizedPeerRegistry.StartCleanup(ctx, interval, ttl, maxSize)
   which runs the existing Cleanup method on a ticker, registered with
   the same WaitGroup as StartBanDecay so Close() drains it cleanly.
   Server.Start now wires it from tSettings.P2P.PeerRegistryTTL,
   PeerRegistryCleanupInterval, PeerRegistryMaxSize. Load() also honours
   PeerRegistryTTL instead of the hardcoded 24h. Settings remain under
   tSettings.P2P.* in this PR; the ownership rename is a follow-up.
   Tests cover the loop running, exiting on Close(), and the
   zero-interval no-op path.

3. Expire bans before reads + cleanup (peer_registry.go).
   expireBansLocked now runs at the start of Get, List (any
   excludeBanned value), ListBannedPeers, and Cleanup. Previously only
   List(excludeBanned=true) ran the sweep, so dashboards / RPC clients
   could still see IsBanned=true on a peer whose ban window had elapsed,
   and Cleanup's isCleanupExempt() would treat that stale flag as a
   reason to keep an idle peer. Get and List now take the write lock so
   the snapshot they return reflects the elapsed ban window. Cleanup
   normalises before checking exemptions. Tests cover Get, List, and
   ListBannedPeers normalising an expired ban; a separate test confirms
   an expired-ban peer is no longer exempt from TTL cleanup.

994 unit tests pass with -race; golangci-lint clean.
…astMessageTime

Cleanup was using LastMessageTime as the TTL/LRU recency signal, but several
hot activity paths only refresh LastSeen:

- UpdateMetrics(LastSeen = now)
- recordReceivedSuccessLocked(LastSeen = LastInteractionSuccess) used by
  RecordBlockReceived / RecordSubtreeReceived / RecordTransactionReceived

So a peer that was actively delivering blocks/subtrees but not gossiping
could be evicted as stale on the first cleanup tick — a regression
amplified now that StartCleanup runs by default. Persistence Load already
uses LastSeen as its TTL signal, so the two paths disagreed.

Adds peerActivity(info) which prefers LastSeen and falls back to
LastMessageTime for older persisted records that pre-date that
convention. Cleanup uses peerActivity for both the TTL pass and the
LRU candidate sort.

Existing TTL tests updated to backdate both freshness fields. New tests:
- Cleanup keeps an active peer alive when LastSeen is fresh but
  LastMessageTime is stale.
- Cleanup falls back to LastMessageTime when LastSeen is zero.
Previously peerActivity only fell back to LastMessageTime when LastSeen
was zero, which let UpdateLastMessageTime callers get evicted: the RPC
refreshes only LastMessageTime, so a peer with an old non-zero LastSeen
plus a fresh LastMessageTime was treated as stale even though activity
just landed.

peerActivity now returns the newer of the two timestamps. Adds a
regression test where LastSeen is older than the TTL, LastMessageTime
is current, and Cleanup must keep the peer.
@oskarszoon oskarszoon requested review from galt-tr and icellan May 9, 2026 18:45
Comment thread services/blockchain/peer_registry_persistence.go Outdated
…iles

Per @icellan's review feedback. The previous implementation talked
directly to the filesystem (os.WriteFile / os.Rename), which matches
what the old P2P registry did but doesn't compose with the blob.Store
abstraction the rest of the service uses for subtrees, blocks, and
temp data.

Persistence now writes through a configured blob.Store, addressed by a
single fixed key ("peer-registry") with a new FileTypePeerRegistry tag
in pkg/fileformat. Operators choose the backend through the new
blockchain_peerRegistryStore URL setting (file://, s3://, memory://,
etc.); the previous blockchain_peerRegistryPath string setting is gone.

- savePeerRegistry/loadPeerRegistry take (ctx, blob.Store, ...) and
  use Set/Get/Exists/Del. saveMu still serializes the snapshot+write
  so concurrent saves cannot interleave.
- Save passes options.WithAllowOverwrite(true) because the registry is
  a single mutable blob, not append-only content.
- Corrupt blob handling drops the entry via store.Del instead of
  renaming a sibling .corrupted file.
- Server.Start constructs the blob.Store via blob.NewStore using
  blobstoretypes.PEERREGISTRYSTORE (new constant) and stores the
  reference on Blockchain. Stop saves through it then closes it.
- Existing tests rewritten to use an in-memory blob.Store helper
  (newTestBlobStore) instead of TempDir() + filesystem assertions.

1081 unit tests pass with -race; golangci-lint clean.
@oskarszoon oskarszoon requested a review from icellan May 20, 2026 10:44
- recordReceivedSuccessLocked: add InteractionAttempts++ so RecordBlockReceived
  and RecordSubtreeReceived paths contribute to totalAttempts in reputation
  calculation. Previously only UpdateMetrics paths counted attempts, producing
  an incorrect success rate when block/subtree paths were the primary source of
  positive signals.

- CentralizedPeerRegistry.Get: use RLock fast path; upgrade to write lock only
  when the target peer carries an expired ban that needs normalising. Avoids
  serialising every GetPeer call (shouldSkipUnhealthyPeer, getPeer, dashboard
  reads) through the write lock under normal operation.

- Register: stop bumping LastMessageTime unconditionally on metadata updates.
  LastMessageTime is now only set on new peers (matching LastSeen semantics).
  Unconditional updates prevented TTL eviction for peers that send metadata
  but no actual wire messages.

- isPeerBannedLocked: accept now time.Time param so callers sharing a timestamp
  avoid a time.Now() call per peer in hot loops. Update List() to pass a shared
  timestamp.

- shouldSkipUnhealthyPeer: cache reputation scores for 5 seconds per peer to
  avoid a gRPC round-trip on every pubsub message. Misses fall back to the
  registry as before.

- daemon/Stores.GetPeerRegistryClient: cache the gRPC connection like other
  singleton clients; previously a new connection was opened on each call.

- p2p/Server: rename getPeerCacheFilePath -> p2pCacheFilePath and update
  comment/filename to reflect that this is the libp2p peer-address cache, not
  the Teranode peer registry (now owned by the blockchain service).

- Server.New BanConfig: document why DecayInterval/DecayAmount are hardcoded
  to defaults rather than wired from settings.

- UpdateCatchupReputation: extend no-op comment to name the legacy consumer
  and note the removal condition.

- peer_registry_persistence.Load: add comment explaining why TTL filter uses
  LastSeen only rather than peerActivity() max.

- Fix PR body typo: blockchain_peerRegistryPath -> blockchain_peerRegistryStore
  in two places (operator notes and what's-not-in-this-PR sections).
@oskarszoon oskarszoon self-assigned this May 25, 2026
@icellan

icellan commented May 27, 2026

Copy link
Copy Markdown
Contributor

Code Review

Reviewed against origin/main. Build clean, go vet clean, go test -race ./services/blockchain/... ./services/p2p/... all green. The refactor is sound architecturally and the test coverage on the new code is good. Two issues should be addressed before merge, plus a handful of follow-ups.


Blocking — should fix before merge

1. Register does not reconcile re-attached ban stateMedium

services/blockchain/peer_registry.go:167-221

Remove deliberately preserves banScores[peerID] (correct: "bans must outlive peer disconnects"). But on reconnect, Register creates a fresh PeerInfo with IsBanned=false / BanScore=0 and never consults banScores:

if !exists {
    entry := *info       // caller's info has IsBanned=false, BanScore=0
    ...
    r.peers[info.ID] = &entry
    return
}

IsBannedPeer() is still correct (it reads banScores), and List(excludeBanned=true) is safe because isPeerBannedLocked also reads banScores. But the returned PeerInfo snapshot is stale: Get/List callers see IsBanned=false, BanScore=0 for a peer the registry actually considers banned. This is the same skew the Load path explicitly reconciles in peer_registry_persistence.go:198-213Register should do the equivalent:

if banEntry, ok := r.banScores[info.ID]; ok && banEntry.Banned {
    entry.IsBanned = true
    entry.BanScore = banEntry.Score
}

This affects dashboards, getpeerinfo RPC, and any code reading info.IsBanned directly — including isViableSyncCandidate in services/p2p/sync_coordinator.go, which means a freshly re-registered banned peer would pass viability checks until its next AddBanScore.

2. reputationCache grows unboundedlyMedium

services/p2p/server_helpers.go:918-961

shouldSkipUnhealthyPeer stores into s.reputationCache (a sync.Map) on every miss but never deletes anything. Disconnected peers, gossip relays that never reappear, transient bad-IDs — all accumulate forever. The 5s TTL only decides whether to use the value; expired entries are overwritten in place if the same peer is seen again, but otherwise stay resident. With peer churn this is a slow memory leak keyed by untrusted IDs.

Suggested fix: piggyback eviction on the existing peerMapCleanup loop, or sweep expired entries periodically.


Follow-ups (nice-to-have)

3. savePeerRegistryPeriodically not joined on shutdownLow

services/blockchain/Server.go (added in this PR)

peerRegistry.Close() drains ban-decay and TTL-cleanup goroutines via stopCh, but the periodic-save goroutine is only stopped via ctx.Done() on the Start context. Final Save in Stop is serialized via saveMu, so it's safe, but a save tick can fire after Stop returns. For symmetry, track it on the registry's wg.

4. persistedRegistry.Version parsed but never enforcedLow

services/blockchain/peer_registry_persistence.go:42, 92

persistedRegistryVersion = 1 but loadPeerRegistry never checks envelope.Version. A future-version blob is silently accepted with unknown fields ignored. Either reject Version > persistedRegistryVersion or remove the field until it's needed.

5. AddBanScore comment misleadingLow

services/blockchain/peer_registry.go:591-594

// Look up points for this reason, default to provided points
if configPoints, found := r.banConfig.ReasonPoints[reason]; found {
    points = configPoints
}

Behaviour is actually "config overrides caller". All P2P callsites pass 0, so functionally fine, but the gRPC API exposes the parameter — external callers would be surprised their value is silently discarded for known reasons. Fix the comment or rename the param to defaultPoints.

6. os.Stderr warnings bypass loggerLow

services/blockchain/peer_registry_grpc.go:284, services/blockchain/peer_registry_persistence.go:96

Two fmt.Fprintf(os.Stderr, ...) warnings bypass the structured logger and break the project's single-line-log convention. Pass a ulogger.Logger or surface the error to the caller.


Observations (not actionable)

  • Register of a new peer copies all caller-supplied counters verbatim (BytesSent, MaliciousCount, etc.) and only resets ReputationScore. P2P callers always pass fresh values, but the API would accept seeded state from any gRPC client.
  • List() always takes the write lock to run expireBansLocked inline. Listing is currently low-frequency so it's fine; if it becomes hot, an "any-expired-bans?" fast path could keep it on the read lock.

Verdict

Approve with changes. Fix #1 and #2; the rest can land separately.

@icellan icellan left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment

…egistry

Resolves conflicts in services/p2p/Server.go and services/p2p/server_handler_test.go.

Server.go: kept centralized-registry refactor (no local PeerRegistry / BanManager /
cleanup ticker fields). Adopted upstream's new invalidPolicyWarnOnce sync.Once.

server_handler_test.go: dropped upstream's ~980 lines of new tests because they
were written against the now-removed services/p2p.PeerRegistry. Ported the
oversized-message size-guard tests (the only block whose subject is upstream's
new functionality rather than registry plumbing) into a new
server_handler_size_test.go using blockchain.NewLocalPeerRegistryClient so the
4 size guards (rejected_tx / node_status / block / subtree) remain covered.

Verified: go build ./... clean, 1101 tests pass with -race across
services/blockchain and services/p2p.
… feat/central-peer-registry

# Conflicts:
#	services/p2p/Server.go
1. Register reconciles surviving banScores on new-entry path
(services/blockchain/peer_registry.go). Previously a peer evicted by
TTL/restart/Cleanup could re-attach via a fresh Register call with
IsBanned=false / BanScore=0, even though banScores still carried the
operator's ban. The new-entry branch now trusts banScores over the
caller-supplied fields, matching the Load reconciliation pattern.

2. reputationCache in shouldSkipUnhealthyPeer is now swept on every
peerMapCleanup tick (services/p2p/server_helpers.go). The sync.Map
was insert-only, so every unique gossiped peer ID accumulated an entry
for the process lifetime. Eviction piggybacks on the existing cleanup
goroutine — no new ticker.

Tests:
- TestRegister_ReconcilesSurvivingBanState — ban survives Remove, then
  Register with stale IsBanned=false still resolves to IsBanned=true.
- TestRegister_NoBanScoreSetsCleanState — caller's stale truthy claims
  are rejected when banScores has no entry.
- TestCleanupPeerMaps_EvictsExpiredReputationEntries — expired entry
  evicted, fresh entry survives.

1104 tests pass with -race across services/blockchain and services/p2p.
Comment thread services/blockchain/peer_registry_persistence.go
Comment thread services/blockchain/peer_registry_persistence.go Outdated
Comment thread services/blockchain/peer_registry.go
1. Save deep-copies BlockHash like Register does
(peer_registry_persistence.go). Today no path mutates the underlying
[32]byte (writers replace the pointer), but matching Register's pattern
removes the aliasing footgun for any future code path that touches
chainhash.Hash in place.

2. Logger + sidecar archive on corrupt blob
(peer_registry.go, peer_registry_persistence.go). The previous os.Stderr
fprintf bypassed the structured logger and the corrupt blob was deleted
outright with no forensic copy. Registry now carries an optional logger
(SetLogger; defaults via ulogger.New if unset) used at ERROR level for
corruption events. Before deleting the primary key, the bytes are
archived to a timestamped sidecar key peer-registry.corrupt-<unix> so
operators can post-mortem the failure.

3. Precision-preserving LastDecay advance (peer_registry.go AddBanScore
and decayBanScores). LastDecay used to clamp to "now" whenever decay
applied, silently discarding the sub-interval remainder
(decaySteps*interval < elapsed < (decaySteps+1)*interval) and slowly
extending effective ban duration. Now advances by exactly
decaySteps*DecayInterval, carrying the remainder into the next call.

Tests:
- TestPersistence_CorruptBlobArchivedToSidecar: corrupt blob copied to
  peer-registry.corrupt-<unix>, primary key removed, archive payload
  matches original bytes.
- TestBanDecay_LastDecayPreservesSubInterval: elapsed 3.5s with 1s
  interval → 3 steps applied, LastDecay = anchor+3s (not now).
- TestBanDecay_NoDriftAcrossManyCalls: 5 ticks across ~55ms with 10ms
  interval — total score loss matches elapsed/interval, no cumulative
  drift.

1107 tests pass with -race across services/blockchain and services/p2p.
@icellan

icellan commented May 28, 2026

Copy link
Copy Markdown
Contributor

Re-review

Both blocking items from the previous round are addressed, along with three additional fixes from the Copilot review. All tests pass under -race on services/blockchain and services/p2p. Ready to merge.

Blockers — resolved

1. Register reconciles surviving banScores (peer_registry.go:194-203)

New-entry branch now consults banScores and trusts it over the caller-supplied IsBanned/BanScore. Symmetrically: caller-supplied truthy values are rejected when no banScores entry exists. Mirrors the Load reconciliation pattern. Covered by TestRegister_ReconcilesSurvivingBanState and TestRegister_NoBanScoreSetsCleanState.

2. reputationCache eviction ✅ (server_helpers.go:828-843)

Expired entries swept on every peerMapCleanup tick — piggybacks on the existing cleanup goroutine, no new ticker. Concurrent refresh-then-delete is benign (worst case: one extra gRPC round-trip). Covered by TestCleanupPeerMaps_EvictsExpiredReputationEntries.

Additional fixes from Copilot review — all good

3. Save deep-copies BlockHash (peer_registry_persistence.go:147-151) — defensive symmetry with Register; today no writer mutates the array in place, so this is purely future-proofing.

4. Structured logger + sidecar archive on corrupt blob (peer_registry.go, peer_registry_persistence.go) — os.Stderr is gone. New SetLogger/log() pattern with a ulogger.New(\"CentralizedPeerRegistry\") fallback. Corrupt blobs are archived to peer-registry.corrupt-<unix> before deletion so operators can post-mortem. Covered by TestPersistence_CorruptBlobArchivedToSidecar.

5. Precision-preserving LastDecay (peer_registry.go:619, 788) — entry.LastDecay.Add(decaySteps * DecayInterval) instead of = now, so the sub-interval remainder carries forward. Prevents slow ban-extension drift over many calls. Covered by TestBanDecay_LastDecayPreservesSubInterval and TestBanDecay_NoDriftAcrossManyCalls.

Nits (not blocking)

Verification

Check Result
go vet ./services/blockchain/... ./services/p2p/... clean
go build ./services/blockchain/... ./services/p2p/... ./daemon/... clean
go test -race ./services/blockchain/ PASS (35.6s)
go test -race ./services/p2p/... PASS (13.2s)
New targeted tests all PASS

Verdict

Approve. Blockers resolved, fixes are clean and well-tested.

@oskarszoon oskarszoon requested a review from icellan May 29, 2026 09:57
@sonarqubecloud

Copy link
Copy Markdown

@oskarszoon oskarszoon merged commit 232d886 into bsv-blockchain:main May 29, 2026
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants