fix(aerospike): stop enabling client metrics (#1001)#1003
Conversation
|
🤖 Claude Code Review Status: Complete Current Review: No issues found. The change is correct and well-documented. Summary:
The implementation aligns with AGENTS.md guidelines: minimal change, clear rationale, correctness over speed. The existing test at |
ordishs
left a comment
There was a problem hiding this comment.
Approve. Verified against the aerospike-client-go fork (v8.7.1-bsv3): the per-command metrics path is fully gated behind cluster.metricsEnabled, which only flips true via EnableMetrics — removing the call leaves it false so the quadratic updateOrInsert path is never entered. client.Stats() reads connection/node counts directly and is independent of EnableMetrics, so the poller still exports connection-pool and node-count series; only per-command latency histograms and result-code counters go dark, as described. Sound, low-risk stop-gap. Follow-up: ensure the re-enable is tracked against #1001 and give dashboard owners a heads-up about the zeroed series.
Stop calling client.EnableMetrics(nil) in initStats. The fork's per-record stats path (nodeStats.updateOrInsert) is quadratic per batch: batch commands have getNamespace()==nil, so updateOrInsert iterates every record's namespace via nsIter, and it is already called once per record. An N-record batch does ~N^2 increments against a sync.RWMutex map. This is a general cost on every batch command; mainnet IBD just exposes it most (44-60% of legacy CPU) because batches are largest, concurrency highest, and aerospike is local so there is no network latency to mask it. The hot path is gated behind cluster.metricsEnabled, which only teranode flips on. Removing the call leaves it false and the path is never entered. Connection-pool and node-count metrics are recorded unconditionally, so the poller still exports them; only the per-command latency histograms and result-code counts go dark. Stop-gap. Real fix is in the aerospike-client-go fork, tracked in bsv-blockchain#1001. Guard initStats against a nil client: the old EnableMetrics call panicked synchronously on nil, which TestInitStatsBasic relied on via recover(). With it gone the nil-deref moved into the background poller goroutine (uncatchable), so add an explicit early return.
2221c6b to
d0b7d72
Compare
Benchmark Comparison ReportBaseline: Current: Summary
All benchmark results (sec/op)
Threshold: >10% with p < 0.05 | Generated: 2026-06-01 14:53 UTC |
|



Stop calling
client.EnableMetrics(nil)ininitStats. The fork's per-record stats path (nodeStats.updateOrInsert) is quadratic per batch — batch commands havegetNamespace()==nil, soupdateOrInsertiterates every record's namespace viansIter, and it is already called once per record. An N-record batch does ~N² increments against async.RWMutexmap. This is a general cost on every batch command; mainnet IBD just exposes it most (44–60% of legacy CPU) because batches are largest, concurrency highest, and aerospike is local so there is no network latency to mask it.The hot path is gated behind
cluster.metricsEnabled, which only teranode flips on. Removing the call leaves itfalseand the path is never entered. Connection-pool and node-count metrics are recorded unconditionally, so the poller still exports them — only the per-command latency histograms and result-code counts go dark.Stop-gap. Real fix is in the
aerospike-client-gofork, tracked in #1001.