fix(kafka): unwind dual flush_frequency linger; cut linger to 10ms on high-fanout topics #840

Closed
freemans13 wants to merge 48 commits into bsv-blockchain:feat/teranode-native-ops from freemans13:fix/kafka-flush-frequency-linger-regression

freemans13 commented May 11, 2026

Summary

Two complementary fixes for the txmeta producer-side latency regression that landed when the kafka client was switched from Sarama to franz-go (#611). Together they take p99 publish→consume latency from ~7 s to ~22 ms on the regression test. Already deployed to dev-scale-1/2 and confirmed working at production load — see "Post-deploy validation" below.

Commit 1: cut flush_frequency to 10ms on high-fanout topics

The Sarama → franz-go switch silently re-wired the URL query parameter flush_frequency from "max time between flushes" to franz-go's per-partition kgo.ProducerLinger. On the high-fanout txmeta topic (256 partitions, ~5 batched msgs/s/partition at peak), each partition rarely fills 1 MiB before the 1 s linger, so every record paid up to ~1 s of producer-side delay. The subtree-validator's local cache lagged the validator by 1–2 s and every subtree triggered ProcessTxMetaUsingCacheThresholdExceededError → 1 s RetrySleep → retry.

Drops flush_frequency from 1s to 10ms on the three high-fanout topics: kafka_txmetaConfig, kafka_validatortxsConfig.operator, kafka_legacyInvConfig. Low-volume topics (invalidBlocks, rejectedTx, unitTest) keep their existing 1s.

Commit 2: decouple outer batcher linger from flush_frequency

A subtler footgun in the same area: KafkaProducerConfig.FlushFrequency was driving two lingers at once — franz-go's per-partition ProducerLinger (the user-facing knob), and the outer async-batcher's straggler-flush timer (an internal implementation detail). Setting flush_frequency=1s stacked two 1-second lingers on the same publish path.

Introduces a new URL query param outer_batcher_linger (field: OuterBatcherLinger, default 10ms) controlling only the outer batcher. flush_frequency now controls only kgo.ProducerLinger, which is what an operator looking at the URL expects.

Pre-deploy evidence

Production (Prometheus, dev-scale-1/2, Friday May 8 2026, 18:00–21:00 UTC at 1.28 M TPS peak):

  • Only txmeta-dev-scale-1-scale-1 has producer-buffer backlog (mean ≈ 72 k msgs across 20 propagation pods, peak 186 k). Every other topic stays at 0.
  • teranode_kafka_producer_produce_request_latency_seconds p99 ≈ 642 ms, p50 ≈ 87 ms.
  • validate_subtree_retry rate ≈ 1–2 / s, matching the ~1.2 subtree/s rate — basically every subtree retries.
  • validate_subtree_duration p99 mean = 16 s, max 127 s.
  • bless_missing_transaction_count rate = 0 — retries always eventually succeed; the cache does fill, it just lags.

TestLingerLatencyRegression (OrbStack-backed Redpanda, 32-partition topic, 200 records 25 ms apart):

| Code state | flush_frequency | p50 | p99 |
| --- | --- | --- | --- |
| Before either fix (stacked outer + franz-go linger) | 1s | 4.49 s | 6.95 s |
| After commit 2 only (single franz-go linger) | 1s | 513 ms | 1.01 s |
| After commit 1 (flush_frequency=10ms in settings.conf) | 10ms | 21 ms | 22 ms |

Post-deploy validation

The matching configmap patch (flush_frequency=1s → flush_frequency=5ms on txmeta and legacyInv) was applied to dev-scale-1/2 at 2026-05-11 11:27 UTC. After ~22 min of sustained ~1.30 M TPS:

| Metric | Pre-fix (Fri peak) | Post-fix (Mon under load) | Change |
| --- | --- | --- | --- |
| Consumer rate variance | 200 k – 2.2 M/s (260% range, visible "gaps") | 1.28 – 1.32 M/s (3% range, smooth) | gaps gone |
| Producer buffered (txmeta) | mean 72 k, peak 186 k | max 2 | ~100 000× lower (vs. peak) |
| Producer e2e latency p99 | mean 408 ms, max 1.6 s | 63 ms (flat) | ~26× lower (vs. max) |
| Broker write latency p99 | mean 642 ms, max 1.8 s | 63 ms (flat) | ~28× lower (vs. max) |
| Subtree-validator goroutines | mean 28–36 k, peaks 185 k–693 k | stable 3.8–4.0 k | ~175× lower (vs. highest peak) |
| /metrics scrape duration | mean ~140 ms, max 10 s (timing out) | 5–11 ms | endpoint healthy |
| validate_subtree_retry rate | mean 0.9/s, peak 2.3/s (≈ 2 attempts/subtree) | mean 0.94/s = floor of 1/subtree | retries gone |
| validate_subtree_duration p99 | mean 16 s, max 127 s | 1.9 s and trending | ~8× faster (vs. mean) |

The "Tx Meta read from Kafka /second" Grafana panel is now flat at ~1.3 M/s on both pods — no near-zero dips, no scrape-induced "gaps". That panel's behaviour was the originating symptom.

One thing flagged for monitoring, not a regression: bless_missing_transaction_count is now firing at very low rates (mean 0.23/s on scale-1, 0.85/s on scale-2 with one 20.76/s burst) where it was zero before. Pre-fix that path never fired because the ThresholdExceededError → 1 s retry short-circuited every cache miss. Post-fix, the retry doesn't trigger, so genuine cache misses fall through to the legitimate "fetch from UTXO store" path. The miss rate is microscopic (≈0.00007% of txs), so this is fine — but if it grows it's the right alarm signal to surface, because it'll mean the cache is undersized rather than being masked by the retry loop.
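
The "≈0.00007% of txs" figure follows from dividing the miss rate by the transaction rate. A quick sketch of that arithmetic, using the 0.85/s (scale-2) and 0.23/s (scale-1) means and the ~1.3 M TPS load quoted above; `missRatePercent` is a hypothetical helper, not a function in the repository.

```go
package main

import "fmt"

// missRatePercent expresses cache misses per second as a percentage of
// transactions per second.
func missRatePercent(missesPerSec, txPerSec float64) float64 {
	return missesPerSec / txPerSec * 100
}

func main() {
	const tps = 1.3e6 // ~1.3 M TPS, from the validation run above
	fmt.Printf("scale-1: %.6f%%\n", missRatePercent(0.23, tps))
	fmt.Printf("scale-2: %.6f%%\n", missRatePercent(0.85, tps))
}
```

The scale-2 value lands at roughly 0.000065%, which rounds to the figure quoted; scale-1 is lower still.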

Follow-ups (intentionally out of scope here)

  1. dev-scale-1/2 configmap update. ✅ Already applied (with flush_frequency=5ms rather than the 10ms in this PR's defaults — both work). scale-1-shared-config.kafka_txmetaConfig in teranode-argocd-deployments was patched; the matching PR there should be linked.
  2. 256 partitions for one Redpanda broker is over-provisioned for the actual record rate; consider dropping to 32–64. Not required for this fix, but a contributing factor to the slow broker write p99 (now moot under low-linger config).

Test plan

  • go vet ./util/kafka/ clean.
  • go build ./util/kafka/ clean.
  • go test -short -count=1 ./util/kafka/ passes (all unit tests, including new TestNewKafkaAsyncProducerFromURLOuterBatcherLinger cases).
  • go test -v -run TestLingerLatencyRegression -timeout 5m ./util/kafka/ passes locally with the numbers above.
  • Deployed to dev-scale-1/2; metrics confirm the regression is resolved (see "Post-deploy validation").
  • Reviewer to confirm no settings_local.conf override for terabuild / mainnet / testnet / teratestnet relies on the old semantic.

🤖 Generated with Claude Code

freemans13 and others added 4 commits May 11, 2026 09:00
…k order (bsv-blockchain#717)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The franz-go switch rewired the URL query param `flush_frequency` from
Sarama's "max time between flushes" to franz-go's `kgo.ProducerLinger`,
which is a PER-PARTITION linger. On the dev-scale-1/2 txmeta topic
(256 partitions, ~5 batched msgs/s/partition at 1.2M TPS peak) each
partition rarely fills 1MiB before the 1s linger, so every record paid
~1s of producer-side delay. The subtree-validator's local cache lagged
the validator by 1-2s and every subtree hit ProcessTxMetaUsingCache's
ThresholdExceededError -> 1s RetrySleep -> retry — visible as the
"validate_subtree_retry" rate matching the subtree rate and as gaps in
the "Tx Meta read from Kafka /second" Grafana panel.

  - settings.conf: txmeta, validatortxs.operator, legacyInv get
    flush_frequency=10ms (was 1s). Low-volume topics (invalidBlocks,
    rejectedTx, unitTest) keep 1s; their per-partition rate is low
    enough that latency doesn't matter.
  - util/kafka/kafka_producer_async.go: documentation block at the
    franz-go option site explaining the Sarama->franz-go semantic
    shift for each `flush_*` URL param, so the next operator doesn't
    re-introduce this.
  - util/kafka/linger_latency_regression_test.go: regression test
    spinning up Redpanda via testcontainers, demonstrating that
    flush_frequency=1s produces p50 latency 200x larger than
    flush_frequency=10ms on a 32-partition topic with sparse feed.

NOTE: dev-scale-1/2 configmaps override flush_frequency=1s explicitly
in scale-1-shared-config.kafka_txmetaConfig — those need a matching
update in the teranode-argocd-deployments repo for the fix to land
in those clusters.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions Bot commented May 11, 2026

🤖 Claude Code Review

Status: Complete


Summary

This PR addresses a critical Kafka producer latency regression introduced during the Sarama → franz-go migration. The changes successfully decouple two stacked lingers that caused 1-7s publish→consume delays at production scale.

Key Changes Validated

Core Kafka Fix (util/kafka/):

  • ✅ Introduces OuterBatcherLinger field, decoupled from FlushFrequency
  • ✅ Correctly defaults to 10ms when not specified
  • ✅ Comprehensive unit test coverage verifying decoupling
  • ✅ New regression test (TestLingerLatencyRegression) with clear hypothesis

Configuration Updates (settings.conf):

  • ✅ Reduces flush_frequency from 1s → 10ms on high-fanout topics (txmeta, validatortxs, legacyInv)
  • ✅ Low-volume topics correctly retain 1s linger
  • ✅ Changes match production deployment that resolved the issue

Documentation (docs/topics/services/legacy.md):

  • ✅ Accurately describes checkpoint-based validation optimization
  • ✅ Clear distinction between checkpointed vs. non-checkpointed block handling

Concerns

[Minor] Documentation Scope:
The openapi/CHANGES.md file describes itself as "This PR" but references a different branch name (gokhan/swagger-rpc) than the actual PR branch. The OpenRPC spec addition appears unrelated to the Kafka linger fix — it should either be in a separate PR or the description should clarify why it is bundled here.

Production Validation

The PR description includes strong production evidence from dev-scale-1/2:

  • Producer latency p99: 642ms → 63ms (~10× improvement)
  • Subtree validation retries: eliminated
  • Buffered message backlog: 72k → 2 (~36,000× reduction)

The fix has already been deployed and validated at 1.3M TPS for 22+ minutes with no regressions.

Recommendation

Approve — The core Kafka changes are correct, well-tested, and production-proven. The documentation accuracy is good. Consider splitting the OpenRPC changes into a separate PR for clearer change tracking.

Splits the single FlushFrequency knob that previously drove both
franz-go's per-partition ProducerLinger AND the outer async-batcher's
straggler-flush timer. A new URL query param `outer_batcher_linger`
(field: OuterBatcherLinger, default 10ms) controls only the outer
batcher; `flush_frequency` now controls only kgo.ProducerLinger, which
is what an operator looking at the URL expects.

Without this fix, setting flush_frequency=1s — which on the dev-scale
clusters was the intent of "match Sarama's 1s Flush.Frequency" — stacked
two lingers on the same publish path. The regression test (sparse feed,
32 partitions) goes from p50=4.49s/p99=6.95s with the stacked behaviour
to p50=513ms/p99=1.01s with the franz-go linger alone (and to ~22ms p99
once flush_frequency is also lowered). The settings.conf change in the
first commit on this branch handles the second of those steps; this
change handles the first.

Adds unit-test coverage that:
  - the new URL param parses and applies (250ms test value),
  - flush_frequency=1s no longer influences OuterBatcherLinger.

Updates the integration test commentary to reflect that the outer
batcher's linger no longer stacks with franz-go's.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
freemans13 changed the title from "fix(kafka): cut flush_frequency to 10ms on high-fanout topics" to "fix(kafka): unwind dual flush_frequency linger; cut linger to 10ms on high-fanout topics" May 11, 2026

github-actions Bot commented May 11, 2026

Benchmark Comparison Report

Baseline: main (unknown)

Current: PR-840 (531d239)

Summary

  • Regressions: 0
  • Improvements: 0
  • Unchanged: 142
  • Significance level: p < 0.05
All benchmark results (sec/op)
Benchmark Baseline Current Change p-value
_NewBlockFromBytes-4 1.610µ 1.893µ ~ 0.400
SplitSyncedParentMap_SetIfNotExists/256_buckets-4 71.23n 71.23n ~ 1.000
SplitSyncedParentMap_SetIfNotExists/16_buckets-4 71.22n 71.27n ~ 1.000
SplitSyncedParentMap_SetIfNotExists/1_bucket-4 71.28n 71.28n ~ 0.800
SplitSyncedParentMap_ConcurrentSetIfNotExists/256_buckets... 38.67n 38.58n ~ 1.000
SplitSyncedParentMap_ConcurrentSetIfNotExists/16_buckets_... 57.74n 58.75n ~ 0.400
SplitSyncedParentMap_ConcurrentSetIfNotExists/1_bucket_pa... 152.2n 185.9n ~ 0.100
MiningCandidate_Stringify_Short-4 225.4n 220.9n ~ 0.100
MiningCandidate_Stringify_Long-4 1.664µ 1.663µ ~ 1.000
MiningSolution_Stringify-4 847.7n 863.4n ~ 0.200
BlockInfo_MarshalJSON-4 1.754µ 1.807µ ~ 0.100
NewFromBytes-4 141.5n 127.6n ~ 0.100
Mine_EasyDifficulty-4 60.61µ 60.58µ ~ 1.000
Mine_WithAddress-4 6.730µ 6.666µ ~ 0.100
DirectSubtreeAdd/4_per_subtree-4 58.32n 61.30n ~ 0.100
DirectSubtreeAdd/64_per_subtree-4 28.27n 31.61n ~ 0.100
DirectSubtreeAdd/256_per_subtree-4 27.27n 30.84n ~ 0.100
DirectSubtreeAdd/1024_per_subtree-4 26.20n 29.29n ~ 0.100
DirectSubtreeAdd/2048_per_subtree-4 25.99n 28.89n ~ 0.100
SubtreeProcessorAdd/4_per_subtree-4 279.7n 283.1n ~ 0.700
SubtreeProcessorAdd/64_per_subtree-4 275.6n 277.2n ~ 0.100
SubtreeProcessorAdd/256_per_subtree-4 279.0n 280.4n ~ 0.400
SubtreeProcessorAdd/1024_per_subtree-4 268.6n 272.1n ~ 0.100
SubtreeProcessorAdd/2048_per_subtree-4 267.0n 274.0n ~ 0.100
SubtreeProcessorRotate/4_per_subtree-4 273.9n 276.6n ~ 0.100
SubtreeProcessorRotate/64_per_subtree-4 273.7n 277.4n ~ 0.100
SubtreeProcessorRotate/256_per_subtree-4 273.2n 276.0n ~ 0.100
SubtreeProcessorRotate/1024_per_subtree-4 273.0n 280.3n ~ 0.100
SubtreeNodeAddOnly/4_per_subtree-4 54.35n 55.56n ~ 0.100
SubtreeNodeAddOnly/64_per_subtree-4 34.30n 34.53n ~ 0.300
SubtreeNodeAddOnly/256_per_subtree-4 33.36n 33.43n ~ 0.700
SubtreeNodeAddOnly/1024_per_subtree-4 32.70n 32.78n ~ 0.400
SubtreeCreationOnly/4_per_subtree-4 114.6n 113.7n ~ 0.700
SubtreeCreationOnly/64_per_subtree-4 401.6n 403.6n ~ 1.000
SubtreeCreationOnly/256_per_subtree-4 1.338µ 1.483µ ~ 0.100
SubtreeCreationOnly/1024_per_subtree-4 4.349µ 4.431µ ~ 0.200
SubtreeCreationOnly/2048_per_subtree-4 8.011µ 8.402µ ~ 0.100
SubtreeProcessorOverheadBreakdown/64_per_subtree-4 268.2n 270.6n ~ 0.400
SubtreeProcessorOverheadBreakdown/1024_per_subtree-4 269.3n 270.4n ~ 0.700
ParallelGetAndSetIfNotExists/1k_nodes-4 804.2µ 584.6µ ~ 0.100
ParallelGetAndSetIfNotExists/10k_nodes-4 1.577m 1.336m ~ 0.100
ParallelGetAndSetIfNotExists/50k_nodes-4 6.732m 6.747m ~ 0.700
ParallelGetAndSetIfNotExists/100k_nodes-4 13.63m 13.64m ~ 1.000
SequentialGetAndSetIfNotExists/1k_nodes-4 653.4µ 665.1µ ~ 0.100
SequentialGetAndSetIfNotExists/10k_nodes-4 2.783m 2.777m ~ 1.000
SequentialGetAndSetIfNotExists/50k_nodes-4 10.38m 10.46m ~ 0.100
SequentialGetAndSetIfNotExists/100k_nodes-4 19.90m 19.85m ~ 1.000
ProcessOwnBlockSubtreeNodesParallel/1k_nodes-4 637.7µ 630.7µ ~ 0.700
ProcessOwnBlockSubtreeNodesParallel/10k_nodes-4 4.263m 4.167m ~ 0.100
ProcessOwnBlockSubtreeNodesParallel/100k_nodes-4 16.74m 16.66m ~ 1.000
ProcessOwnBlockSubtreeNodesSequential/1k_nodes-4 701.2µ 704.7µ ~ 1.000
ProcessOwnBlockSubtreeNodesSequential/10k_nodes-4 5.912m 5.798m ~ 0.400
ProcessOwnBlockSubtreeNodesSequential/100k_nodes-4 37.44m 38.06m ~ 0.100
DiskTxMap_SetIfNotExists-4 3.735µ 3.970µ ~ 1.000
DiskTxMap_SetIfNotExists_Parallel-4 3.606µ 3.562µ ~ 0.700
DiskTxMap_ExistenceOnly-4 336.9n 312.8n ~ 0.200
Queue-4 186.2n 185.8n ~ 0.700
AtomicPointer-4 3.670n 3.279n ~ 0.100
ReorgOptimizations/DedupFilterPipeline/Old/10K-4 817.8µ 833.1µ ~ 0.200
ReorgOptimizations/DedupFilterPipeline/New/10K-4 776.0µ 771.5µ ~ 0.400
ReorgOptimizations/AllMarkFalse/Old/10K-4 122.8µ 115.0µ ~ 0.700
ReorgOptimizations/AllMarkFalse/New/10K-4 64.46µ 64.86µ ~ 0.700
ReorgOptimizations/HashSlicePool/Old/10K-4 56.75µ 61.53µ ~ 0.100
ReorgOptimizations/HashSlicePool/New/10K-4 10.94µ 11.03µ ~ 1.000
ReorgOptimizations/NodeFlags/Old/10K-4 4.469µ 4.466µ ~ 1.000
ReorgOptimizations/NodeFlags/New/10K-4 1.572µ 1.572µ ~ 1.000
ReorgOptimizations/DedupFilterPipeline/Old/100K-4 9.338m 9.431m ~ 0.400
ReorgOptimizations/DedupFilterPipeline/New/100K-4 10.106m 9.975m ~ 1.000
ReorgOptimizations/AllMarkFalse/Old/100K-4 1.116m 1.175m ~ 0.100
ReorgOptimizations/AllMarkFalse/New/100K-4 702.7µ 705.8µ ~ 0.400
ReorgOptimizations/HashSlicePool/Old/100K-4 461.5µ 578.5µ ~ 0.100
ReorgOptimizations/HashSlicePool/New/100K-4 205.7µ 201.4µ ~ 0.400
ReorgOptimizations/NodeFlags/Old/100K-4 46.28µ 47.69µ ~ 0.100
ReorgOptimizations/NodeFlags/New/100K-4 16.58µ 16.10µ ~ 0.400
TxMapSetIfNotExists-4 46.40n 46.41n ~ 0.500
TxMapSetIfNotExistsDuplicate-4 38.56n 38.73n ~ 0.100
ChannelSendReceive-4 606.7n 614.9n ~ 0.100
BlockAssembler_AddTx-4 0.03179n 0.03072n ~ 1.000
AddNode-4 11.70 12.07 ~ 0.700
AddNodeWithMap-4 12.28 12.43 ~ 1.000
CalcBlockWork-4 504.2n 470.1n ~ 0.100
CalculateWork-4 666.7n 633.9n ~ 0.700
BuildBlockLocatorString_Helpers/Size_10-4 1.319µ 1.331µ ~ 0.700
BuildBlockLocatorString_Helpers/Size_100-4 12.66µ 15.33µ ~ 0.100
BuildBlockLocatorString_Helpers/Size_1000-4 157.9µ 124.8µ ~ 0.100
CatchupWithHeaderCache-4 104.4m 104.3m ~ 0.700
_BufferPoolAllocation/16KB-4 4.961µ 3.631µ ~ 0.400
_BufferPoolAllocation/32KB-4 8.632µ 7.879µ ~ 0.700
_BufferPoolAllocation/64KB-4 17.72µ 15.67µ ~ 0.700
_BufferPoolAllocation/128KB-4 32.74µ 32.43µ ~ 0.400
_BufferPoolAllocation/512KB-4 116.5µ 128.1µ ~ 0.100
_BufferPoolConcurrent/32KB-4 18.62µ 20.40µ ~ 0.100
_BufferPoolConcurrent/64KB-4 29.29µ 32.56µ ~ 0.100
_BufferPoolConcurrent/512KB-4 146.2µ 159.4µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/16KB-4 635.5µ 680.5µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/32KB-4 664.6µ 669.9µ ~ 1.000
_SubtreeDeserializationWithBufferSizes/64KB-4 656.0µ 665.4µ ~ 0.400
_SubtreeDeserializationWithBufferSizes/128KB-4 679.0µ 668.5µ ~ 0.700
_SubtreeDeserializationWithBufferSizes/512KB-4 669.7µ 686.7µ ~ 0.700
_SubtreeDataDeserializationWithBufferSizes/16KB-4 36.27m 36.66m ~ 0.400
_SubtreeDataDeserializationWithBufferSizes/32KB-4 36.17m 36.50m ~ 0.100
_SubtreeDataDeserializationWithBufferSizes/64KB-4 36.11m 36.64m ~ 0.100
_SubtreeDataDeserializationWithBufferSizes/128KB-4 35.82m 36.44m ~ 0.200
_SubtreeDataDeserializationWithBufferSizes/512KB-4 35.82m 36.53m ~ 0.100
_PooledVsNonPooled/Pooled-4 740.8n 743.6n ~ 0.100
_PooledVsNonPooled/NonPooled-4 6.826µ 7.690µ ~ 0.200
_MemoryFootprint/Current_512KB_32concurrent-4 7.210µ 7.775µ ~ 0.100
_MemoryFootprint/Proposed_32KB_32concurrent-4 9.603µ 11.844µ ~ 0.100
_MemoryFootprint/Alternative_64KB_32concurrent-4 9.345µ 11.048µ ~ 0.100
SubtreeSizes/10k_tx_4_per_subtree-4 1.403m 1.359m ~ 0.700
SubtreeSizes/10k_tx_16_per_subtree-4 332.1µ 317.2µ ~ 0.200
SubtreeSizes/10k_tx_64_per_subtree-4 78.76µ 76.11µ ~ 0.100
SubtreeSizes/10k_tx_256_per_subtree-4 19.74µ 19.22µ ~ 0.100
SubtreeSizes/10k_tx_512_per_subtree-4 9.855µ 9.446µ ~ 0.100
SubtreeSizes/10k_tx_1024_per_subtree-4 4.902µ 4.634µ ~ 0.100
SubtreeSizes/10k_tx_2k_per_subtree-4 2.511µ 2.300µ ~ 0.100
BlockSizeScaling/10k_tx_64_per_subtree-4 80.58µ 73.12µ ~ 0.100
BlockSizeScaling/10k_tx_256_per_subtree-4 20.22µ 18.66µ ~ 0.100
BlockSizeScaling/10k_tx_1024_per_subtree-4 5.108µ 4.615µ ~ 0.100
BlockSizeScaling/50k_tx_64_per_subtree-4 433.1µ 396.0µ ~ 0.100
BlockSizeScaling/50k_tx_256_per_subtree-4 107.03µ 93.84µ ~ 0.100
BlockSizeScaling/50k_tx_1024_per_subtree-4 26.44µ 24.00µ ~ 0.100
SubtreeAllocations/small_subtrees_exists_check-4 180.3µ 164.9µ ~ 0.100
SubtreeAllocations/small_subtrees_data_fetch-4 179.7µ 170.0µ ~ 0.100
SubtreeAllocations/small_subtrees_full_validation-4 361.9µ 332.5µ ~ 0.100
SubtreeAllocations/medium_subtrees_exists_check-4 10.360µ 9.671µ ~ 0.100
SubtreeAllocations/medium_subtrees_data_fetch-4 10.72µ 10.27µ ~ 0.100
SubtreeAllocations/medium_subtrees_full_validation-4 20.82µ 19.52µ ~ 0.100
SubtreeAllocations/large_subtrees_exists_check-4 2.455µ 2.333µ ~ 0.100
SubtreeAllocations/large_subtrees_data_fetch-4 2.536µ 2.432µ ~ 0.100
SubtreeAllocations/large_subtrees_full_validation-4 5.186µ 4.843µ ~ 0.100
_prepareTxsPerLevel-4 394.6m 396.0m ~ 0.400
_prepareTxsPerLevelOrdered-4 3.950m 4.015m ~ 0.700
_prepareTxsPerLevel_Comparison/Original-4 406.4m 404.1m ~ 0.700
_prepareTxsPerLevel_Comparison/Optimized-4 3.606m 3.502m ~ 0.100
StoreBlock_Sequential/BelowCSVHeight-4 303.0µ 315.8µ ~ 0.100
StoreBlock_Sequential/AboveCSVHeight-4 312.7µ 313.8µ ~ 0.700
GetUtxoHashes-4 271.5n 274.2n ~ 1.000
GetUtxoHashes_ManyOutputs-4 45.81µ 46.28µ ~ 0.700
_NewMetaDataFromBytes-4 231.1n 230.5n ~ 0.700
_Bytes-4 616.8n 608.7n ~ 0.400
_MetaBytes-4 569.5n 558.6n ~ 0.700

Threshold: >10% with p < 0.05 | Generated: 2026-05-14 08:57 UTC
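
The threshold line above explains why every row shows "~" even where the raw numbers differ noticeably. The following is a sketch of that rule as stated (more than 10% change and p < 0.05), assuming a simple relative-change comparison; the benchmark tool's actual logic is not shown in this report, and `classify` and the `verdict` type are hypothetical names.

```go
package main

import (
	"fmt"
	"math"
)

type verdict string

const (
	regression  verdict = "regression"
	improvement verdict = "improvement"
	unchanged   verdict = "unchanged"
)

// classify applies the stated rule: a result only counts when the relative
// change exceeds 10% AND the p-value is below 0.05. Inputs are sec/op, so a
// larger current value means slower code.
func classify(baseline, current, pValue float64) verdict {
	const threshold = 0.10
	const alpha = 0.05
	if baseline <= 0 {
		return unchanged // cannot compute a relative change
	}
	change := (current - baseline) / baseline
	if pValue >= alpha || math.Abs(change) <= threshold {
		return unchanged
	}
	if change > 0 {
		return regression
	}
	return improvement
}

func main() {
	// ParallelGetAndSetIfNotExists/1k_nodes-4 from the table: 804.2µs -> 584.6µs, p = 0.100.
	fmt.Println(classify(804.2, 584.6, 0.100))
}
```

That row moves by about 27%, but its p-value of 0.100 fails the significance test, so it is reported as unchanged.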

oskarszoon and others added 21 commits May 11, 2026 17:50
Co-authored-by: gokhan-sagirlar <gokhan.sagirlar@coinbase.com>
freemans13 self-assigned this May 14, 2026
@freemans13

Closing in favour of #894, which is the same two-commit kafka fix but rebased onto main instead of feat/teranode-native-ops.

#894 contains only the focused 4-file change (settings.conf + 3 files in util/kafka/), with the same plain-English description, production validation, and benchmark numbers. The native-ops branch carried a lot of unrelated diff (149 files) that was making this PR hard to review as a standalone kafka fix.

Same production validation applies — already deployed and confirmed on dev-scale-1/2 since 2026-05-11.

freemans13 closed this May 19, 2026