fix(blockassembly): fall back to row-oriented AddTxBatch on Unimplemented#897
Conversation
…nted During a rolling deploy where one block-assembly replica has been updated to expose AddTxBatchColumnar (bsv-blockchain#889) and another has not, gRPC returns codes.Unimplemented from the older server. Before this fix that propagated as a hard error and dropped the batch. Now the client transparently retries the same batch via the row-oriented sendBatchRowOriented path. Other errors continue to surface to the caller. Cost: one wasted RPC round-trip per batch against a persistently- old server; harmless during the deploy window and immediately moot once all replicas are upgraded. No metric or sticky disable is added in this commit — flagged in bsv-blockchain#828 review for a follow-up. Surfaced as a follow-up split out of PR bsv-blockchain#828; reviewable on its own.
|
🤖 Claude Code Review Status: Complete Current Review:
History:
The implementation is correct and follows the project's error handling patterns. The fallback logic properly distinguishes between Unimplemented (graceful fallback) and other errors (propagate to caller). Debug logging provides visibility during rolling deploys. Consider adding test coverage for the fallback path as noted in the inline comment. |
Benchmark Comparison ReportBaseline: Current: Summary
All benchmark results (sec/op)
Threshold: >10% with p < 0.05 | Generated: 2026-05-19 13:44 UTC |
A rolling-deploy mismatch where one block-assembly replica predates bsv-blockchain#889 silently drops to the row path; without a log line operators investigating "why is this batch slow" can't tell the fallback fired. Debug level keeps it out of normal log volume but available when needed.
|
Addressed the observability suggestion in 8f991f6: added Skipping the test suggestion for this PR — |
|
Brings in fff6d3e..origin/main (4 commits since previous merge): - #897 blockassembly columnar->row fallback on Unimplemented - #896 asset on-demand subtree-data admission control + 503/Retry-After - #766 utxo: verify spend ownership in Unspend - #700 health: disk space check Conflict resolutions: - services/blockassembly/Client.go, util/http.go, util/http_test.go: take theirs. Main has the post-Copilot-fix versions of these files from #896 and #897 (parseRetryAfter strict-Atoi, debug log on columnar fallback, extended TestParseRetryAfter cases) — they supersede the pre-fix versions still living on pr-828. - stores/utxo/aerospike/un_spend.go: take ours (native-op call, no SpendingData arg). The ownership check #766 added to the UDF path requires a coordinated update to the BSV-forked aerospike- server's subOpUnspend dispatcher. Until that lands, the native-op path on this branch skips ownership verification. Tracking issue filed.



Split out of #828 so it can be reviewed independently.
Summary
During a rolling deploy where one block-assembly replica has been updated to expose
AddTxBatchColumnar(introduced in #889) and another has not, gRPC returnscodes.Unimplementedfrom the older server. Before this change that error propagated as a hard failure and the batch was dropped. With this change the client transparently retries the same batch via the existing row-orientedsendBatchRowOrientedpath. Other errors continue to surface to the caller untouched.Cost
Follow-up (not in this PR)
Test plan
go build ./...cleango vet ./services/blockassembly/...cleanmake lint-newcleancodes.UnimplementedforAddTxBatchColumnarcauses the client to drop down to the row path with no error surfaced; verify a server that returns any OTHER gRPC code still surfaces the error.