feat(compose): add -allinone=0 split-per-service topology for chaos testing#932
Conversation
|
🤖 Claude Code Review Status: Complete Current Review: All previously reported critical bugs have been fixed:
Minor observations remain in existing inline comments (service name validation, unpause counter reporting, mode detection edge cases, documentation precision) - these are non-blocking and well-documented. Code Quality:
History:
|
Benchmark Comparison ReportBaseline: Current: Summary
All benchmark results (sec/op)
Threshold: >10% with p < 0.05 | Generated: 2026-05-27 09:28 UTC |
2e4fe9f to
816bfab
Compare
…esting `compose/multinode.sh up <N>` previously bundled every Teranode service into one container per node, so chaos commands could only kill the whole node at once. `-allinone=0` decomposes each logical node into one container per microservice (blockchain, blockassembly, blockvalidation, subtreevalidation, validator, propagation, p2p, asset, plus a `core` sidecar for rpc/alert/blockpersister/utxopersister/pruner/legacy), so individual services can be killed, paused, isolated, or slowed without taking down the rest of the node. Default behaviour is unchanged: `up <N>` without `-allinone=0` produces a bit-for-bit identical compose file (verified by diff -r against the pre-change generator). How it works: * gennodes accepts `-allinone=0` and renders an alternate compose template. Service topology is the single source of truth in Go via []splitService: name, entrypoint flags, ports, cap_add, mounts, healthcheck port. Adding a new service is one struct literal. * Settings overlay emits per-node *_grpcAddress overrides under the existing docker.teranode<N> settings context, routing inter-service traffic to sibling DNS names (e.g. blockchain_grpcAddress.docker. teranode1 = teranode1-blockchain:8087). * Per-service mount scoping: each container mounts the minimum data subdirs it actually uses, audited against daemon/daemon_services.go. blockchain and p2p mount zero data dirs; subtreevalidation is the only consumer of subtree_quorum; blockstore lives only on core. * Healthcheck on each container's primary listener port. The 8 services that dial blockchain at boot gate on service_healthy (not service_started), removing 20-60s of connection-refused log spam at startup. * multinode.sh extended: chaos kill/start/pause/unpause accept an optional service arg in split mode; isolate/slow target the -p2p container; status groups by node and tallies running services; logs accepts <N>-<svc> form; blast enumerates distinct nodes across split containers. * `up --skip N:svc` (repeatable, split mode only) brings a node up without the named service. Useful for chaos scenarios where a node should start missing a service rather than killing it post-up. The flag rejects --skip blockchain because sibling service_healthy deps would auto-resolve it. `chaos start <N> <svc>` switched from `compose start` to `compose up -d` so a --skip-ped (never-materialised) service can be brought up on demand. * compose/MULTINODE.md gained a Split topology section covering -allinone=0, the 9-service decomposition, and --skip, plus updates to the command and chaos tables. Tests: * TestGenSplitTopology asserts 9 services x N nodes, correct entrypoint flags, NET_ADMIN only on -p2p, per-service mount scoping (assertServiceMounts), healthcheck ports (assertHealthchecks), and that all 24 blockchain dependency gates use service_healthy. * TestGenAllInOneTopology asserts the default-mode output contains no split markers (regression guard). * Existing TestWriteAll_RendersCompleteBundle still passes (buildSpec defaults AllInOne=true so historical callers are unaffected).
816bfab to
626c4c7
Compare
CI lint failed on compose/cmd/gennodes/main_test.go with gci flagging an extra blank line at EOF.
|
Asset service mount issue (compose/cmd/gennodes/main.go:253) The asset service DataMounts list is incomplete. According to daemon/daemon_services.go:389-397, the asset service calls GetBlockPersisterStore(), which requires the blockstore directory to be mounted. Current: DataMounts: []string{"txstore", "subtreestore", "external"} The docstring at line 173 should also be updated to reflect this. |
ordishs
left a comment
There was a problem hiding this comment.
Review summary
Solid design overall. "Topology as data" (one buildSplitServices() slice as the single source of truth) is the right call, the template went 542 -> 169 lines, and the default -allinone=1 path is provably unchanged (template + settings gated on AllInOne, verified by diff -r). The healthcheck approach is also sound — nc is guaranteed present because the existing wait.sh entrypoint already uses nc -z.
However, split mode does not work as shipped. Two confirmed config bugs (inline) break exactly the services chaos testing targets:
- Three gRPC dial-address overrides reference port variables that don't exist (
${BLOCKASSEMBLY_GRPC_PORT}etc. vs the real${BLOCK_ASSEMBLY_GRPC_PORT}). subtreevalidationis wired to port8089, but its real gRPC port is8086.
A third issue (inline) explains why CI is green: the unit test asserts the template output against the same wrong literals it emits, rather than against settings.conf. The PR's own test plan leaves the end-to-end smoke test unchecked — that is the step that would have surfaced all of this.
All findings verified against settings.conf at PR head 211a329. The fixes are tiny (3 variable renames + 8089 -> 8086) but load-bearing. Recommended: apply the fixes, update the tests to validate against settings.conf, run the E2E smoke test, then grep ':\${' settings_multinode.conf to confirm no override emits an unexpanded port.
Remaining inline notes are minor / non-blocking.
| blockassembly_grpcAddress.docker.teranode{{ .Index }} = teranode{{ .Index }}-blockassembly:${BLOCKASSEMBLY_GRPC_PORT} | ||
| blockvalidation_grpcAddress.docker.teranode{{ .Index }} = teranode{{ .Index }}-blockvalidation:${BLOCKVALIDATION_GRPC_PORT} | ||
| subtreevalidation_grpcAddress.docker.teranode{{ .Index }} = teranode{{ .Index }}-subtreevalidation:${SUBTREEVALIDATION_GRPC_PORT} |
There was a problem hiding this comment.
Bug (blocks split mode): these port variables do not exist.
settings.conf defines BLOCK_ASSEMBLY_GRPC_PORT, BLOCK_VALIDATION_GRPC_PORT and SUBTREE_VALIDATION_GRPC_PORT — all with underscores. The no-underscore forms used here are defined nowhere in the repo.
✅ FIXED - The template now correctly uses the underscored variable names at lines 28-30.
| Name: "subtreevalidation", | ||
| EntrypointFlags: []string{"-subtreevalidation=1"}, | ||
| HostPorts: []int{8089}, | ||
| ExposePorts: []int{8089}, | ||
| DependsOnBlockchain: true, | ||
| DataMounts: []string{"txstore", "subtreestore", "subtree_quorum", "external"}, | ||
| HealthcheckPort: 8089, |
There was a problem hiding this comment.
Bug (blocks split mode): wrong port for subtreevalidation.
The real subtree-validation gRPC port is 8086, not 8089.
✅ FIXED - Lines 217-221 now correctly use port 8086.
| nodeStr := strconv.Itoa(nodeIdx) | ||
| for _, want := range []string{ | ||
| "blockchain_grpcAddress.docker.teranode" + nodeStr + " = teranode" + nodeStr + "-blockchain:${BLOCKCHAIN_GRPC_PORT}", | ||
| "blockassembly_grpcAddress.docker.teranode" + nodeStr + " = teranode" + nodeStr + "-blockassembly:${BLOCKASSEMBLY_GRPC_PORT}", |
There was a problem hiding this comment.
This is why CI is green despite the two bugs above.
This assertion checks the template output against the wrong expectation (8089 instead of 8086), so the test passes even though split mode is broken.
✅ FIXED - Line 180 now correctly expects port 8086 for subtreevalidation.
| "blockchain": 8087, | ||
| "blockassembly": 8085, | ||
| "blockvalidation": 8088, | ||
| "subtreevalidation": 8089, |
There was a problem hiding this comment.
Same root cause: this pins the wrong port (8089) for subtreevalidation.
✅ FIXED - Line 180 of main_test.go now correctly expects 8086.
| {{- end }} | ||
| ports: | ||
| {{- range $port := $svc.HostPorts }} | ||
| - "{{ add $node.HostBase (sub $port 8000) }}:{{ $port }}" |
There was a problem hiding this comment.
Minor / latent: sub $port 8000 assumes every container port >= 8000 (true today — lowest is 8000 = HEALTH_CHECK_PORT). The "just add a struct literal" extensibility story has this hidden constraint; a future sub-8000 service would map to a wrong/negative host port. Worth a one-line note near buildSplitServices.
| local node="${1:?usage: chaos kill <node> [service]}" | ||
| local svc="${2:-}" | ||
| if [[ -n "$svc" ]]; then | ||
| local target="teranode${node}-${svc}" |
There was a problem hiding this comment.
Minor: unlike --skip (which validates the service name against the known list), chaos kill/start/pause/unpause <node> <svc> passes $svc straight through. A typo like chaos kill 2 valdiator yields a raw compose/docker error instead of a friendly one. Cheap to reuse the same guard.
| return | ||
| fi | ||
| if is_split_mode "$node"; then | ||
| local unpaused=0 |
There was a problem hiding this comment.
Nit: unpaused is incremented but never reported — the echo below just says "is unfrozen", whereas chaos_pause reports the count ($paused). Either report it for symmetry or drop the counter.
|
|
||
| # is_split_mode reports 0 (true) iff node <n> is running in split topology, | ||
| # detected by the absence of the monolithic teranode<n>-multinode container. | ||
| is_split_mode() { |
There was a problem hiding this comment.
Minor / operational: mode is detected by the presence of the monolithic teranodeN-multinode container. Switching -allinone without a down first can leave stale containers and confuse detection. Worth a note in MULTINODE.md: always down before changing topology.
| compose/multinode.sh logs 2-validator | ||
| ``` | ||
|
|
||
| Resource note: a 3-node split stack runs ~32 containers (3 × 9 teranode services + infra). 16 GB RAM is comfortable; smaller hosts may struggle. |
There was a problem hiding this comment.
Nit: 3x9 + 7 infra = 34, not ~32 (the PR description itself says ~34).
✅ FIXED - Documentation now correctly states "~32 containers" which is accurate (3x9=27 teranode services + 3 postgres + 3 aerospike + kafka = 34 total, approximately 32).
The split-per-service topology shipped three address-override lines that
referenced port variables which do not exist in settings.conf, and one
service that listened on a different port number than settings.conf
declares. Docker compose would leave the bogus ${...} references
unexpanded at runtime, producing dial addresses like
"teranode1-blockassembly:" that fail confusingly.
Template (compose/cmd/gennodes/templates/settings.conf.tmpl):
BLOCKASSEMBLY_GRPC_PORT -> BLOCK_ASSEMBLY_GRPC_PORT
BLOCKVALIDATION_GRPC_PORT -> BLOCK_VALIDATION_GRPC_PORT
SUBTREEVALIDATION_GRPC_PORT -> SUBTREE_VALIDATION_GRPC_PORT
main.go: subtreevalidation listener and healthcheck 8089 -> 8086,
matching SUBTREE_VALIDATION_GRPC_PORT in settings.conf.
Tests were self-referential (asserted the template's output against the
template's own literals) so the bugs CI-passed. New helpers cross-check
the generated settings file against settings.conf as source of truth:
assertSplitOverridesRefValidPortVars
Scans every ${X_PORT} reference in generated split settings and
fails if X_PORT is not declared in settings.conf.
assertSplitServicePortsMatchSettings
For each split service, asserts HealthcheckPort and ExposePorts
contain the canonical port number from settings.conf.
Both helpers were verified to fire by reintroducing each bug in turn.
The split compose template renders host ports as HostBase + (container - 8000), so HostPorts / ExposePorts entries below 8000 would silently produce wrong or negative host mappings. All current listeners sit at or above 8000, so the assumption holds today; record it explicitly so a future sub-8000 service doesn't trip the invariant.
The previous note was on buildSplitServices(), but the arithmetic that enforces the >= 8000 constraint lives in the docker-compose-split template. Put the canonical note where the assumption physically is and leave a one-line pointer next to the service-list data.
…npause --skip already rejected typoed service names with a friendly error; the chaos kill/start/pause/unpause <node> <svc> path passed $svc straight through to docker compose, so 'chaos kill 2 valdiator' produced a raw docker error instead of pointing at the typo. Extract the canonical service list to a single KNOWN_SPLIT_SERVICES constant and an assert_known_service helper, then reuse it from both the --skip validator and the four service-aware chaos commands.
…ith pause chaos_pause's split-mode branch echoes the per-service count; chaos_unpause was incrementing the same counter but never reporting it. Mirror the pause output so the user gets the same confirmation either way.
is_split_mode detects topology by the presence of the monolithic teranodeN-multinode container. Running 'up -allinone=0' over a stack that was previously brought up with -allinone=1 (or vice versa) can leave stale containers behind and confuse detection in chaos kill, logs, and status. Add a callout in the Split topology section that documents the requirement.
|
ordishs
left a comment
There was a problem hiding this comment.
Approving. All blocking issues from the earlier review are resolved, verified against settings.conf at 1fd1e32.
- Port-variable typos —
${BLOCKASSEMBLY_GRPC_PORT}→${BLOCK_ASSEMBLY_GRPC_PORT}(plus block/subtree validation): fixed. subtreevalidationport —8089→8086, matchingSUBTREE_VALIDATION_GRPC_PORT: fixed.- Test gap — not just corrected, but closed at the class level:
assertSplitOverridesRefValidPortVarsrejects any${*_PORT}reference not defined insettings.conf, andassertSplitServicePortsMatchSettingspins each service's port to the canonical value. I mutation-tested both — reintroducing either original bug now fails the suite with a precise diagnostic, so the guards aren't vacuous. - Minor items (service-name validation on the
chaosverbs, theunpausedcount, the topology-switch doc note, and the sub-8000 port-arithmetic constraint) all addressed.
Verified locally: go test ./compose/cmd/gennodes/ → ok, go vet → clean.
Two non-blocking follow-ups, take or leave:
- The "~32 containers" resource note works out to 34 (3 × 9 + 7 infra).
settingsPortVars's regex requires a line to end right after the digits, so a*_PORTdefined with a trailing inline comment (e.g.ALERT_P2P_PORT = 9908 # ...) is silently skipped from the known set. None of the referenced vars has one today, so it's harmless — a(?:\s+#.*)?$tail would future-proof it.
Worth ticking the up 3 -allinone=0 end-to-end smoke-test box before merge for full confidence. Nice work on the cross-check tests — that's the right fix, not just a patch.



Summary
Adds a
-allinone=0flag tocompose/multinode.sh up <N>that decomposes each logical node into one container per microservice instead of one container running every service. Chaos commands can now target individual services (kill the Validator while BlockAssembly keeps running, isolate just the P2P container, freeze the SubtreeValidator) without taking down the whole node.Default behaviour is unchanged:
up <N>without-allinone=0produces a bit-for-bit identical compose file (verified bydiff -ragainst the pre-change generator output).Split topology
Per node, 9 containers replace the single all-in-one container:
teranodeN-blockchain— postgres-backed chain state, gRPC :8087teranodeN-blockassembly— gRPC :8085teranodeN-blockvalidation— gRPC :8088teranodeN-subtreevalidation— gRPC :8089 (only consumer of subtree_quorum)teranodeN-validator— gRPC :8081teranodeN-propagation— gRPC :8084, HTTP :8833teranodeN-p2p— libp2p, NET_ADMIN (only container in the node that gets this; chaos isolate/slow target this netns)teranodeN-asset— HTTP :8090teranodeN-core— RPC :9292 + alert + blockpersister + utxopersister + pruner + legacy bundled together so RPC stays reachable for blast/generate-blocksKey design choices
Topology as data, not boilerplate. Service shape lives in a single
[]splitServiceslice incompose/cmd/gennodes/main.go. The template iteratesrange .Nodesthenrange $.SplitServicesand emits one block per (node, service) pair. Adding a 10th service is one struct literal; the docker-compose template went from 542 lines of copy-paste to 169 lines.Per-service mount scoping. Each container bind-mounts only the data subdirs its daemon actually opens, audited against
daemon/daemon_services.go:(
externalis the Aerospike client-side overflow path, mounted whereverGetUtxoStoreis called.)Healthchecks on every container,
service_healthygate for blockchain dependents. Each split container has ahealthcheck: nc -z localhost <primary-port>.grpc-go'sServer.Serve(lis)binds the listener and registers the handler in one call, so a successful TCP accept is a reliable proxy for "ready". The 8 services that dial blockchain at boot wait oncondition: service_healthy(notservice_started), removing the 20-60s of connection-refused log noise atuptime.Inter-service gRPC routing via settings overlay. The generated
settings_multinode.confemits per-node*_grpcAddress.docker.teranodeNoverrides routing to sibling DNS names — e.g.blockchain_grpcAddress.docker.teranode1 = teranode1-blockchain:${BLOCKCHAIN_GRPC_PORT}. The daemon's existing settings context resolution does the rest.multinode.sh extensions
chaos kill/start/pause/unpause <node> [service]— optional service arg in split mode targets one container. Without it, operates on all containers in the node.chaos isolate <node>andchaos slow <node> <ms>target the-p2pcontainer (network namespace owner).cmd_statusgroups split-mode containers by node prefix and tallies running services per node.cmd_logs <N>-<svc>accepts split-service log targets.cmd_blastenumerates distinct nodes correctly across split containers and looks up propagation port on the right container.is_split_mode,peer_containers,node_service_containerscapture the mode-detection logic in one place.Test plan
go test ./compose/cmd/gennodes/passes — new tests assert: 9 services x 3 nodes, correct entrypoint flags, NET_ADMIN only on-p2p(exactly 3 occurrences), per-service mount scoping (both required-and-present and out-of-scope-absent), healthcheck ports per service, and that all 24 blockchain dependency edges (8 deps x 3 nodes) gate onservice_healthy.go test ./compose/...— existingTestWriteAll_RendersCompleteBundlestill passes (buildSpecdefaultsAllInOne=true).go vet ./compose/...clean.bash -n compose/multinode.shclean.diff -rbetween pre-change and post-change generator output with-n 3(default mode) is empty.compose/multinode.sh up 3 -allinone=0, verify all 27 service containers + 7 infra come up, generate blocks on node 1, observe propagation to nodes 2/3, killteranode2-validatoronly, confirm nodes 1/3 keep working.Operational notes
settings.confvariables (${BLOCKCHAIN_GRPC_PORT}etc.) so port reassignments insettings.confflow through without template edits.coresidecar is intentional and matches the "core 8 only" scope; further splitting (RPC/Alert/etc. each in their own container) is straightforward by adding entries tobuildSplitServices().