Summary
Two unit tests fail intermittently in the test CI job under full-suite load, but pass deterministically in isolation (including -race -count=20 locally). Both assert on batch/iteration counts that appear sensitive to batcher flush timing, which changed recently in #1017 (per-batcher fixed-cadence flushing / SetTickInterval, commit bbb70638b).
These are flaky, not deterministically broken — a re-run of the same commit passed green.
Failing tests
- services/validator: TestValidateTransactionBatch_DuplicateOutpointCreatesConflicting
  Validator_test.go:395: Not equal: expected: 36, actual: 4
- services/legacy/netsync: TestSyncManager_createUtxos_ChunkFailureCancelsSiblings
  handle_block_test.go:1384: "4" is not less than or equal to "1"
  "mergeCtx short-circuit should suppress sibling iterations after a chunk fails; observed 4 post-trigger call(s)."
Where observed
CI test job on PR #1023 — run 26839805772 (DONE 10305 tests, 45 skipped, 2 failures). PR #1023 does not touch services/validator or services/legacy/netsync, and a re-run of the identical commit passed — so the failure is not attributable to that PR.
Reproduction attempts (local)
Both pass in isolation, single run and stressed:
go test ./services/validator/ -run '^TestValidateTransactionBatch_DuplicateOutpointCreatesConflicting$' -count=20 # ok
go test ./services/legacy/netsync/ -run '^TestSyncManager_createUtxos_ChunkFailureCancelsSiblings$' -count=20 -race # ok
The flake only surfaces under the CI runner's concurrent full-suite load, which is consistent with timing/scheduling sensitivity rather than a logic bug.
Suspected cause
Both assertions count emitted/observed items:
- validator expects 36 conflicting registrations but sees 4 — looks like a batch flushed early (fewer items grouped) so most conflicts weren't observed together.
- netsync expects ≤1 post-trigger sibling iteration but sees 4, which suggests the short-circuit raced the in-flight batch.
#1017 changed batcher flushing to a fixed cadence (SetTickInterval). A timing-driven flush boundary would plausibly change how many items land per batch under load, perturbing both count assertions. Worth confirming whether these tests pin the batcher tick / use a deterministic flush trigger rather than relying on wall-clock cadence.
Suggested fix direction
Make the two tests deterministic w.r.t. batch flushing — e.g. drive flushes explicitly (size-1 / manual flush / injected clock) instead of depending on the timer cadence, so they don't depend on CI load. Not a release blocker; it's test flakiness.