test(data): de-flake DeviceLinkRepositoryImplTest by running on the wall clock #5883
Merged
…all clock

reconcilePrunesShortCodesNoLongerInCatalog flaked in CI. reconcile() guards its network fetch with withTimeoutOrNull, whose 5s deadline is measured against the calling coroutine's clock. The test drove that clock with runTest/UnconfinedTestDispatcher (virtual time) while Room executes its DB work on the real ioDispatcher, which runTest neither owns nor awaits. Whenever the coroutine parked on a real IO thread the test scheduler idled and runTest fast-forwarded virtual time; under load it jumped past the 5s budget, so the fetch was reported as timed out, store() was skipped, and the cache kept the pre-prune rows [a, b] instead of [a].

A 24-thread stress harness reproduced this at ~0.3%/run, and the failure count exactly matched the "network refresh timed out" log count. Switching the test to real dispatchers + runBlocking, so the timeout is measured on the wall clock as in production, drove ~18k concurrent runs to zero failures.

Also corrects the configureCommon KDoc: Room already forces a single connection for in-memory databases regardless of the pool config, so multiConnection=false is not what makes this test deterministic.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
jamesarich added a commit to jeremiah-k/Meshtastic-Android that referenced this pull request on Jun 20, 2026
Brings the branch up to date with main and resolves the lone conflict in DeviceLinkRepositoryImplTest.kt by taking main's wall-clock harness (runBlocking + Dispatchers.Unconfined, from meshtastic#5883). The branch's runTest(UnconfinedTestDispatcher) variant would reintroduce the Room ioDispatcher virtual-clock flake meshtastic#5883 fixed; the test set is otherwise identical, so no coverage is lost. The refresh-timeout / NonCancellable production changes merge cleanly and are unchanged.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Why
DeviceLinkRepositoryImplTest.reconcilePrunesShortCodesNoLongerInCatalog fails intermittently in CI. It is a test-harness flake, not a product bug: the repository logic is correct; the test mixed runTest's virtual clock with Room's real dispatcher.
Root cause
reconcile() guards its network fetch with withTimeoutOrNull(5s), whose deadline is measured against the calling coroutine's clock. The test ran on runTest + UnconfinedTestDispatcher (virtual time), while Room executes its DB work on the real ioDispatcher, which runTest neither owns nor awaits. Each time the coroutine parked on a real IO thread, the test scheduler went idle and runTest fast-forwarded virtual time; under load it jumped past the 5s budget, so the fetch was reported as timed out, store() was skipped, and the cache kept the pre-prune rows [a, b] instead of [a].
This is purely a test artifact: in production the timeout runs on the wall clock against a real network call, so it never misfires.
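The clock mismatch described above can be illustrated with a minimal sketch. It is not the repository's code: fetchWithBudget, underVirtualClock, and onWallClock are hypothetical names, and the 100 ms Thread.sleep on Dispatchers.IO merely stands in for work running on Room's real ioDispatcher. The sketch assumes kotlinx-coroutines-core and kotlinx-coroutines-test are on the classpath.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.ExperimentalCoroutinesApi
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.test.UnconfinedTestDispatcher
import kotlinx.coroutines.test.runTest
import kotlinx.coroutines.withContext
import kotlinx.coroutines.withTimeoutOrNull

// Hypothetical stand-in for the guarded fetch inside reconcile(): the 5s
// budget is measured on the caller's clock, while the work itself runs on a
// real IO thread that the test scheduler neither owns nor awaits.
suspend fun fetchWithBudget(): String? =
    withTimeoutOrNull(5_000) {
        withContext(Dispatchers.IO) {
            Thread.sleep(100) // real elapsed time; virtual time cannot skip it
            "fetched"
        }
    }

// Old harness shape: the 5s deadline is an event on the test scheduler, which
// may be fast-forwarded while the coroutine is parked on the IO thread.
@OptIn(ExperimentalCoroutinesApi::class)
fun underVirtualClock(): String? {
    var result: String? = "unset"
    runTest(UnconfinedTestDispatcher()) { result = fetchWithBudget() }
    return result
}

// New harness shape: no test scheduler is involved, so the 5s deadline is
// measured in real time, as in production.
fun onWallClock(): String? =
    runBlocking(Dispatchers.Unconfined) { fetchWithBudget() }

fun main() {
    println("virtual clock: ${underVirtualClock()}")
    println("wall clock:    ${onWallClock()}")
}
```

On the wall clock the fetch finishes well inside its budget and returns "fetched". Under runTest the same call is expected to come back as null even though only about 100 ms of real time passed. This simplified form makes the misfire easy to hit; in the actual test the window was much narrower, which is consistent with the low failure rate reported for this PR.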
Changes
🐛 Bug Fixes
Run DeviceLinkRepositoryImplTest on real dispatchers + runBlocking instead of UnconfinedTestDispatcher + runTest, so reconcile()'s withTimeoutOrNull is measured on the wall clock (matching production) and runTest's virtual-clock fast-forward can no longer spuriously expire it.
🧹 Chores
configureCommon KDoc: clarify that Room 3 always serves in-memory databases from a single connection regardless of the pool config, and that this test's determinism comes from the wall clock, not from multiConnection.
Testing Performed
A 24-thread stress harness reproduced the flake: on main it failed ~0.3% of runs, and the failure count exactly matched the "network refresh timed out" log count, confirming the timeout-misfire mechanism. With this change, ~18,000 concurrent + ~15,000 sequential runs completed with zero failures, timedOut=0. The harness was removed before this PR.
./gradlew spotlessApply spotlessCheck detekt assembleDebug test allTests: BUILD SUCCESSFUL; all 6 DeviceLinkRepositoryImplTest methods pass.
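The stress measurement can be sketched roughly as follows. The actual harness was removed before this PR and is not shown here, so everything below is an assumption for illustration: oneRun, StressResult, and stress are hypothetical names, and oneRun is a trivial wall-clock stand-in for the test body, not the real test. The point is the bookkeeping: counting failures and timeouts separately, so that an exact match between the two supports the timeout-misfire explanation.

```kotlin
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit
import java.util.concurrent.atomic.AtomicInteger
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext
import kotlinx.coroutines.withTimeoutOrNull

// Hypothetical stand-in for one test run on the wall-clock harness: returns
// the rows left after pruning, or null if the 5s budget expired first.
fun oneRun(): List<String>? = runBlocking(Dispatchers.Unconfined) {
    withTimeoutOrNull(5_000) {
        withContext(Dispatchers.IO) { listOf("a", "b").filter { it == "a" } }
    }
}

data class StressResult(val runs: Int, val failures: Int, val timedOut: Int)

// Runs oneRun() runsPerThread times on each of `threads` worker threads and
// tallies wrong results and timeouts independently.
fun stress(threads: Int, runsPerThread: Int): StressResult {
    val failures = AtomicInteger()
    val timedOut = AtomicInteger()
    val pool = Executors.newFixedThreadPool(threads)
    repeat(threads) {
        pool.execute {
            repeat(runsPerThread) {
                val rows = oneRun()
                if (rows == null) timedOut.incrementAndGet()
                if (rows != listOf("a")) failures.incrementAndGet()
            }
        }
    }
    pool.shutdown()
    check(pool.awaitTermination(5, TimeUnit.MINUTES)) { "stress run did not finish" }
    return StressResult(threads * runsPerThread, failures.get(), timedOut.get())
}

fun main() {
    println(stress(threads = 24, runsPerThread = 50))
}
```

If failures and timedOut move together across runs, the wrong results are explained by the expired budget rather than by the pruning logic itself.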