fix(ble): unblock reconnect + kable audit (logging, priority, backoff, StateFlow)#5222
Merged
Conversation
…agged - Introduce BleLoggingConfig (BleLogLevel + BleLogFormat) decoupled from Kable's API surface so Koin/test classpaths don't need Kable types. - Provide via Koin from BuildConfigProvider.isDebug: Events in debug, Warnings in release. Format defaults to Compact (single-line) which is much friendlier for adb logcat / grep than Kable's default Multiline. - Inject the config into KableBleConnectionFactory and KableBleScanner instead of the previously hard-coded Events/Multiline pair. - Tag the Android MTU negotiation log with the device address so multi-device sessions are unambiguous. - KoinVerificationTest: declare the wrapping enums as known extra types. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
❌ 1 Tests Failed:
View the top 1 failed test(s) by shortest run time
To view more test analytics, go to the Test Analytics Dashboard |
Kable optimization audit improvements (Phase A+B of full audit). Connection quality: - Request high connection priority (CONNECTION_PRIORITY_HIGH) right after the MTU exchange — meaningful throughput uplift on Android. - Document the rationale for the LE / 1M PHY default in KablePlatformSetup so future tuners know the constraints. Retry behavior: - BleRetry now does exponential backoff (factor 2, cap 2000ms) with ±25% jitter; default initial delay lowered 500ms→250ms. Avoids thundering-herd reconnects after firmware drops. Reactive surface: - BleConnection.deviceFlow / connectionState are now StateFlow, with distinct-equals semantics. Subscribers get the current value immediately on collect — no more missed-edge races on connect. - FakeBleConnection / CancellingProfileBleConnection updated to match. Profile / write path: - Cache toRadioWriteType once per service instead of recomputing per packet; halves the property reads on the hot send path. - Latch triggerDrain to capacity=1 (was 64) — drain is idempotent so one pending tick is sufficient. - Log swallowed logRadio errors at debug instead of silently dropping. Misc: - Clarify the autoConnect 'giving up' log message so it's clear we fall back to scan-and-connect on Android's autoConnect ceiling. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When firmware drops the link, BleRadioTransport's outer loop calls
connectAndAwait() again with the same address. Previously, every call
recreated the Kable Peripheral — tearing down its per-peripheral
coroutine scope, Bluetooth-state observer, and (on Android) the
underlying BluetoothGatt — only to immediately rebuild it.
Now we reuse the existing peripheral when:
- we already own one,
- the caller didn't supply a fresh advertisement (a fresh adv carries
scan timing/RSSI that we'd rather feed into a new peripheral), and
- the address matches the cached peripheralAddress.
The state observer job is left running in this path since the peripheral
keeps emitting through it; we only restart the observer when we install
a new peripheral. disconnect() still does a full close (peripheral +
ActiveBleConnection clear + address unset), preserving the existing
contract that BleRadioTransport.close() and the profile-error paths
fully release GATT resources.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- KableBleConnection: when reusing a peripheral, the long-lived stateJob observer kept a reference to the MeshtasticBleDevice instance captured at the previous connect() call. If the bonded-device cache swapped the instance for the same address (e.g. firmware-reported name change), every subsequent state transition was dispatched to the stale device, leaving the new instance's _state StateFlow stuck at Disconnected. Capture the active device in a @volatile field that the observer reads on each emission instead. - BleRetry: drop the hand-rolled pow() shim and use kotlin.math.pow. The shim's justifying comment was wrong — kotlin.math.pow is part of the common stdlib and is already used elsewhere in commonMain (NumberFormatter, LocationUtils, PositionPrecisionPreference). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Manually reverts 2edb98b and the activeDevice field added in 687e1e3 to support it. Real-hardware test: power-cycling the node leaves the reused Kable Peripheral holding a stale BluetoothGatt whose remote is gone, and peripheral.connect() on it fails to recover the link. Kable's SensorTag sample creates a fresh Peripheral per reconnect for exactly this reason — reuse is only safe for cooperative drops, not hard remote teardowns, which is the case we most need to recover from. The other audit wins (high connection priority, exponential backoff with jitter, StateFlow migration, cached writeType, latched triggerDrain, configurable logging) are independent and stay. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
attemptConnection wrapped its disconnect-watcher in
`coroutineScope { onEach{}.launchIn(this); first { Disconnected } }`.
coroutineScope waits for ALL launched children before returning, but a
launched collector on a hot StateFlow never completes naturally. Once
.first returned, the scope hung forever, blocking BleReconnectPolicy
from issuing the next attempt — so a node power-cycle (the canonical
'stable connection then remote drop' path) silently never reconnected.
Replace the coroutineScope wrapper with a plain
`connectionState.filterIsInstance<Disconnected>().first()` and call
onDisconnected() inline. Add a `first { Connected }` gate beforehand
to defeat the StateFlow stale-value race on attempt #2+, where the
previous cycle's Disconnected value would otherwise match immediately.
Add regression test `transport reconnects after a stable connection
is dropped remotely` exercising the full happy-path reconnect cycle,
plus `simulateRemoteDisconnect`/`connectAndAwaitCalls` plumbing on
FakeBleConnection.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
DesktopKoinTest now declares BleLogLevel/BleLogFormat as extraTypes (Koin Verify introspects the BleLoggingConfig data class constructor even though the factory builds it from constants), mirroring the Android verifier. The DFU transport tests' AutoRespondingBleConnection fake now exposes deviceFlow/connectionState as StateFlow to match the tightened BleConnection interface. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI
pushed a commit
that referenced
this pull request
Apr 24, 2026
…, StateFlow) (#5222) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: garthvh <1795163+garthvh@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
A focused pass over our Kable BLE integration. Configurable logging, perf/stability wins from a deep audit of
KableBleConnection+KableMeshtasticRadioProfile, and a fix for a long-standing reconnect hang that prevented recovery from node power-cycles.Empirically the handshake (post-connect nodeinfo dump) completes noticeably faster on real hardware, and reconnect now reliably recovers within ~8 s of a remote drop.
Reconnect hang fix (
18d1c0c8d)attemptConnectionwrapped its disconnect-watcher incoroutineScope { connectionState.onEach{...}.launchIn(this); connectionState.first { Disconnected } }.coroutineScopewaits for all launched children before returning, but a launched collector on a hotStateFlownever completes naturally. Once.firstreturned, the scope hung forever, blockingBleReconnectPolicyfrom issuing the next attempt — so a node power-cycle (the canonical "stable connection then remote drop" path) silently never reconnected.Replaced with a plain
connectionState.filterIsInstance<Disconnected>().first()plus an explicitfirst { Connected }gate beforehand to defeat the StateFlow stale-value race on attempt #2+. New regression testtransport reconnects after a stable connection is dropped remotelyexercises the full happy-path reconnect cycle that no existing test covered.Logging (
c22e23be6)BleLoggingConfigwrapper around Kable'sLogging.Level/Logging.Formatso DI consumers don't depend on Kable's API surface.Debug(Events) in debug builds andRelease(Warnings) in release; previously every build emitted Kable's per-event INFO/DEBUG spam.Multiline(5 lines per event) toCompact(one line) — much friendlier foradb logcat/ grep / bug reports.BleLoggingConfig.Debug.copy(level = BleLogLevel.Data).Connection quality (
fe9b77550)requestHighConnectionPriority()right after MTU exchange. Drops the connection interval to ~7.5–15 ms during the nodeinfo burst — usually the single biggest win on handshake latency.BleRetryrewritten: exponential backoff (×2, cap 2 s) with ±25 % jitter; default initial delay 500 ms → 250 ms. Avoids thundering-herd reconnects after firmware drops.KablePlatformSetup.Reactive surface (
fe9b77550)BleConnection.deviceFlowandconnectionStatemigratedSharedFlow→StateFlow. Subscribers get the current value immediately on collect — no more missed-edge races between connect and observer registration.FakeBleConnectionandCancellingProfileBleConnectionupdated to match.Hot-path micro-perf (
fe9b77550)toRadioWriteTypeonce per service instead of recomputing per packet.triggerDrainlatched tocapacity = 1(was 64) — drain is idempotent so one pending tick is sufficient.logRadioerrors now logged at debug instead of silently dropped.Code-review fixes (
687e1e3b9)BleRetry: dropped a hand-rolledpow()shim and usekotlin.math.pow(the shim's "JVM-only" justification was wrong — it's part of the common stdlib and already used elsewhere in commonMain).Reverted
2edb98baa(Peripheral reuse on reconnect) and theactiveDevicefield that supported it (5c744fcc6). Real-hardware testing showed that after a node power-cycle, callingperipheral.connect()on a reused KablePeripheralwhose underlyingBluetoothGattlost its remote does not recover the link. Kable's SensorTag sample creates a freshPeripheralper reconnect for exactly this reason — peripheral reuse is only safe for cooperative drops, not hard remote teardowns, which is the case we most need to recover from.Verification
./gradlew :core:ble:allTests :core:network:allTests :core:testing:allTests✅./gradlew :core:ble:detekt :core:network:detekt :core:testing:detekt✅./gradlew :core:ble:spotlessApply :core:network:spotlessApply :core:testing:spotlessApply✅./gradlew :app:testFdroidDebugUnitTest --tests "*KoinVerificationTest*"✅ (Koin full-graph A3VerifyModule)