Skip to content

fix: eliminate BLE keepalive coroutine accumulation (COLUMBA-40)#530

Merged
torlando-tech merged 1 commit intomainfrom
fix/ble-keepalive-oom
Feb 23, 2026
Merged

fix: eliminate BLE keepalive coroutine accumulation (COLUMBA-40)#530
torlando-tech merged 1 commit intomainfrom
fix/ble-keepalive-oom

Conversation

@torlando-tech
Copy link
Copy Markdown
Owner

Summary

  • Convert startKeepalive (client) and startPeripheralKeepalive (server) from fire-and-forget scope.launch wrappers to suspend functions — all call sites are already in suspend contexts
  • Replace untracked scope.launch { disconnect(address) } with inline disconnect() call, nulling keepaliveJob first to prevent self-cancellation
  • Add connection-exists guard after delay() to stop keepalives for already-disconnected devices

Fixes COLUMBA-40: OutOfMemoryError at 512MB heap from unbounded coroutine accumulation under sustained BLE churn (Samsung Galaxy Z Fold 5, Android 16).

Test plan

  • ./gradlew :reticulum:compileDebugKotlin — builds clean
  • ./gradlew :reticulum:testDebugUnitTest --tests "*.BleGattClient*" — all tests pass
  • Detekt passes (pre-push hook)
  • Manual: Connect 3+ BLE peers, idle 10min, force keepalive failures by moving out of range, confirm clean disconnect without heap growth

🤖 Generated with Claude Code

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Feb 23, 2026

Greptile Summary

Fixes OOM by eliminating unbounded coroutine accumulation in BLE keepalive mechanisms. Converted startKeepalive and startPeripheralKeepalive from fire-and-forget scope.launch wrappers to suspend functions, preventing coroutine leaks during sustained BLE churn.

  • Client-side (BleGattClient.kt): Added connection-exists guard after delay() at line 1146-1154 to stop keepalives cleanly when connections are removed by other code paths
  • Client-side (BleGattClient.kt): Replaced nested scope.launch { disconnect(address) } with inline disconnect() call at line 1188-1192, nulling keepaliveJob first to prevent self-cancellation
  • Server-side (BleGattServer.kt): Added identical connection-exists guard at line 1026-1034 for peripheral keepalive
  • Style changes: Converted several single-expression functions to expression-body syntax (lines 384, 424, 429, 1100-1107 in client; lines 445, 450, 500, 505, 978-985 in server)

The fix addresses COLUMBA-40 where 512MB heap exhaustion occurred on Samsung Galaxy Z Fold 5 under sustained BLE connection churn. All call sites verified to be in suspend contexts.

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • The changes are well-scoped to fix a specific memory leak without altering core logic. All call sites have been verified to be in suspend contexts, the conversion from fire-and-forget to suspend functions is mechanically sound, and the early-exit guards after delay() prevent keepalives from continuing for disconnected devices. The inline disconnect approach correctly nulls the keepalive job before calling disconnect to prevent self-cancellation. Style changes are cosmetic and improve readability.
  • No files require special attention

Important Files Changed

Filename Overview
reticulum/src/main/java/com/lxmf/messenger/reticulum/ble/client/BleGattClient.kt Converted startKeepalive from fire-and-forget to suspend function, added connection-exists guard after delay, replaced nested scope.launch disconnect with inline call
reticulum/src/main/java/com/lxmf/messenger/reticulum/ble/server/BleGattServer.kt Converted startPeripheralKeepalive from fire-and-forget to suspend function, added connection-exists guard after delay

Last reviewed commit: 9ca6185

Copy link
Copy Markdown
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile


// Continue with regular interval
while (isActive) {
delay(BleConstants.CONNECTION_KEEPALIVE_INTERVAL_MS)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing early-exit guard after delay (like client keepalive at BleGattClient.kt:1146-1154) to prevent keepalive from continuing after disconnect

Suggested change
delay(BleConstants.CONNECTION_KEEPALIVE_INTERVAL_MS)
delay(BleConstants.CONNECTION_KEEPALIVE_INTERVAL_MS)
// Early exit if connection was removed by another code path
val stillConnected = centralsMutex.withLock {
connectedCentrals.containsKey(address)
}
if (!stillConnected) {
Log.d(TAG, "Keepalive stopping for $address: connection no longer exists")
return@launch
}
Prompt To Fix With AI
This is a comment left during a code review.
Path: reticulum/src/main/java/com/lxmf/messenger/reticulum/ble/server/BleGattServer.kt
Line: 1024

Comment:
missing early-exit guard after delay (like client keepalive at BleGattClient.kt:1146-1154) to prevent keepalive from continuing after disconnect

```suggestion
                        delay(BleConstants.CONNECTION_KEEPALIVE_INTERVAL_MS)

                        // Early exit if connection was removed by another code path
                        val stillConnected = centralsMutex.withLock {
                            connectedCentrals.containsKey(address)
                        }
                        if (!stillConnected) {
                            Log.d(TAG, "Keepalive stopping for $address: connection no longer exists")
                            return@launch
                        }
```

How can I resolve this? If you propose a fix, please make it concise.

@sentry
Copy link
Copy Markdown
Contributor

sentry bot commented Feb 23, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

… OOM

Convert startKeepalive and startPeripheralKeepalive from fire-and-forget
scope.launch wrappers to suspend functions, since all call sites are
already in suspend contexts. Replace the untracked disconnect coroutine
with an inline call (nulling keepaliveJob first to prevent self-
cancellation). Add connection-exists guard after delay() to stop
keepalives for already-disconnected devices.

Fixes COLUMBA-40: OutOfMemoryError from coroutine accumulation under
sustained BLE churn on Samsung Galaxy Z Fold 5.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@torlando-tech
Copy link
Copy Markdown
Owner Author

@greptileai

@torlando-tech torlando-tech merged commit 04a8479 into main Feb 23, 2026
23 of 25 checks passed
@torlando-tech torlando-tech deleted the fix/ble-keepalive-oom branch February 23, 2026 05:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant