Skip to content

fix: path persistence + proactive path resolution#664

Merged
torlando-tech merged 9 commits intorelease/v0.9.xfrom
fix/path-persistence-proactive-resolution
Mar 12, 2026
Merged

fix: path persistence + proactive path resolution#664
torlando-tech merged 9 commits intorelease/v0.9.xfrom
fix/path-persistence-proactive-resolution

Conversation

@torlando-tech
Copy link
Copy Markdown
Owner

@torlando-tech torlando-tech commented Mar 11, 2026

Summary

  • Fix path persistence: shutdown() now calls RNS.Transport.persist_data() before clearing global state, so paths survive graceful shutdowns
  • Fix stale BLE path cleanup: Only clears timestamp=0 paths (the actual Reticulum bug), no longer nukes valid paths from prior sessions
  • Add periodic persistence: Transport data saved every 15 minutes via IdentityResolutionManager for crash resilience
  • Add proactive path requests: Startup sweep, conversation-open, and contact-add triggers ensure paths are requested immediately rather than waiting for the 15-minute background loop

Manual Testing Checklist

Path Persistence (Part A)

  • Install app, start a conversation with a known contact (path established)
  • Close app normally (back button or swipe away)
  • Restart app — check logcat for IdentityResolutionMgr startup sweep: paths should already exist (skipping messages)
  • Send a message immediately after restart — should succeed without delay

Crash Resilience (Part B)

  • Install app, chat with someone, wait ~1 minute
  • Force-kill app from Settings > Apps > Force Stop
  • Restart app — check logcat for startup sweep path requests (2s intervals)
  • Verify persistTransportData calls appear in logcat every 15 minutes

Stale Path Fix (Part A2)

  • Check logcat during startup for _clear_stale_ble_paths — should only clear timestamp=0 entries, not age-based removals

Proactive Path Requests — Contact Addition (Part C)

  • Add a hash-only contact → check logcat for immediate path request (IdentityResolutionMgr: Requesting path for...)
  • Add a full-identity contact (lxma:// URL) → check logcat for path request
  • Add contact from announce stream (star) → check logcat for path request
  • Save a conversation peer to contacts → check logcat for path request

Proactive Path Requests — Conversation Open (Part C3)

  • Open a conversation when path doesn't exist → check logcat for Requesting path for conversation...
  • Open a conversation when path exists → should NOT see path request (skipped)

Retry Resolution (Part C2)

  • Add a hash-only contact that can't be resolved
  • Tap "Retry" on the unresolved contact → check logcat for path request via retryResolution

Regression Check

  • All existing unit tests pass (./gradlew :app:testNoSentryDebugUnitTest)
  • Normal messaging workflow still works (send/receive messages)
  • BLE interface connections still work
  • App doesn't crash on startup

🤖 Generated with Claude Code

Fix three compounding bugs that destroyed Reticulum's path table on every
restart, causing messages to fail until paths were rediscovered (~15 min):

1. shutdown() cleared destination_table before persisting — now calls
   RNS.Transport.persist_data() before clearing global state
2. _clear_stale_ble_paths() was too aggressive, removing all BLE paths
   >60s old at startup — now only clears timestamp=0 (the actual bug)
3. No crash resilience — added periodic persist_transport_data() every
   15 minutes via IdentityResolutionManager

Additionally adds proactive path requests as a safety net:
- Startup sweep requests paths for all contacts (2s stagger)
- Path requested on conversation open if missing
- Path requested when adding contacts (hash-only, full-identity,
  announce, save-from-conversation)
- Retry resolution delegates to IdentityResolutionManager

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 11, 2026

Greptile Summary

This PR addresses two related reliability gaps in Reticulum path management: paths were not being saved on graceful shutdown (or periodically for crash resilience), and the BLE path cleanup was too aggressive — removing valid paths from prior sessions based on age rather than only the timestamp=0 bug entries. It also wires up proactive path requests at every user-facing entry point (contact add, conversation open, retry, announce-stream star) so paths are populated immediately rather than waiting for the background 15-minute loop.

Key changes:

  • Shutdown persistence: reticulum_wrapper.py now calls RNS.Transport.persist_data() in Step 3.5 of shutdown(), before global state is cleared.
  • Periodic crash-resilience: IdentityResolutionManager.checkPendingContacts() now calls persistTransportData() at the end of every 15-minute cycle, piggybacking persistence on the existing background loop.
  • Startup sweep: A one-shot coroutine (startupSweepJob) fires 5 seconds after start(), iterates all ACTIVE + PENDING contacts, and issues path requests (with hasPath guard and 2-second stagger) for any without a known path.
  • Proactive path requests: Four ViewModels (ContactsViewModel, ChatsViewModel, AnnounceStreamViewModel, MessagingViewModel) now call identityResolutionManager.requestPathForContact() at the appropriate lifecycle points, all guarded by hasPath to avoid unnecessary network traffic.
  • Stale BLE path fix: _clear_stale_ble_paths now only removes entries with timestamp=0, preserving valid paths from prior sessions.
  • Previous review comments addressed: RETICULUM_AVAILABLE guard added to persist_transport_data; RoutingManager.persistTransportData() now inspects and logs the Python result; double resetContactForRetry call eliminated.

Confidence Score: 4/5

  • Safe to merge with one minor re-entrance edge case in IdentityResolutionManager.start().
  • All previous review comments have been properly addressed. The implementation is well-structured with consistent patterns across ViewModels, proper error handling throughout, and the Python wrapper changes are guarded correctly. The one remaining issue — startupSweepJob not being included in the start() re-entrance guard — is an edge case that requires an abnormal resolutionJob termination without stop() being called, which is unlikely under normal lifecycle management.
  • app/src/main/java/com/lxmf/messenger/service/IdentityResolutionManager.kt — review the start() re-entrance guard to ensure startupSweepJob cannot be double-launched.

Important Files Changed

Filename Overview
app/src/main/java/com/lxmf/messenger/service/IdentityResolutionManager.kt Central file for this PR. Adds periodic transport persistence, one-shot startup sweep, and requestPathForContact. Minor re-entrance guard gap: startupSweepJob is not checked alongside resolutionJob in start().
python/reticulum_wrapper.py Adds persist_transport_data() with proper RETICULUM_AVAILABLE guard; fixes stale BLE path cleanup to only remove timestamp=0 entries; adds shutdown-time persistence before state clear. All changes are clean.
app/src/main/java/com/lxmf/messenger/service/manager/RoutingManager.kt Adds persistTransportData() with Chaquopy result inspection and warning logging on failure. Well-implemented.
app/src/main/java/com/lxmf/messenger/viewmodel/ContactsViewModel.kt Correctly wires proactive path requests for all three add-contact flows. Previous double-resetContactForRetry issue is resolved — reset stays in ViewModel, retryResolution only fires the path request.
app/src/main/java/com/lxmf/messenger/viewmodel/MessagingViewModel.kt Adds proactive path request on conversation open via identityResolutionManager.requestPathForContact. Clean, non-blocking IO dispatch.
app/src/main/java/com/lxmf/messenger/viewmodel/AnnounceStreamViewModel.kt Uses identityResolutionManager.requestPathForContact (with hasPath guard) when starring an announce entry, consistent with other ViewModels.
app/src/main/java/com/lxmf/messenger/viewmodel/ChatsViewModel.kt Adds path request when saving a conversation peer to contacts. Correctly delegates to identityResolutionManager.

Sequence Diagram

sequenceDiagram
    participant App as App Lifecycle
    participant IRM as IdentityResolutionManager
    participant VM as ViewModel (Contacts/Chats/Messaging/Announce)
    participant RNS as ReticulumProtocol
    participant PY as reticulum_wrapper.py

    App->>IRM: start(scope)
    IRM->>IRM: launch resolutionJob (every 15 min)
    IRM->>IRM: launch startupSweepJob (5s delay)

    Note over IRM: Startup sweep (t+5s)
    IRM->>RNS: hasPath(destHash) for each contact
    alt path missing
        IRM->>RNS: requestPath(destHash)
        IRM->>IRM: delay 2s (stagger)
    end

    Note over IRM: Periodic loop (every 15 min)
    IRM->>RNS: recallIdentity / requestPath for PENDING contacts
    IRM->>RNS: persistTransportData()
    RNS->>PY: persist_transport_data()
    PY->>PY: RNS.Transport.persist_data()

    Note over VM: User action (add contact / open conversation)
    VM->>IRM: requestPathForContact(destinationHash)
    IRM->>RNS: hasPath(destHash)
    alt no path
        IRM->>RNS: requestPath(destHash)
    end

    Note over App: Graceful shutdown
    App->>PY: shutdown()
    PY->>PY: RNS.Transport.persist_data() [Step 3.5]
    PY->>PY: Clear RNS global state
Loading

Comments Outside Diff (1)

  1. app/src/main/java/com/lxmf/messenger/service/IdentityResolutionManager.kt, line 57-81 (link)

    startupSweepJob not guarded by re-entrance check

    The re-entrance guard at line 57 only checks resolutionJob?.isActive. If the loop job has ended abnormally (e.g. the scope cancelled it externally) while startupSweepJob is still waiting out its 5-second delay, calling start() again will launch a second startupSweepJob without cancelling the first — leading to two concurrent startup sweeps hitting the network.

    A minimal fix is to include the sweep job in the guard:

    Alternatively, explicitly cancel the old sweep job before (re)launching:

    startupSweepJob?.cancel()
    startupSweepJob = scope.launch(Dispatchers.IO) {
        delay(STARTUP_SWEEP_DELAY_MS)
        requestPathsForAllContacts()
    }
Prompt To Fix All With AI
This is a comment left during a code review.
Path: app/src/main/java/com/lxmf/messenger/service/IdentityResolutionManager.kt
Line: 57-81

Comment:
**`startupSweepJob` not guarded by re-entrance check**

The re-entrance guard at line 57 only checks `resolutionJob?.isActive`. If the loop job has ended abnormally (e.g. the `scope` cancelled it externally) while `startupSweepJob` is still waiting out its 5-second delay, calling `start()` again will launch a second `startupSweepJob` without cancelling the first — leading to two concurrent startup sweeps hitting the network.

A minimal fix is to include the sweep job in the guard:

```suggestion
        if (resolutionJob?.isActive == true || startupSweepJob?.isActive == true) {
            Log.d(TAG, "Resolution manager already running")
            return
        }
```

Alternatively, explicitly cancel the old sweep job before (re)launching:

```kotlin
startupSweepJob?.cancel()
startupSweepJob = scope.launch(Dispatchers.IO) {
    delay(STARTUP_SWEEP_DELAY_MS)
    requestPathsForAllContacts()
}
```

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: 9ef3453

…olutionManager

Detekt's NoRelaxedMocks rule forbids mockk(relaxed = true). Replace with
named mocks and explicit coEvery stubs for requestPathForContact and
retryResolution.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@sentry
Copy link
Copy Markdown
Contributor

sentry bot commented Mar 11, 2026

Codecov Report

❌ Patch coverage is 50.00000% with 8 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
python/reticulum_wrapper.py 50.00% 8 Missing ⚠️

📢 Thoughts on this report? Let us know!

torlando-tech and others added 2 commits March 11, 2026 17:14
- AnnounceStreamViewModel: delegate to identityResolutionManager.requestPathForContact()
  instead of raw reticulumProtocol.requestPath(), gaining the hasPath guard
- reticulum_wrapper.py: add RETICULUM_AVAILABLE guard to persist_transport_data()
- IdentityResolutionManager: move delay outside try block in requestPathsForAllContacts()
  so stagger always applies even when requestPath throws

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…odelTest

Detekt's NoRelaxedMocks rule. Use named identityResolutionManager mock
with coEvery stub for requestPathForContact.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@torlando-tech
Copy link
Copy Markdown
Owner Author

@greptileai

ContactsViewModel.retryIdentityResolution() already calls
resetContactForRetry() and checks the result before delegating to
IdentityResolutionManager.retryResolution(). The second call inside
retryResolution() was redundant. Remove it and document the caller
contract.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The one-shot startup sweep coroutine was launched without storing
its Job, so stop() couldn't cancel it. During Reticulum restarts,
the sweep would continue firing path requests against a shutting-down
service.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@torlando-tech
Copy link
Copy Markdown
Owner Author

@greptileai

Replace inline hasPath/requestPath logic with delegation to
IdentityResolutionManager.requestPathForContact(), consistent
with ContactsViewModel, ChatsViewModel, and AnnounceStreamViewModel.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@torlando-tech
Copy link
Copy Markdown
Owner Author

@greptileai

- ContactsViewModel.retryIdentityResolution: add Dispatchers.IO to
  avoid blocking AIDL call on main thread (potential ANR)
- RoutingManager.persistTransportData: inspect Python return dict
  so soft failures appear in logcat

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@torlando-tech
Copy link
Copy Markdown
Owner Author

@greptileai

Use identityInput.destinationHash consistently across all contact
addition branches, and add missing path request when a contact is
resolved immediately from the announce cache.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@torlando-tech
Copy link
Copy Markdown
Owner Author

@greptileai

@torlando-tech torlando-tech merged commit 8f30c12 into release/v0.9.x Mar 12, 2026
22 of 25 checks passed
@torlando-tech torlando-tech deleted the fix/path-persistence-proactive-resolution branch March 12, 2026 01:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant