Skip to content

fix: path persistence + proactive path resolution#665

Merged
torlando-tech merged 9 commits intomainfrom
fix/path-persistence-proactive-resolution-main
Mar 12, 2026
Merged

fix: path persistence + proactive path resolution#665
torlando-tech merged 9 commits intomainfrom
fix/path-persistence-proactive-resolution-main

Conversation

@torlando-tech
Copy link
Copy Markdown
Owner

Summary

  • Fix path persistence: Save Reticulum's path table before shutdown() clears it, and add periodic persistence via IdentityResolutionManager to survive Android process kills
  • Fix stale BLE path cleanup: Remove overly aggressive 60-second age check that was deleting legitimately persisted paths at startup; only clear timestamp=0 entries (the actual Reticulum bug)
  • Add proactive path resolution: Request paths for all contacts at startup (with 2s stagger), on conversation open, on contact addition (manual, from announce, from save-to-contacts), and on retry

Changes

File Change
python/reticulum_wrapper.py Persist before shutdown clear, fix stale cleanup, add persist_transport_data()
IReticulumService.aidl Add persistTransportData()
RoutingManager.kt Add persistTransportData()
ReticulumServiceBinder.kt Add persistTransportData() override
ReticulumProtocol.kt Add persistTransportData() to interface
ServiceReticulumProtocol.kt Add persistTransportData() impl
MockReticulumProtocol.kt Add no-op persistTransportData()
IdentityResolutionManager.kt Periodic persist, startup sweep, requestPathForContact()
ContactsViewModel.kt Inject manager, fill path request TODOs
MessagingViewModel.kt Path request on conversation open
AnnounceStreamViewModel.kt Path request on contact-from-announce
ChatsViewModel.kt Inject manager, path request on save-to-contacts

Manual Testing Checklist

  • Path persistence on graceful shutdown: Install, chat with someone (path established), close app normally, restart, check logcat for IdentityResolutionMgr — startup sweep should show paths already exist (skipped via hasPath)
  • Path persistence on force-kill: Install, establish paths, force-kill app, restart — paths should survive via periodic persistence (every 15 min cycle)
  • Proactive path requests at startup: Force-kill app, clear path table, restart — should see path requests for contacts at 2s intervals in logcat
  • Path request on contact addition (hash-only): Add a contact by destination hash, check logcat for immediate path request
  • Path request on contact addition (full identity): Add a contact with full identity, check logcat for path request
  • Path request on conversation open: Open a chat with a contact, check logcat for path request if no path exists
  • Path request from announce stream: Add a contact from the announce stream, check logcat for path request
  • Path request from save-to-contacts: Save a conversation peer as a contact, check logcat for path request
  • Retry resolution: Tap retry on a pending contact, verify it triggers path resolution
  • No regression — existing messaging: Send and receive messages normally with established contacts
  • No regression — BLE paths: Connect via BLE, verify paths are not aggressively cleared at startup

🤖 Generated with Claude Code

Fix three compounding bugs that destroyed Reticulum's path table on every
restart, causing messages to fail until paths were rediscovered (~15 min):

1. shutdown() cleared destination_table before persisting — now calls
   RNS.Transport.persist_data() before clearing global state
2. _clear_stale_ble_paths() was too aggressive, removing all BLE paths
   >60s old at startup — now only clears timestamp=0 (the actual bug)
3. No crash resilience — added periodic persist_transport_data() every
   15 minutes via IdentityResolutionManager

Additionally adds proactive path requests as a safety net:
- Startup sweep requests paths for all contacts (2s stagger)
- Path requested on conversation open if missing
- Path requested when adding contacts (hash-only, full-identity,
  announce, save-from-conversation)
- Retry resolution delegates to IdentityResolutionManager

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 11, 2026

Greptile Summary

This PR addresses two distinct reliability problems with Reticulum path tables and adds proactive path resolution across the app.

Path persistence fix: RNS.Transport.persist_data() is now called before the shutdown teardown that clears in-memory state, closing a race condition where paths were lost on graceful exit. A new persist_transport_data() call is added to the IdentityResolutionManager periodic loop (~every CHECK_INTERVAL_MS) for crash/process-kill resilience, threaded through the full stack from Python wrapper → AIDL → RoutingManagerReticulumProtocol interface.

Stale BLE path fix: The 60-second age check that was aggressively evicting legitimately persisted BLE paths at startup has been removed. Only entries with timestamp == 0 (the actual Reticulum bug) are now cleared.

Proactive path resolution: IdentityResolutionManager gains a 5-second-delayed startup sweep that requests paths for all ACTIVE/PENDING_IDENTITY contacts (with a 2-second stagger to avoid network flooding). Individual path requests are also triggered on contact addition (all three branches in ContactsViewModel), conversation open (MessagingViewModel), add-from-announce (AnnounceStreamViewModel), and save-to-contacts (ChatsViewModel), all delegating to the central requestPathForContact helper which guards with hasPath to avoid redundant requests.

Key changes to note:

  • retryResolution in IdentityResolutionManager had its resetContactForRetry call moved to the caller (ContactsViewModel); the contract is documented but not enforced by the API — a future call site could silently break the retry flow.
  • MessagingViewModel injects IdentityResolutionManager using a fully-qualified type name rather than an import, inconsistent with all other ViewModels in this PR.
  • All tests are updated to supply the new IdentityResolutionManager mock with appropriate stubs.

Confidence Score: 4/5

  • PR is safe to merge; the core persistence and path-resolution logic is sound, previous review issues have been addressed, and all tests are updated.
  • The path persistence logic is correct (persist-before-clear in shutdown, periodic persist during resolution loop). The startup sweep correctly stores its job for cancellation and places the stagger delay outside the inner try/catch. The hasPath guard prevents redundant network requests. The two remaining concerns — the fragile retryResolution API contract and the inline fully-qualified class name in MessagingViewModel — are minor and do not affect runtime behaviour.
  • IdentityResolutionManager.kt — the implicit precondition on retryResolution (caller must call resetContactForRetry first) should be watched as the codebase grows new retry call sites.

Important Files Changed

Filename Overview
python/reticulum_wrapper.py Adds persist_transport_data() method and calls RNS.Transport.persist_data() before clearing state in shutdown(); removes the overly-aggressive 60-second stale BLE path check (now only clears timestamp=0 entries). All exception paths are caught and reported.
app/src/main/java/com/lxmf/messenger/service/IdentityResolutionManager.kt Adds startup sweep job (correctly stored for cancellation), periodic persistence at the end of checkAndResolve, and requestPathForContact / requestPathsForAllContacts helpers. retryResolution now relies on callers to invoke resetContactForRetry first — contract is documented but not enforced.
app/src/main/java/com/lxmf/messenger/service/manager/RoutingManager.kt Adds persistTransportData() that inspects the Python return dict for soft failures and logs a warning — correctly handles both Python-level exceptions and soft {"success": false} responses.
app/src/main/java/com/lxmf/messenger/viewmodel/MessagingViewModel.kt Injects IdentityResolutionManager and delegates requestPathForContact on conversation open. Uses fully-qualified class name instead of an import, unlike every other ViewModel in this PR.
app/src/main/java/com/lxmf/messenger/viewmodel/ContactsViewModel.kt Fills three path-request TODOs (FullIdentity, ResolvedImmediately, AddedAsPending branches) and wires retry to identityResolutionManager.retryResolution. All branches use identityInput.destinationHash consistently.
app/src/main/java/com/lxmf/messenger/viewmodel/ChatsViewModel.kt Injects IdentityResolutionManager and requests a path after save-to-contacts succeeds. Minor cosmetic change collapses getContactLocation onto one line.
app/src/main/java/com/lxmf/messenger/viewmodel/AnnounceStreamViewModel.kt Injects IdentityResolutionManager and triggers requestPathForContact after a contact is added from the announce stream. Change is minimal and correct.
reticulum/src/main/java/com/lxmf/messenger/reticulum/protocol/ReticulumProtocol.kt Adds suspend fun persistTransportData() to the interface; implemented by ServiceReticulumProtocol and MockReticulumProtocol, and surfaced through the AIDL layer.
app/src/main/aidl/com/lxmf/messenger/IReticulumService.aidl Adds void persistTransportData() to the AIDL interface; documented and consistently implemented across ReticulumServiceBinder and ServiceReticulumProtocol.

Sequence Diagram

sequenceDiagram
    participant IRM as IdentityResolutionManager
    participant VM as ViewModel (Contacts/Chats/Messaging/Announce)
    participant RP as ReticulumProtocol
    participant RM as RoutingManager
    participant PY as reticulum_wrapper.py
    participant RNS as RNS.Transport

    Note over IRM: start() called
    IRM->>IRM: Launch resolutionJob (periodic)
    IRM->>IRM: Launch startupSweepJob (5s delay)

    Note over IRM: startupSweepJob fires after 5s
    IRM->>IRM: requestPathsForAllContacts()
    loop Each ACTIVE/PENDING contact (2s stagger)
        IRM->>RP: hasPath(destHashBytes)
        RP-->>IRM: true → continue (skip delay)
        IRM->>RP: requestPath(destHashBytes)
        Note over IRM: delay 2s
    end

    Note over IRM: checkAndResolve() fires periodically
    IRM->>IRM: resolve pending contacts
    IRM->>RP: persistTransportData()
    RP->>RM: persistTransportData()
    RM->>PY: persist_transport_data()
    PY->>RNS: Transport.persist_data()
    RNS-->>PY: ok / exception
    PY-->>RM: {"success": true/false}
    RM-->>RP: logs warning if false

    Note over VM: Contact add / conversation open / save-to-contacts / from-announce
    VM->>IRM: requestPathForContact(destinationHash)
    IRM->>RP: hasPath(destHashBytes)
    alt No existing path
        IRM->>RP: requestPath(destHashBytes)
    end

    Note over PY: shutdown() called
    PY->>RNS: Transport.persist_data()
    Note over PY: Persists BEFORE clearing state
    PY->>RNS: Clear singleton state
Loading
Prompt To Fix All With AI
This is a comment left during a code review.
Path: app/src/main/java/com/lxmf/messenger/viewmodel/MessagingViewModel.kt
Line: 92

Comment:
**Missing import — fully-qualified name used inline**

`IdentityResolutionManager` is referenced by its full package path here, while every other ViewModel added in this PR (`ContactsViewModel`, `ChatsViewModel`, `AnnounceStreamViewModel`) imports the class at the top of the file. Using the fully-qualified name inline compiles correctly but breaks visual consistency and makes the constructor parameter list harder to read.

```suggestion
        private val identityResolutionManager: IdentityResolutionManager,
```

Add the corresponding import alongside the other service imports already in the file:
```
import com.lxmf.messenger.service.IdentityResolutionManager
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: app/src/main/java/com/lxmf/messenger/service/IdentityResolutionManager.kt
Line: 230-233

Comment:
**Implicit precondition not enforced by API**

`retryResolution` now requires the caller to have already called `contactRepository.resetContactForRetry()` before invoking it (as noted in the Javadoc comment). `ContactsViewModel` currently does this correctly, but the contract is enforced only by documentation. Any future call site that invokes `retryResolution` directly — e.g., from a background job, a notification action, or a future ViewModel — will silently skip the status reset, leaving the contact stuck in `UNRESOLVED` while network path requests go out.

Consider either:
- Reinstating the `resetContactForRetry` call inside `retryResolution` itself (one source of truth), or
- Renaming the method to `requestPathForRetry` to signal it is only the network-layer half of a retry, and keeping `ContactsViewModel.retryIdentityResolution` as the single authoritative entry point for the full retry flow.

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: 8a0c250

…olutionManager

Detekt's NoRelaxedMocks rule forbids mockk(relaxed = true). Replace with
named mocks and explicit coEvery stubs for requestPathForContact and
retryResolution.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@sentry
Copy link
Copy Markdown
Contributor

sentry bot commented Mar 11, 2026

Codecov Report

❌ Patch coverage is 50.00000% with 8 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
python/reticulum_wrapper.py 50.00% 8 Missing ⚠️

📢 Thoughts on this report? Let us know!

- AnnounceStreamViewModel: delegate to identityResolutionManager.requestPathForContact()
  instead of raw reticulumProtocol.requestPath(), gaining the hasPath guard
- reticulum_wrapper.py: add RETICULUM_AVAILABLE guard to persist_transport_data()
- IdentityResolutionManager: move delay outside try block in requestPathsForAllContacts()
  so stagger always applies even when requestPath throws

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Comment on lines +183 to +186
val contacts =
contactRepository.getContactsByStatus(
listOf(ContactStatus.ACTIVE, ContactStatus.PENDING_IDENTITY),
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: The code calls getContactsByStatus(), which fetches contacts for all identities, instead of getContactsByStatusForActiveIdentity() to scope requests to the active user.
Severity: MEDIUM

Suggested Fix

In IdentityResolutionManager.kt, replace the call to contactRepository.getContactsByStatus() with contactRepository.getContactsByStatusForActiveIdentity(). This will correctly limit path resolution requests to contacts belonging only to the active identity.

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location:
app/src/main/java/com/lxmf/messenger/service/IdentityResolutionManager.kt#L183-L186

Potential issue: At startup, the `IdentityResolutionManager` calls
`contactRepository.getContactsByStatus()` to proactively request network paths. This
function returns contacts from all identities on the device, not just the currently
active one. This results in unnecessary network requests for contacts belonging to
inactive identities. While this doesn't cause a crash, it's a logical error because the
manager should operate only within the context of the active identity. A more
appropriate function, `getContactsByStatusForActiveIdentity()`, is available but is not
being used.

…odelTest

Detekt's NoRelaxedMocks rule. Use named identityResolutionManager mock
with coEvery stub for requestPathForContact.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@torlando-tech
Copy link
Copy Markdown
Owner Author

@greptileai

ContactsViewModel.retryIdentityResolution() already calls
resetContactForRetry() and checks the result before delegating to
IdentityResolutionManager.retryResolution(). The second call inside
retryResolution() was redundant. Remove it and document the caller
contract.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The one-shot startup sweep coroutine was launched without storing
its Job, so stop() couldn't cancel it. During Reticulum restarts,
the sweep would continue firing path requests against a shutting-down
service.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@torlando-tech
Copy link
Copy Markdown
Owner Author

@greptileai

Replace inline hasPath/requestPath logic with delegation to
IdentityResolutionManager.requestPathForContact(), consistent
with ContactsViewModel, ChatsViewModel, and AnnounceStreamViewModel.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@torlando-tech
Copy link
Copy Markdown
Owner Author

@greptileai

- ContactsViewModel.retryIdentityResolution: add Dispatchers.IO to
  avoid blocking AIDL call on main thread (potential ANR)
- RoutingManager.persistTransportData: inspect Python return dict
  so soft failures appear in logcat

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@torlando-tech
Copy link
Copy Markdown
Owner Author

@greptileai

Use identityInput.destinationHash consistently across all contact
addition branches, and add missing path request when a contact is
resolved immediately from the announce cache.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@torlando-tech
Copy link
Copy Markdown
Owner Author

@greptileai

Comment on lines +230 to 233
* Note: The caller (ContactsViewModel.retryIdentityResolution) is responsible
* for calling contactRepository.resetContactForRetry() before invoking this.
*/
suspend fun retryResolution(destinationHash: String) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implicit precondition not enforced by API

retryResolution now requires the caller to have already called contactRepository.resetContactForRetry() before invoking it (as noted in the Javadoc comment). ContactsViewModel currently does this correctly, but the contract is enforced only by documentation. Any future call site that invokes retryResolution directly — e.g., from a background job, a notification action, or a future ViewModel — will silently skip the status reset, leaving the contact stuck in UNRESOLVED while network path requests go out.

Consider either:

  • Reinstating the resetContactForRetry call inside retryResolution itself (one source of truth), or
  • Renaming the method to requestPathForRetry to signal it is only the network-layer half of a retry, and keeping ContactsViewModel.retryIdentityResolution as the single authoritative entry point for the full retry flow.
Prompt To Fix With AI
This is a comment left during a code review.
Path: app/src/main/java/com/lxmf/messenger/service/IdentityResolutionManager.kt
Line: 230-233

Comment:
**Implicit precondition not enforced by API**

`retryResolution` now requires the caller to have already called `contactRepository.resetContactForRetry()` before invoking it (as noted in the Javadoc comment). `ContactsViewModel` currently does this correctly, but the contract is enforced only by documentation. Any future call site that invokes `retryResolution` directly — e.g., from a background job, a notification action, or a future ViewModel — will silently skip the status reset, leaving the contact stuck in `UNRESOLVED` while network path requests go out.

Consider either:
- Reinstating the `resetContactForRetry` call inside `retryResolution` itself (one source of truth), or
- Renaming the method to `requestPathForRetry` to signal it is only the network-layer half of a retry, and keeping `ContactsViewModel.retryIdentityResolution` as the single authoritative entry point for the full retry flow.

How can I resolve this? If you propose a fix, please make it concise.

@torlando-tech torlando-tech merged commit abfc2ec into main Mar 12, 2026
14 of 15 checks passed
@torlando-tech torlando-tech deleted the fix/path-persistence-proactive-resolution-main branch March 12, 2026 01:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant