
Defer bulk peer identity restore after prioritizing contacts#604

Merged
torlando-tech merged 4 commits into torlando-tech:main from MatthieuTexier:fix/startup-restore-contacts-first
Mar 8, 2026

Conversation

@MatthieuTexier
Contributor

Summary

  • restore active user contacts before the generic peer identity cache during startup
  • defer bulk peer identity restoration by 5 seconds to reduce launch contention
  • skip bulk identities already restored from prioritized contacts

Problem

Cold start regressed badly on devices with a large peer cache. Investigation showed startup was eagerly restoring ~2500 peer identities before the app became usable, which kept the splash screen visible for far too long.

Solution

This keeps the existing restore pipeline but changes ordering and timing:

  • fetch active contacts with public keys from the contacts table
  • compute their identity hashes and restore them first
  • delay the generic batched peer restore
  • filter out identities already restored via contacts

This preserves quick access to the names that matter most to the user while moving the expensive network-wide cache restore out of the critical startup path.
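
Under the assumption that the restore pipeline works roughly as the bullets describe, the reordered flow can be sketched as follows. Function and parameter names (`restorePeerIdentitiesPrioritized`, `fetchBatch`, `restore`, `pauseBeforeBulk`) are illustrative, not the actual code, which lives in `ColumbaApplication` and runs inside a coroutine:

```kotlin
// Hypothetical sketch of the reordered startup restore described above.
// The real implementation uses a coroutine delay(5000) on Dispatchers.IO;
// the pause is injected here so the sketch stays self-contained.
fun restorePeerIdentitiesPrioritized(
    contactIdentities: List<Pair<String, ByteArray>>,            // identityHash to publicKey
    fetchBatch: (limit: Int, offset: Int) -> List<Pair<String, ByteArray>>,
    restore: (List<Pair<String, ByteArray>>) -> Int,             // returns count restored
    pauseBeforeBulk: () -> Unit = { Thread.sleep(5_000) },       // stands in for delay(5000)
): Int {
    // Step 1: restore saved contacts first so their names resolve quickly.
    var totalRestored = restore(contactIdentities)
    val alreadyRestored = contactIdentities.mapTo(HashSet()) { it.first }

    // Step 2: keep the expensive network-wide restore out of the critical path.
    pauseBeforeBulk()

    // Step 3: batched bulk restore, skipping hashes already restored via contacts.
    val batchSize = 500
    var offset = 0
    while (true) {
        val raw = fetchBatch(batchSize, offset)
        if (raw.isEmpty()) break
        val filtered = raw.filterNot { it.first in alreadyRestored }
        if (filtered.isNotEmpty()) totalRestored += restore(filtered)
        offset += batchSize
    }
    return totalRestored
}
```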

Validation

  • ./gradlew :app:compileNoSentryDebugKotlin
  • ./gradlew :app:testNoSentryDebugUnitTest
  • measured startup on One+: splash dropped from roughly 16–18 s before to about 5 s

Notes

The expected functional tradeoff is that some non-saved peers may temporarily display as raw hashes until the deferred bulk restore completes, while saved contacts should resolve first.
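
To illustrate the tradeoff, a name-resolution fallback along these lines would produce the described behavior; `displayNameFor` and `restoredNames` are hypothetical names, not the app's actual API:

```kotlin
// Illustration of the tradeoff noted above: until the deferred bulk restore
// completes, a peer with no restored identity falls back to its raw hash
// (truncated here for display), while saved contacts resolve immediately.
fun displayNameFor(peerHash: String, restoredNames: Map<String, String>): String =
    restoredNames[peerHash] ?: (peerHash.take(8) + "…")
```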

Restore active user contacts first during startup so important names are
available quickly, then delay bulk peer identity restoration by 5 seconds.
This reduces startup contention when the peer identity cache is large while
preserving the existing restore pipeline.

The bulk restore now skips identity hashes already restored from contacts
and keeps the existing batched pagination for generic peers.
@greptile-apps
Contributor

greptile-apps bot commented Mar 6, 2026

Greptile Summary

This PR improves cold-start performance by prioritizing restoration of saved-contact identities before the full peer-identity bulk restore, and deferring the bulk restore by 5 seconds to reduce launch contention. The approach is sound and the measured improvement (~16-18 s → ~5 s splash) validates the change. The refactored restorePeerIdentitiesInBatches loop is cleaner than the original and the alreadyRestoredIdentityHashes deduplication is correct (both sides use the same Reticulum identity hash format: first 16 bytes of SHA-256 of public key).
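
The hash format the review mentions can be sketched as below. `computeIdentityHash` is named in the review; this particular body (and the hex encoding) is an assumption about how "first 16 bytes of SHA-256 of public key" is realized:

```kotlin
import java.security.MessageDigest

// Sketch of the identity-hash derivation the review describes: the first
// 16 bytes of SHA-256 over the public key, hex-encoded. Both the contact
// path and the peer_identities table must agree on this format for the
// deduplication set to work.
fun computeIdentityHash(publicKey: ByteArray): String =
    MessageDigest.getInstance("SHA-256")
        .digest(publicKey)
        .take(16)
        .joinToString("") { "%02x".format(it) }
```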

Key findings:

  • restoreContactIdentities bypasses the 500-item OOM batch limit — all contacts are sent to restorePeerIdentities in a single IPC call, which is the same pattern the bulk restore deliberately avoided. Users with a large contacts list could see the same memory spike on the contact restore step.
  • totalRestored under-counts when all batch entries are filtered — the final log message will be inaccurate any time an entire raw batch is skipped due to deduplication, making it harder to verify restoration completeness during debugging.
  • The new getRestorableContactsForIdentity DAO query and getRestorableContactIdentitiesForActiveIdentity repository method are well-structured; the priority ordering (relay → pinned → lastInteraction → added) matches the PR intent.
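
A minimal way to address the first finding, assuming the contact restore step can be reshaped freely (the function shape here is illustrative), is to reuse the same 500-item chunking the bulk path already enforces:

```kotlin
// Hypothetical fix for the unbatched contact restore flagged above: send
// contacts through 500-item chunks so a large imported address book cannot
// trigger the memory spike the bulk path's OOM guard was added to prevent.
fun restoreContactIdentitiesChunked(
    contacts: List<Pair<String, ByteArray>>,
    batchSize: Int = 500,
    restore: (List<Pair<String, ByteArray>>) -> Int,  // one IPC call per chunk
): Int = contacts.chunked(batchSize).sumOf { restore(it) }
```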

Confidence Score: 3/5

  • The PR improves startup performance but introduces a potential OOM regression for users with large contact lists; functional correctness is otherwise preserved.
  • The core sequencing and deduplication logic is correct. However, restoreContactIdentities sends all contacts in a single unbatched IPC call — directly at odds with the 500-item OOM guard that already exists in restorePeerIdentitiesInBatches. For typical users (< 100 contacts) this is fine, but for power users or imported address books the risk is real. The logging inaccuracy when batches are fully filtered is a lesser concern but still worth fixing before merge.
  • app/src/main/java/com/lxmf/messenger/ColumbaApplication.kt — specifically the restoreContactIdentities function.

Important Files Changed

Filename Overview
app/src/main/java/com/lxmf/messenger/ColumbaApplication.kt Core startup orchestration change: contact identities are now restored first, then a 5-second delay is applied before the bulk batched restore. The refactored restorePeerIdentitiesInBatches is correct and cleaner than the original, but restoreContactIdentities sends all contacts in a single unbatched IPC call, bypassing the 500-item OOM guard that restorePeerIdentitiesInBatches intentionally enforces.
data/src/main/java/com/lxmf/messenger/data/db/dao/ContactDao.kt New getRestorableContactsForIdentity query filters active contacts with a non-null public key for a given identity, ordered by relay/pinned/interaction priority. No LIMIT clause, but contact lists are typically small. Filtering and ordering logic look correct.
data/src/main/java/com/lxmf/messenger/data/repository/ContactRepository.kt New getRestorableContactIdentitiesForActiveIdentity correctly computes identity hashes via computeIdentityHash, which matches the peerHash format in peer_identities table. No issues found in this file.

Sequence Diagram

sequenceDiagram
    participant App as ColumbaApplication
    participant CR as ContactRepository
    participant SP as ServiceReticulumProtocol
    participant Conv as ConversationRepository

    App->>App: restorePeerIdentities()
    App->>App: launch(Dispatchers.IO)

    Note over App: Step 1 – Priority contacts
    App->>CR: getRestorableContactIdentitiesForActiveIdentity()
    CR-->>App: List<Pair<identityHash, publicKey>>
    App->>SP: restorePeerIdentities(all contacts in ONE call)
    SP-->>App: Result<Int> (count restored)
    App->>App: build restoredContactIdentityHashes set (if success)

    Note over App: 5-second delay
    App->>App: delay(5000 ms)

    Note over App: Step 2 – Bulk batched restore
    loop Each batch of 500 raw entries
        App->>Conv: getPeerIdentitiesBatch(500, offset)
        Conv-->>App: List<Pair<peerHash, publicKey>>
        App->>App: filter out alreadyRestoredIdentityHashes
        alt batch not empty after filter
            App->>SP: restorePeerIdentities(filtered batch)
            SP-->>App: Result<Int>
        end
        App->>App: offset += 500
    end

Last reviewed commit: ac0f9aa

@sentry
Contributor

sentry bot commented Mar 6, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


@MatthieuTexier
Contributor Author

@greptile

@MatthieuTexier
Contributor Author

The test failure is not reproducible locally; it passes for me.

@serialrf433
Contributor

The approach is sound and the measured improvement (~16-18 s → ~5 s splash) validates the change.

Thank you so much for this. I hit this issue at random so many times, and the application logs were not helpful for writing a good issue report here.

WHERE identityHash = :identityHash
AND status = :activeStatus
AND publicKey IS NOT NULL
ORDER BY isMyRelay DESC, isPinned DESC, lastInteractionTimestamp DESC, addedTimestamp DESC
Owner


love this order by :)

@torlando-tech torlando-tech merged commit bb81dd0 into torlando-tech:main Mar 8, 2026
22 of 24 checks passed