Skip to content

feat(relays): bootstrap relay discovery instead of pinned default relays#610

Merged
grunch merged 8 commits into
mainfrom
refactor-relays-descovering
Jun 10, 2026
Merged

feat(relays): bootstrap relay discovery instead of pinned default relays#610
grunch merged 8 commits into
mainfrom
refactor-relays-descovering

Conversation

@Catrya

@Catrya Catrya commented Jun 9, 2026

Copy link
Copy Markdown
Member

fix #609
Replaces the hardcoded default relay that was always seeded into the user's relay list with a hidden bootstrap discovery layer. The visible relay list now contains only relays discovered from the Mostro's kind 10002 plus user-added relays. A small set of defensive bootstrap relays is used only to discover that list: at cold start, when every discovered relay is unreachable, or when switching Mostro instances.

Why

The default relay was persisted and always connected even when the active Mostro did not use it, leaving idle connections, diverging the visible list from what the Mostro endorses, and requiring UI special-casing.

How

  • Config.bootstrapRelays (relay.mostro.network, mostro-p2p.tech, nos.lol, relay.damus.io) live only at the NostrService connection level: never persisted, never shown.
  • Cold start: when settings has no relays, NostrService.init falls back to the bootstrap relays to fetch the kind 10002.
  • All-down recovery: RelayHealthMonitor (6s watchdog) re-engages bootstrap and re-subscribes (subscribeAll + kind 10002) when no relay is alive.
  • Instance switch: clears the list and ensures bootstrap connectivity for the new instance's kind 10002.

All engagement is additive (never disconnectFromRelays), so existing connections and subscriptions are never dropped. Retirement is logical only since dart_nostr has no per-relay disconnect.

How to test manually

Watch logs with flutter logs (or adb logcat | grep -i relay).

  1. Cold start: clear app data and launch. Logs show "falling back to bootstrap relays", then "Discovered N relays from Mostro 10002". Settings > Relays lists only the discovered relays (not nos.lol / relay.damus.io). The order book loads.
  2. Normal restart: close and reopen. The relay list appears immediately and logs do not show a bootstrap connection (discovered relays work).
  3. Instance switch: change the Mostro node. Logs show bootstrap connection then discovery of the new instance's relays; the list updates.
  4. Manual relay: add a valid wss relay. It persists and is not removed by a later kind 10002 sync.
  5. No event loss: create or take an order and exchange chat messages; events arrive normally.
  6. All-down recovery (optional): toggle WiFi off and on. Logs show "No live relays detected; engaging bootstrap relays" and events keep flowing after the network returns.

Summary by CodeRabbit

  • New Features

    • Added a relay health watchdog that auto-recovers connectivity and re-subscribes when relays are down.
    • Added defensive bootstrap discovery to find and activate relay lists when no user relays are available.
  • Improvements

    • Relay initialization now tolerates empty configured relay lists and uses bootstrap discovery as needed.
    • Default settings start with no pre-seeded relays.
  • Bug Fixes

    • Streamlined relay deactivation flow in the UI (removed special-case confirmation for default relays).
  • Documentation

    • Added architecture docs describing the bootstrap discovery layer and safety invariants.
  • Tests

    • New coverage for relay health recovery, bootstrap behavior, and relay connection invariants.

Catrya added 7 commits June 9, 2026 15:35
  - Add Config.bootstrapRelays with four defensive relays (relay.mostro.network, mostro-p2p.tech, nos.lol,
  relay.damus.io) used only to discover a Mostro's kind 10002 relay list at cold start or when no discovered relay is
  reachable
  - Add Config.relayDiscoveryTimeout (6s) as the grace period for discovered relays to connect before engaging
  bootstrap
  - Constants are inert in this commit: nothing references them yet, so there is no behavior change
  - Track relays that have sent data as alive in a _connectedRelays set, maintained from the relay connection
  callbacks, and expose connectedRelays and liveRelayCount for the upcoming relay-discovery watchdog
  - Fix onRelayConnectionDone log: it fires when the relay websocket closes (disconnect), not on successful connection
  as previously logged
  - Add ensureBootstrapConnectivity() that connects Config.bootstrapRelays additively (never disconnects existing
  relays), idempotent, reusing the proven updateSettings path
  - Add an init fail-safe: when settings has no relays (e.g. cold start before kind 10002 discovery), connect to
  Config.bootstrapRelays instead of throwing
  - Inert in steady state: ensureBootstrapConnectivity is not called yet and the init fail-safe only triggers with an
  empty relay list, which does not happen until defaults stop being seeded
  - Add RelayHealthMonitor: a periodic watchdog (relayDiscoveryTimeout interval) that, when no relay is alive (cold
  start where discovered relays never connect, or every discovered relay dropped at runtime), engages the bootstrap
  relays and re-establishes subscriptions
  - Recovery is additive (never disconnects) and reuses the proven subscribeAll() path, then re-subscribes to the
  Mostro kind 10002 list so relay discovery can recover
  - Guard against overlapping recoveries and cancel the timer on dispose; the check is biased toward acting since a
  false positive only costs an idle socket
  - Start the monitor from appInitializerProvider alongside the subscription manager
  - Builds the safety net before defaults stop being seeded; near-inert until then since discovered relays keep
  liveRelayCount above zero
  - Default-seed the settings relay list as empty: the active list now holds only relays discovered from the Mostro
  kind 10002 plus user-added relays
  - _loadRelays no longer injects a hardcoded default relay into state; it loads discovered Mostro relays from
  settings plus user relays
  - On Mostro instance switch, reset the relay list to empty and call ensureBootstrapConnectivity() so the new
  instance's kind 10002 can be fetched over bootstrap relays (the previous instance's relays may not carry it)
  - Cold start now connects to bootstrap relays via the NostrService init fail-safe, discovers the relay list, and
  persists it; normal launches connect only to discovered relays and never touch bootstrap, with the watchdog covering
  all-down recovery
  - Config.nostrRelays is left intact (still used by community fetching)
  - Log the discovered relay set from each Mostro kind 10002, plus which bootstrap relays are now idle (logical
  retirement; NostrService has no per-relay disconnect, so they are not actually closed)
  - Together with the bootstrap connection logs, the log now narrates the full cycle: connect bootstrap, discover,
  logically retire
  - Remove the isDefaultMostroRelay special-case and its blacklist default relay confirmation dialog from the relay
  toggle: bootstrap relays never appear in the visible list, so every listed relay is either a discovered Mostro relay
  or a user relay
  - Keep wouldLeaveNoActiveRelays as a generic last-active-relay guard
  - Add runnable NostrService tests for the new health surface (fresh service reports zero live relays, unmodifiable
  connectedRelays view, empty settings fallback) and bootstrap config invariants (wss-only relays, positive discovery
  timeout)
  - Document the Option C1 bootstrap discovery layer in RELAY_SYNC_IMPLEMENTATION.md: NostrService-level bootstrap,
  engagement points (cold start, watchdog, instance switch), safety invariants, liveness signal, and restart behavior
  - Note the bootstrap layer in CLAUDE.md's relay management section
  - Deep scenario tests are deferred: they require the relay mock infrastructure (currently disabled) which CI
  regenerates
  - Add RelayHealthMonitor watchdog tests: when no relay is alive it engages bootstrap connectivity, re-subscribes via
  subscribeAll, and re-establishes the kind 10002 subscription; it stays idle while a relay is alive or NostrService
  is not initialized, and skips relay-list re-subscription when no Mostro is configured
  - Add cold-start fail-safe tests: an empty relay list falls back to the bootstrap relays, a non-empty list is kept
  as-is
  - Expose RelayHealthMonitor.checkNow() (@VisibleForTesting) so the periodic timer need not be awaited
  - Extract NostrService.effectiveRelays() (pure, @VisibleForTesting) so the cold-start fallback is unit-testable
  without the dart_nostr singleton
  - Add a MockSubscriptionManagerSpy customMock to verify subscription calls without colliding with the existing
  hand-written mock
@coderabbitai

coderabbitai Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3027eece-766b-401c-a2ca-0fc2ca88d1b1

📥 Commits

Reviewing files that changed from the base of the PR and between 8b441b4 and 0a6b838.

📒 Files selected for processing (2)
  • lib/features/relays/relay_health_monitor.dart
  • test/features/relays/relay_health_monitor_test.dart
🚧 Files skipped from review as they are similar to previous changes (1)
  • lib/features/relays/relay_health_monitor.dart

Walkthrough

This PR adds a hidden bootstrap relay discovery layer: config constants and a discovery timeout, NostrService liveness tracking and effective-relays fallback, RelaysNotifier integration to avoid seeding bootstrap relays into state, a periodic RelayHealthMonitor watchdog to recover from total relay failure, UI simplification removing default-relay special-casing, tests, and documentation.

Changes

Bootstrap Relay Discovery System

Layer / File(s) Summary
Configuration & Relay Liveness Contract
lib/core/config.dart, lib/services/nostr_service.dart
Config.bootstrapRelays and Config.relayDiscoveryTimeout added; NostrService adds _connectedRelays, connectedRelays, liveRelayCount, and effectiveRelays() fallback logic.
NostrService Bootstrap Connectivity
lib/services/nostr_service.dart
Connection callbacks update _connectedRelays; init() uses effective relay list (bootstraps when none configured); ensureBootstrapConnectivity() merges bootstrap relays additively with current config.
RelaysNotifier Bootstrap Integration
lib/features/relays/relays_notifier.dart, lib/features/settings/settings_notifier.dart
Stops injecting bootstrap relays into visible state; logs logical retirement of bootstrap relays from Mostro's 10002 list; _cleanAllRelaysAndResync() clears persisted state and calls ensureBootstrapConnectivity() before resync; settings default to empty relays.
RelayHealthMonitor Watchdog Service
lib/features/relays/relay_health_monitor.dart, lib/shared/providers/app_init_provider.dart
New periodic RelayHealthMonitor checks live relays, triggers bootstrap recovery via ensureBootstrapConnectivity(), calls subscribeAll(), and conditionally re-subscribes to Mostro relay list; started from app init provider.
UI Simplification & Default Relay Removal
lib/features/relays/widgets/relay_selector.dart
Removed special-case confirmation for the hardcoded default Mostro relay; non-user relays now blacklist directly when toggled.
Test Coverage for Bootstrap Relays
test/services/nostr_service_test.dart, test/features/relays/relay_health_monitor_test.dart, test/mocks.dart
Tests for NostrService cold-start defaults, connectedRelays immutability, effectiveRelays fallback, Config invariants, and RelayHealthMonitor recovery/no-op scenarios; adds MockSubscriptionManagerSpy.
Architecture Documentation
CLAUDE.md, docs/architecture/RELAY_SYNC_IMPLEMENTATION.md
Documents the bootstrap discovery layer, activation triggers (cold start, all-down, instance switch), safety invariants (avoid per-relay disconnect, watchdog floor), liveness tracking, and retirement semantics.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • MostroP2P/mobile#284: Related relay sync documentation edits for the bootstrap discovery behavior.
  • MostroP2P/mobile#272: Related work on Mostro/NIP-65 kind-10002 relay-list sync and relays_notifier logic.
  • MostroP2P/mobile#247: Related changes in NostrService handling of empty relay lists and fallback behavior.

Suggested reviewers

  • grunch
  • AndreaDiazCorreia

Poem

🐇 I nibble hidden bootstrap trails at night,
Quiet relays that spark the distant light,
When all falls silent and the sockets sleep,
I hop, I probe, I wake the network deep,
Invisible paws that find the path aright.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely summarizes the main change: replacing hardcoded default relays with a bootstrap discovery mechanism.
Linked Issues check ✅ Passed All objectives from issue #609 are implemented: bootstrap relays hidden from UI/persistence, activated only on cold start/all-down/instance switch, used solely for discovery, logically retired when discovered relays are reachable, with watchdog recovery.
Out of Scope Changes check ✅ Passed All changes are directly aligned with bootstrap implementation objectives; no unrelated modifications detected in Config additions, NostrService relay tracking, RelayHealthMonitor watchdog, RelaysNotifier logic, or UI simplification.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch refactor-relays-descovering

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
lib/services/nostr_service.dart (1)

212-218: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Clear _connectedRelays on disconnect for consistent state.

While the onRelayConnectionDone callback should remove entries as each socket closes, explicitly clearing _connectedRelays here guarantees consistent state regardless of callback timing or partial failures.

Proposed fix
 Future<void> disconnectFromRelays() async {
   if (!_isInitialized) return;

   await _nostr.services.relays.disconnectFromRelays();
+  _connectedRelays.clear();
   _isInitialized = false;
   logger.i('Disconnected from all relays');
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/services/nostr_service.dart` around lines 212 - 218, The
disconnectFromRelays method should explicitly clear the internal relay-tracking
map to ensure consistent state; update Future<void> disconnectFromRelays() to
call _connectedRelays.clear() (or assign a new empty map) after awaiting
_nostr.services.relays.disconnectFromRelays() and before/alongside setting
_isInitialized = false so that _connectedRelays is empty regardless of
onRelayConnectionDone callback timing or partial failures.
lib/features/relays/relays_notifier.dart (1)

691-692: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Comment should be in English.

The comment on line 691 is in Spanish. As per coding guidelines, all code comments and documentation must be in English.

Proposed fix
-        // 🔥 RESET COMPLETO: Limpiar todos los relays y hacer sync fresco
+        // Full reset: clear all relays and perform fresh sync
         _cleanAllRelaysAndResync();
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/features/relays/relays_notifier.dart` around lines 691 - 692, The inline
comment preceding the call to _cleanAllRelaysAndResync() is written in Spanish;
update that comment to English to follow project guidelines (e.g., replace "🔥
RESET COMPLETO: Limpiar todos los relays y hacer sync fresco" with an English
equivalent like "🔥 FULL RESET: clear all relays and perform a fresh sync") and
ensure the surrounding comments in relays_notifier.dart are consistent in
English.

Source: Coding guidelines

🧹 Nitpick comments (1)
lib/services/nostr_service.dart (1)

224-237: 💤 Low value

Consider normalizing relay URLs before deduplication.

The Set-based merge uses string equality, so semantically identical URLs with different formats (e.g., trailing slash) would be treated as distinct. While Config.bootstrapRelays lacks trailing slashes, URLs from Mostro kind-10002 discovery might vary. Normalizing ensures robust deduplication.

Possible approach
+String _normalizeRelayUrl(String url) =>
+    url.endsWith('/') ? url.substring(0, url.length - 1) : url;
+
 Future<void> ensureBootstrapConnectivity() async {
-  final current = settings.relays;
-  final merged = {...current, ...Config.bootstrapRelays}.toList();
+  final normalizedCurrent = settings.relays.map(_normalizeRelayUrl).toSet();
+  final normalizedBootstrap = Config.bootstrapRelays.map(_normalizeRelayUrl).toSet();
+  final merged = {...normalizedCurrent, ...normalizedBootstrap}.toList();

-  if (ListEquality().equals(current, merged)) {
+  if (normalizedBootstrap.every(normalizedCurrent.contains)) {
     logger.d('Bootstrap relays already part of the active relay set');
     return;
   }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/services/nostr_service.dart` around lines 224 - 237, In
ensureBootstrapConnectivity, normalize relay URLs from settings.relays and
Config.bootstrapRelays before merging/dedup so semantically identical URLs
(e.g., with/without trailing slash or case differences) are treated equal;
implement a small canonicalization step (trim trailing slashes, lowercase
scheme/host, remove default ports) applied to each URL, use the normalized
values to build the merged set and then map back to the original preferred
representation when calling updateSettings(settings.copyWith(relays: merged));
ensure ListEquality() compares normalized lists rather than raw strings.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@lib/features/relays/relays_notifier.dart`:
- Around line 691-692: The inline comment preceding the call to
_cleanAllRelaysAndResync() is written in Spanish; update that comment to English
to follow project guidelines (e.g., replace "🔥 RESET COMPLETO: Limpiar todos
los relays y hacer sync fresco" with an English equivalent like "🔥 FULL RESET:
clear all relays and perform a fresh sync") and ensure the surrounding comments
in relays_notifier.dart are consistent in English.

In `@lib/services/nostr_service.dart`:
- Around line 212-218: The disconnectFromRelays method should explicitly clear
the internal relay-tracking map to ensure consistent state; update Future<void>
disconnectFromRelays() to call _connectedRelays.clear() (or assign a new empty
map) after awaiting _nostr.services.relays.disconnectFromRelays() and
before/alongside setting _isInitialized = false so that _connectedRelays is
empty regardless of onRelayConnectionDone callback timing or partial failures.

---

Nitpick comments:
In `@lib/services/nostr_service.dart`:
- Around line 224-237: In ensureBootstrapConnectivity, normalize relay URLs from
settings.relays and Config.bootstrapRelays before merging/dedup so semantically
identical URLs (e.g., with/without trailing slash or case differences) are
treated equal; implement a small canonicalization step (trim trailing slashes,
lowercase scheme/host, remove default ports) applied to each URL, use the
normalized values to build the merged set and then map back to the original
preferred representation when calling updateSettings(settings.copyWith(relays:
merged)); ensure ListEquality() compares normalized lists rather than raw
strings.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0342edfe-5329-48ac-9745-c91ce3d09137

📥 Commits

Reviewing files that changed from the base of the PR and between 215d022 and 8b441b4.

📒 Files selected for processing (12)
  • CLAUDE.md
  • docs/architecture/RELAY_SYNC_IMPLEMENTATION.md
  • lib/core/config.dart
  • lib/features/relays/relay_health_monitor.dart
  • lib/features/relays/relays_notifier.dart
  • lib/features/relays/widgets/relay_selector.dart
  • lib/features/settings/settings_notifier.dart
  • lib/services/nostr_service.dart
  • lib/shared/providers/app_init_provider.dart
  • test/features/relays/relay_health_monitor_test.dart
  • test/mocks.dart
  • test/services/nostr_service_test.dart

@mostronatorcoder mostronatorcoder Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the direction here, but I do see one real correctness issue that I would block on before merge.

Blocking issue:

  • RelayHealthMonitor treats liveRelayCount > 0 as globally healthy, but ensureBootstrapConnectivity() permanently merges Config.bootstrapRelays into settings.relays, and NostrService counts any relay that sends traffic as alive.
  • After the first bootstrap engagement, a live bootstrap relay can therefore keep liveRelayCount above zero even if every discovered relay for the current Mostro instance is still down.
  • At that point the watchdog stops recovering because the system looks healthy by the global metric, but the app may in practice be surviving only on bootstrap connectivity. That breaks the intended contract described in this PR, where bootstrap is only a discovery/recovery layer and discovered relays should remain the real operating set.

Why this matters:

  • The PR’s liveness policy is specifically about recovering the Mostro-discovered relay layer when it is absent or dead.
  • With the current metric, bootstrap traffic can mask a total failure of the discovered layer and create a false-green steady state.
  • In that state, the monitor no longer re-engages recovery even though the relay set the app is supposed to operate on has not actually recovered.

What I would want instead:

  • Either exclude bootstrap relays from the health signal used by RelayHealthMonitor, or track discovered-relay liveness separately and gate recovery on that.
  • In other words, the watchdog should not consider “only bootstrap is alive” equivalent to “relay discovery recovered successfully”.

Because this changes the core recovery contract, I think it is a real blocker rather than a follow-up nit.

  - A live bootstrap-only socket could keep the global count above zero and mask a fully dead discovered-relay layer,
  stopping recovery while the app survived on bootstrap alone
  - Gate recovery on whether any operating relay (settings.relays: discovered + user, never bootstrap) is alive,
  intersecting it with connectedRelays
  - Add a test for the only-bootstrap-alive scenario

@mostronatorcoder mostronatorcoder Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I re-reviewed the new head and the blocker I raised is resolved on this revision.

What changed and why it clears it:

  • RelayHealthMonitor no longer gates on the global liveRelayCount.
  • It now checks whether any relay in the operating set (settings.relays, i.e. discovered + user relays) is also present in connectedRelays.
  • That means a bootstrap-only live socket no longer masks a dead discovered layer, while a relay that is both bootstrap and discovered still counts correctly because it is part of the operating set.
  • The new watchdog test for the “only bootstrap alive” case is exactly the regression coverage I wanted for this failure mode.

I also re-checked the accepted tradeoff you described: if the discovered layer is dead and the Mostro does not publish to bootstrap, the watchdog will retry every relayDiscoveryTimeout. I agree that is an honest liveness-biased tradeoff rather than a correctness problem in this PR.

I do not see another blocker on the current revision.

Approved.

@Catrya Catrya requested a review from AndreaDiazCorreia June 10, 2026 00:29

@grunch grunch left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great job! utACK

@grunch grunch merged commit e776adc into main Jun 10, 2026
2 checks passed
@grunch grunch deleted the refactor-relays-descovering branch June 10, 2026 20:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add hidden bootstrap relays used only to discover Mostro's kind 10002, never shown to the user

2 participants