feat(relays): bootstrap relay discovery instead of pinned default relays#610
Conversation
- Add Config.bootstrapRelays with four defensive relays (relay.mostro.network, mostro-p2p.tech, nos.lol, relay.damus.io) used only to discover a Mostro's kind 10002 relay list at cold start or when no discovered relay is reachable - Add Config.relayDiscoveryTimeout (6s) as the grace period for discovered relays to connect before engaging bootstrap - Constants are inert in this commit: nothing references them yet, so there is no behavior change
- Track relays that have sent data as alive in a _connectedRelays set, maintained from the relay connection callbacks, and expose connectedRelays and liveRelayCount for the upcoming relay-discovery watchdog - Fix onRelayConnectionDone log: it fires when the relay websocket closes (disconnect), not on successful connection as previously logged - Add ensureBootstrapConnectivity() that connects Config.bootstrapRelays additively (never disconnects existing relays), idempotent, reusing the proven updateSettings path - Add an init fail-safe: when settings has no relays (e.g. cold start before kind 10002 discovery), connect to Config.bootstrapRelays instead of throwing - Inert in steady state: ensureBootstrapConnectivity is not called yet and the init fail-safe only triggers with an empty relay list, which does not happen until defaults stop being seeded
- Add RelayHealthMonitor: a periodic watchdog (relayDiscoveryTimeout interval) that, when no relay is alive (cold start where discovered relays never connect, or every discovered relay dropped at runtime), engages the bootstrap relays and re-establishes subscriptions - Recovery is additive (never disconnects) and reuses the proven subscribeAll() path, then re-subscribes to the Mostro kind 10002 list so relay discovery can recover - Guard against overlapping recoveries and cancel the timer on dispose; the check is biased toward acting since a false positive only costs an idle socket - Start the monitor from appInitializerProvider alongside the subscription manager - Builds the safety net before defaults stop being seeded; near-inert until then since discovered relays keep liveRelayCount above zero
- Default-seed the settings relay list as empty: the active list now holds only relays discovered from the Mostro kind 10002 plus user-added relays - _loadRelays no longer injects a hardcoded default relay into state; it loads discovered Mostro relays from settings plus user relays - On Mostro instance switch, reset the relay list to empty and call ensureBootstrapConnectivity() so the new instance's kind 10002 can be fetched over bootstrap relays (the previous instance's relays may not carry it) - Cold start now connects to bootstrap relays via the NostrService init fail-safe, discovers the relay list, and persists it; normal launches connect only to discovered relays and never touch bootstrap, with the watchdog covering all-down recovery - Config.nostrRelays is left intact (still used by community fetching)
- Log the discovered relay set from each Mostro kind 10002, plus which bootstrap relays are now idle (logical retirement; NostrService has no per-relay disconnect, so they are not actually closed) - Together with the bootstrap connection logs, the log now narrates the full cycle: connect bootstrap, discover, logically retire - Remove the isDefaultMostroRelay special-case and its blacklist default relay confirmation dialog from the relay toggle: bootstrap relays never appear in the visible list, so every listed relay is either a discovered Mostro relay or a user relay - Keep wouldLeaveNoActiveRelays as a generic last-active-relay guard
- Add runnable NostrService tests for the new health surface (fresh service reports zero live relays, unmodifiable connectedRelays view, empty settings fallback) and bootstrap config invariants (wss-only relays, positive discovery timeout) - Document the Option C1 bootstrap discovery layer in RELAY_SYNC_IMPLEMENTATION.md: NostrService-level bootstrap, engagement points (cold start, watchdog, instance switch), safety invariants, liveness signal, and restart behavior - Note the bootstrap layer in CLAUDE.md's relay management section - Deep scenario tests are deferred: they require the relay mock infrastructure (currently disabled) which CI regenerates
- Add RelayHealthMonitor watchdog tests: when no relay is alive it engages bootstrap connectivity, re-subscribes via subscribeAll, and re-establishes the kind 10002 subscription; it stays idle while a relay is alive or NostrService is not initialized, and skips relay-list re-subscription when no Mostro is configured - Add cold-start fail-safe tests: an empty relay list falls back to the bootstrap relays, a non-empty list is kept as-is - Expose RelayHealthMonitor.checkNow() (@VisibleForTesting) so the periodic timer need not be awaited - Extract NostrService.effectiveRelays() (pure, @VisibleForTesting) so the cold-start fallback is unit-testable without the dart_nostr singleton - Add a MockSubscriptionManagerSpy customMock to verify subscription calls without colliding with the existing hand-written mock
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (1)
WalkthroughThis PR adds a hidden bootstrap relay discovery layer: config constants and a discovery timeout, NostrService liveness tracking and effective-relays fallback, RelaysNotifier integration to avoid seeding bootstrap relays into state, a periodic RelayHealthMonitor watchdog to recover from total relay failure, UI simplification removing default-relay special-casing, tests, and documentation. ChangesBootstrap Relay Discovery System
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
lib/services/nostr_service.dart (1)
212-218:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winClear
_connectedRelayson disconnect for consistent state.While the
onRelayConnectionDonecallback should remove entries as each socket closes, explicitly clearing_connectedRelayshere guarantees consistent state regardless of callback timing or partial failures.Proposed fix
Future<void> disconnectFromRelays() async { if (!_isInitialized) return; await _nostr.services.relays.disconnectFromRelays(); + _connectedRelays.clear(); _isInitialized = false; logger.i('Disconnected from all relays'); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lib/services/nostr_service.dart` around lines 212 - 218, The disconnectFromRelays method should explicitly clear the internal relay-tracking map to ensure consistent state; update Future<void> disconnectFromRelays() to call _connectedRelays.clear() (or assign a new empty map) after awaiting _nostr.services.relays.disconnectFromRelays() and before/alongside setting _isInitialized = false so that _connectedRelays is empty regardless of onRelayConnectionDone callback timing or partial failures.lib/features/relays/relays_notifier.dart (1)
691-692:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winComment should be in English.
The comment on line 691 is in Spanish. As per coding guidelines, all code comments and documentation must be in English.
Proposed fix
- // 🔥 RESET COMPLETO: Limpiar todos los relays y hacer sync fresco + // Full reset: clear all relays and perform fresh sync _cleanAllRelaysAndResync();🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lib/features/relays/relays_notifier.dart` around lines 691 - 692, The inline comment preceding the call to _cleanAllRelaysAndResync() is written in Spanish; update that comment to English to follow project guidelines (e.g., replace "🔥 RESET COMPLETO: Limpiar todos los relays y hacer sync fresco" with an English equivalent like "🔥 FULL RESET: clear all relays and perform a fresh sync") and ensure the surrounding comments in relays_notifier.dart are consistent in English.Source: Coding guidelines
🧹 Nitpick comments (1)
lib/services/nostr_service.dart (1)
224-237: 💤 Low valueConsider normalizing relay URLs before deduplication.
The Set-based merge uses string equality, so semantically identical URLs with different formats (e.g., trailing slash) would be treated as distinct. While
Config.bootstrapRelayslacks trailing slashes, URLs from Mostro kind-10002 discovery might vary. Normalizing ensures robust deduplication.Possible approach
+String _normalizeRelayUrl(String url) => + url.endsWith('/') ? url.substring(0, url.length - 1) : url; + Future<void> ensureBootstrapConnectivity() async { - final current = settings.relays; - final merged = {...current, ...Config.bootstrapRelays}.toList(); + final normalizedCurrent = settings.relays.map(_normalizeRelayUrl).toSet(); + final normalizedBootstrap = Config.bootstrapRelays.map(_normalizeRelayUrl).toSet(); + final merged = {...normalizedCurrent, ...normalizedBootstrap}.toList(); - if (ListEquality().equals(current, merged)) { + if (normalizedBootstrap.every(normalizedCurrent.contains)) { logger.d('Bootstrap relays already part of the active relay set'); return; }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lib/services/nostr_service.dart` around lines 224 - 237, In ensureBootstrapConnectivity, normalize relay URLs from settings.relays and Config.bootstrapRelays before merging/dedup so semantically identical URLs (e.g., with/without trailing slash or case differences) are treated equal; implement a small canonicalization step (trim trailing slashes, lowercase scheme/host, remove default ports) applied to each URL, use the normalized values to build the merged set and then map back to the original preferred representation when calling updateSettings(settings.copyWith(relays: merged)); ensure ListEquality() compares normalized lists rather than raw strings.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@lib/features/relays/relays_notifier.dart`:
- Around line 691-692: The inline comment preceding the call to
_cleanAllRelaysAndResync() is written in Spanish; update that comment to English
to follow project guidelines (e.g., replace "🔥 RESET COMPLETO: Limpiar todos
los relays y hacer sync fresco" with an English equivalent like "🔥 FULL RESET:
clear all relays and perform a fresh sync") and ensure the surrounding comments
in relays_notifier.dart are consistent in English.
In `@lib/services/nostr_service.dart`:
- Around line 212-218: The disconnectFromRelays method should explicitly clear
the internal relay-tracking map to ensure consistent state; update Future<void>
disconnectFromRelays() to call _connectedRelays.clear() (or assign a new empty
map) after awaiting _nostr.services.relays.disconnectFromRelays() and
before/alongside setting _isInitialized = false so that _connectedRelays is
empty regardless of onRelayConnectionDone callback timing or partial failures.
---
Nitpick comments:
In `@lib/services/nostr_service.dart`:
- Around line 224-237: In ensureBootstrapConnectivity, normalize relay URLs from
settings.relays and Config.bootstrapRelays before merging/dedup so semantically
identical URLs (e.g., with/without trailing slash or case differences) are
treated equal; implement a small canonicalization step (trim trailing slashes,
lowercase scheme/host, remove default ports) applied to each URL, use the
normalized values to build the merged set and then map back to the original
preferred representation when calling updateSettings(settings.copyWith(relays:
merged)); ensure ListEquality() compares normalized lists rather than raw
strings.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0342edfe-5329-48ac-9745-c91ce3d09137
📒 Files selected for processing (12)
CLAUDE.mddocs/architecture/RELAY_SYNC_IMPLEMENTATION.mdlib/core/config.dartlib/features/relays/relay_health_monitor.dartlib/features/relays/relays_notifier.dartlib/features/relays/widgets/relay_selector.dartlib/features/settings/settings_notifier.dartlib/services/nostr_service.dartlib/shared/providers/app_init_provider.darttest/features/relays/relay_health_monitor_test.darttest/mocks.darttest/services/nostr_service_test.dart
There was a problem hiding this comment.
I like the direction here, but I do see one real correctness issue that I would block on before merge.
Blocking issue:
RelayHealthMonitortreatsliveRelayCount > 0as globally healthy, butensureBootstrapConnectivity()permanently mergesConfig.bootstrapRelaysintosettings.relays, andNostrServicecounts any relay that sends traffic as alive.- After the first bootstrap engagement, a live bootstrap relay can therefore keep
liveRelayCountabove zero even if every discovered relay for the current Mostro instance is still down. - At that point the watchdog stops recovering because the system looks healthy by the global metric, but the app may in practice be surviving only on bootstrap connectivity. That breaks the intended contract described in this PR, where bootstrap is only a discovery/recovery layer and discovered relays should remain the real operating set.
Why this matters:
- The PR’s liveness policy is specifically about recovering the Mostro-discovered relay layer when it is absent or dead.
- With the current metric, bootstrap traffic can mask a total failure of the discovered layer and create a false-green steady state.
- In that state, the monitor no longer re-engages recovery even though the relay set the app is supposed to operate on has not actually recovered.
What I would want instead:
- Either exclude bootstrap relays from the health signal used by
RelayHealthMonitor, or track discovered-relay liveness separately and gate recovery on that. - In other words, the watchdog should not consider “only bootstrap is alive” equivalent to “relay discovery recovered successfully”.
Because this changes the core recovery contract, I think it is a real blocker rather than a follow-up nit.
- A live bootstrap-only socket could keep the global count above zero and mask a fully dead discovered-relay layer, stopping recovery while the app survived on bootstrap alone - Gate recovery on whether any operating relay (settings.relays: discovered + user, never bootstrap) is alive, intersecting it with connectedRelays - Add a test for the only-bootstrap-alive scenario
There was a problem hiding this comment.
I re-reviewed the new head and the blocker I raised is resolved on this revision.
What changed and why it clears it:
RelayHealthMonitorno longer gates on the globalliveRelayCount.- It now checks whether any relay in the operating set (
settings.relays, i.e. discovered + user relays) is also present inconnectedRelays. - That means a bootstrap-only live socket no longer masks a dead discovered layer, while a relay that is both bootstrap and discovered still counts correctly because it is part of the operating set.
- The new watchdog test for the “only bootstrap alive” case is exactly the regression coverage I wanted for this failure mode.
I also re-checked the accepted tradeoff you described: if the discovered layer is dead and the Mostro does not publish to bootstrap, the watchdog will retry every relayDiscoveryTimeout. I agree that is an honest liveness-biased tradeoff rather than a correctness problem in this PR.
I do not see another blocker on the current revision.
Approved.
fix #609
Replaces the hardcoded default relay that was always seeded into the user's relay list with a hidden bootstrap discovery layer. The visible relay list now contains only relays discovered from the Mostro's kind 10002 plus user-added relays. A small set of defensive bootstrap relays is used only to discover that list: at cold start, when every discovered relay is unreachable, or when switching Mostro instances.
Why
The default relay was persisted and always connected even when the active Mostro did not use it, leaving idle connections, diverging the visible list from what the Mostro endorses, and requiring UI special-casing.
How
Config.bootstrapRelays(relay.mostro.network, mostro-p2p.tech, nos.lol, relay.damus.io) live only at the NostrService connection level: never persisted, never shown.All engagement is additive (never disconnectFromRelays), so existing connections and subscriptions are never dropped. Retirement is logical only since dart_nostr has no per-relay disconnect.
How to test manually
Watch logs with
flutter logs(oradb logcat | grep -i relay).Summary by CodeRabbit
New Features
Improvements
Bug Fixes
Documentation
Tests