Contain channel connection failures — a misconfigured channel must not crash the daemon #1077

@Aaronontheweb

Problem

A channel that fails its initial connection rethrows from IHostedService.StartAsync. An unhandled exception from StartAsync aborts the .NET Generic Host and triggers Akka CoordinatedShutdown — so a single misconfigured channel takes the entire daemon down (Slack, reminders, HTTP API, everything).

This surfaced in #1033: a Discord bot without the Message Content privileged intent gets gateway close 4014 ("Disallowed intent(s)"), and the whole daemon shuts down.

Both DiscordChannel.StartAsync and SlackChannel.StartAsync ended their catch blocks with a bare throw;. Each catch block emitted a Warning-severity ChannelDisconnected alert and then rethrew: warn, then kill the process. That mismatch shows the daemon-wide blast radius was never intended. This is V1 prototype behavior we are reversing: a misconfigured channel should degrade in isolation, not crash the process.
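
To illustrate what "degrade in isolation" could look like, here is a minimal sketch. It is not the code from the fix: ContainedChannel, the connect delegate, and the IsConnected/DisconnectReason properties are all hypothetical names, and the class only mirrors the IHostedService.StartAsync signature instead of implementing the interface so that the snippet has no package dependencies.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical channel wrapper; names are illustrative, not taken from the codebase.
public sealed class ContainedChannel
{
    // Assumed connect delegate standing in for the real gateway/socket connect call.
    private readonly Func<CancellationToken, Task> _connect;

    public ContainedChannel(Func<CancellationToken, Task> connect) => _connect = connect;

    public bool IsConnected { get; private set; }
    public string? DisconnectReason { get; private set; }

    // Same shape as IHostedService.StartAsync(CancellationToken).
    public async Task StartAsync(CancellationToken cancellationToken)
    {
        try
        {
            await _connect(cancellationToken);
            IsConnected = true;
        }
        catch (Exception ex) when (ex is not OperationCanceledException)
        {
            // Record the failure and return normally. With no rethrow, the
            // exception never reaches the host, so other subsystems keep running.
            IsConnected = false;
            DisconnectReason = ex.Message;
        }
    }
}
```

Cancellation is deliberately left to propagate, on the assumption that a cancelled startup means the host itself is shutting down.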

Desired behavior

  • A channel that can't connect leaves the daemon and all other subsystems running.
  • The failed channel reports disconnected (with a reason) via netclaw status and a channel.disconnected operational alert.
  • Transient failures (network/gateway blip) retry on a background reconnect loop with capped backoff.
  • Fatal failures (bad token, disallowed/invalid intents, missing OAuth scope) stay disconnected until the operator fixes config and restarts — retrying would loop forever.
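
The fatal/transient split and the capped backoff described above can be sketched as follows. This is an assumption-laden illustration, not the classifier from the fix: ConnectFailureKind and ConnectFailurePolicy are hypothetical names, and the set of fatal codes is limited to the three Discord gateway close codes that match the failures named in this issue (4004 authentication failed, 4013 invalid intents, 4014 disallowed intents). The real classifier may cover more codes, and the Slack side is omitted.

```csharp
using System;

public enum ConnectFailureKind { Transient, Fatal }

// Hypothetical policy helper; names and thresholds are illustrative only.
public static class ConnectFailurePolicy
{
    // Maps a Discord gateway close code to a failure kind.
    // Anything not listed, including "no close code", is treated as transient.
    public static ConnectFailureKind ClassifyDiscordCloseCode(int? closeCode) => closeCode switch
    {
        4004 => ConnectFailureKind.Fatal, // authentication failed (bad token)
        4013 => ConnectFailureKind.Fatal, // invalid intent(s)
        4014 => ConnectFailureKind.Fatal, // disallowed intent(s)
        _ => ConnectFailureKind.Transient,
    };

    // Exponential backoff: baseDelay * 2^attempt, never exceeding maxDelay.
    public static TimeSpan BackoffDelay(int attempt, TimeSpan baseDelay, TimeSpan maxDelay)
    {
        // Doubles avoid integer overflow; a huge exponent becomes +Infinity,
        // which Math.Min then caps at maxDelay.
        var factor = Math.Pow(2, Math.Max(0, attempt));
        var millis = Math.Min(baseDelay.TotalMilliseconds * factor, maxDelay.TotalMilliseconds);
        return TimeSpan.FromMilliseconds(millis);
    }
}
```

A reconnect loop would call BackoffDelay with an increasing attempt counter after each transient failure and stop retrying as soon as a failure classifies as Fatal.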

Status

Fix implemented (commit 21dc4f00):

  • ChannelConnectException + per-channel Fatal/Transient classifiers (Discord gateway close codes; Slack auth error codes).
  • DiscordNetGatewayClient observes the gateway Disconnected event and fails fast on a fatal close instead of blocking on the 30s readiness timeout.
  • DiscordChannel/SlackChannel StartAsync degrade instead of rethrowing; transient failures reconnect with backoff.
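
The fail-fast behavior on a fatal close can be pictured as a race between three tasks: gateway ready, fatal disconnect, and the readiness timeout. The sketch below shows that general technique under stated assumptions. GatewayReadiness and WaitForReadyAsync are hypothetical names, and how DiscordNetGatewayClient actually wires the Disconnected event is not shown in this issue.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper; not the DiscordNetGatewayClient implementation.
public static class GatewayReadiness
{
    // ready:      completes when the gateway reports it is ready.
    // fatalClose: completes with the close code when a fatal disconnect is observed.
    public static async Task WaitForReadyAsync(Task ready, Task<int> fatalClose, TimeSpan timeout)
    {
        using var timeoutCts = new CancellationTokenSource();
        var timeoutTask = Task.Delay(timeout, timeoutCts.Token);

        // Whichever task finishes first decides the outcome.
        var winner = await Task.WhenAny(ready, fatalClose, timeoutTask);
        timeoutCts.Cancel(); // release the timer if it did not win

        if (winner == fatalClose)
        {
            var code = await fatalClose;
            throw new InvalidOperationException($"Gateway closed with fatal code {code}.");
        }

        if (winner == timeoutTask)
        {
            throw new TimeoutException($"Gateway was not ready within {timeout}.");
        }

        await ready; // surface any fault carried by the ready task
    }
}
```

In this arrangement a disconnect handler would complete a TaskCompletionSource<int> with the close code, so a 4014 close surfaces immediately instead of after the full 30s timeout.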

Related: #1033

Labels: bug, channels
