hermes profile create --clone copies exclusive platform credentials causing multi-profile gateway conflicts #17080

@wufloor

Symptom

After cloning an existing profile with hermes profile create <new> --clone, starting gateways for multiple profiles simultaneously causes the later gateway to fail health checks:

API Error 500: Gateway health check timed out after 15000ms

In hermes-web-ui, this surfaces as a "profile loading error" that is difficult to diagnose.

Root Cause

--clone currently performs a full copy: it copies the source profile's .env and config.yaml verbatim into the new profile directory. This means:

  • WEIXIN_TOKEN / WEIXIN_ACCOUNT_ID / TELEGRAM_BOT_TOKEN / DISCORD_BOT_TOKEN / SLACK_APP_TOKEN / SIGNAL_PHONE_NUMBER and other exclusive platform credentials are copied as-is
  • config.yaml entries like platforms.<name>.enabled: true are preserved
  • Embedded credentials in config.yaml (e.g., platforms.weixin.token, platforms.weixin.extra.account_id) are also copied

Exclusive platform credentials are fundamentally "one-to-one identity bindings":

  • A Weixin token maps to exactly one bot instance
  • A Telegram bot token can only be polled via getUpdates by one process at a time
  • Discord/Slack long-lived connections are mutually exclusive
  • Signal, WhatsApp Business, etc. follow the same pattern

When multiple profiles hold the same token and start simultaneously, hermes-agent's platform adapter fails during initialization or scoped lock acquisition (the scoped lock mechanism is described in #4587). The error is swallowed by the gateway process; the outer layer only sees a health check timeout.
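
Since the only visible symptom is a health check timeout, it can help to check directly whether two profiles share an exclusive credential. The following is a minimal diagnostic sketch, not part of hermes-agent: `parse_env` and `shared_exclusive_keys` are hypothetical helper names, the parser assumes plain KEY=VALUE lines, and the prefix list is taken from the exclusive platform list in this issue.

```python
import re

# Prefixes taken from the exclusive platform list in this issue.
EXCLUSIVE_KEY = re.compile(r"^(TELEGRAM|DISCORD|SLACK|WHATSAPP|SIGNAL|WEIXIN|FEISHU)_")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; comments and blank lines are ignored."""
    pairs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        pairs[key.strip()] = value.strip()
    return pairs


def shared_exclusive_keys(env_a: str, env_b: str) -> list[str]:
    """Return exclusive-platform keys holding the same non-empty value in both files."""
    a, b = parse_env(env_a), parse_env(env_b)
    return sorted(
        key
        for key in a.keys() & b.keys()
        if EXCLUSIVE_KEY.match(key) and a[key] and a[key] == b[key]
    )
```

Passing the contents of profile-a/.env and profile-b/.env to `shared_exclusive_keys` lists the keys that would collide if both gateways start. Shared model provider keys are deliberately not reported.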

Reproduction

hermes profile create profile-a   # Initial profile, configure Weixin
# Set WEIXIN_TOKEN=xxx, WEIXIN_ACCOUNT_ID=yyy in profile-a/.env
# Set platforms.weixin.enabled: true in profile-a/config.yaml

hermes profile create profile-b --clone   # Clone
# profile-b/.env also has WEIXIN_TOKEN=xxx
# profile-b/config.yaml also has platforms.weixin.enabled: true

hermes gateway start profile-a   # OK
hermes gateway start profile-b   # Fails: health check timeout

Expected Behavior

Option A: Exclude exclusive credentials during clone (recommended)

New CLI flag:

hermes profile create profile-b --clone --exclude-platform-credentials

Or exclude by default (breaking change, but aligns with the semantics "clone = reuse model/tool config, not identity"):

  • When copying .env, skip keys matching ^(TELEGRAM|DISCORD|SLACK|WHATSAPP|SIGNAL|WEIXIN|FEISHU)_ (aligned with the 7 adapters in gateway/platforms/*.py that call _acquire_platform_lock)
  • When copying config.yaml, force platforms.<exclusive>.enabled to false
  • Also strip embedded credentials under exclusive platform nodes in config.yaml (e.g., platforms.weixin.token, platforms.weixin.extra.account_id, platforms.telegram.bot_token) to prevent reuse of the source profile's identity when the user later re-enables the platform
  • CLI output should clearly state which keys were skipped, prompting the user to configure them separately in the new profile

Model provider API keys (OPENAI_API_KEY / ANTHROPIC_API_KEY etc.) and tool config (BROWSER_HEADLESS / TERMINAL_DEFAULT_SHELL) should still be copied — they are safely shareable.
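
The exclusion rules above could be sketched roughly as follows. This is an illustration under stated assumptions, not the hermes-agent implementation: `filter_env`, `strip_credentials`, and `sanitize_config` are hypothetical names, config.yaml is assumed to be already loaded into a plain dict, and the `CREDENTIAL_FIELDS` set is a guess based on the three field names mentioned in this issue.

```python
import copy
import re

# Platform list and key prefixes as given in this issue.
EXCLUSIVE_PLATFORMS = ("telegram", "discord", "slack", "whatsapp", "signal", "weixin", "feishu")
EXCLUSIVE_KEY = re.compile(r"^(TELEGRAM|DISCORD|SLACK|WHATSAPP|SIGNAL|WEIXIN|FEISHU)_")
# Assumption: field names treated as credentials under a platform node.
# Only token, bot_token and extra.account_id are named in the issue.
CREDENTIAL_FIELDS = {"token", "bot_token", "account_id"}


def filter_env(text: str) -> tuple[str, list[str]]:
    """Drop exclusive-platform keys from .env text; return (kept text, skipped keys)."""
    kept, skipped = [], []
    for line in text.splitlines():
        key = line.split("=", 1)[0].strip()
        is_assignment = "=" in line and not line.lstrip().startswith("#")
        if is_assignment and EXCLUSIVE_KEY.match(key):
            skipped.append(key)  # reported to the user, not copied
        else:
            kept.append(line)
    return "\n".join(kept), skipped


def strip_credentials(node):
    """Recursively remove credential fields from a platform node."""
    if not isinstance(node, dict):
        return node
    return {k: strip_credentials(v) for k, v in node.items() if k not in CREDENTIAL_FIELDS}


def sanitize_config(config: dict) -> dict:
    """Return a copy with exclusive platforms disabled and their credentials stripped."""
    result = copy.deepcopy(config)
    platforms = result.get("platforms") or {}
    for name in EXCLUSIVE_PLATFORMS:
        if isinstance(platforms.get(name), dict):
            cleaned = strip_credentials(platforms[name])
            cleaned["enabled"] = False
            platforms[name] = cleaned
    return result
```

The list of skipped keys returned by `filter_env` is what the CLI output would report, so the user knows which credentials to configure separately in the new profile. Keys such as OPENAI_API_KEY do not match the prefix pattern and pass through unchanged.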

Option B: Graceful degradation on token conflict

When gateway startup detects that a token is already held by another profile:

  • Do not kill the entire gateway
  • Only disable the conflicting platform and log a warning: platform 'weixin' disabled: token already in use by profile 'profile-a'
  • Other platforms and model functionality start normally

This way, even without --exclude-platform-credentials, a single platform conflict does not render the entire profile unusable.
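
A rough sketch of this degradation path is shown below. It makes several assumptions: `TokenInUse` is a hypothetical exception standing in for whatever the scoped lock acquisition raises on conflict, and `start_platforms` with its `starters` mapping is an illustrative shape, not the actual gateway startup code.

```python
class TokenInUse(Exception):
    """Hypothetical error: another profile already holds this platform's lock."""

    def __init__(self, holder: str):
        super().__init__(holder)
        self.holder = holder


def start_platforms(starters: dict, log=print) -> tuple[list[str], list[str]]:
    """Start each platform; on a token conflict, disable only that platform.

    `starters` maps a platform name to a zero-argument callable that starts it.
    Returns (started platform names, disabled platform names).
    """
    started, disabled = [], []
    for name, start in starters.items():
        try:
            start()
        except TokenInUse as exc:
            # Log and continue instead of failing the whole gateway.
            log(f"platform '{name}' disabled: token already in use by profile '{exc.holder}'")
            disabled.append(name)
        else:
            started.append(name)
    return started, disabled
```

Only the conflict error is caught here, so other startup failures still propagate. Whether those should also be isolated per platform is a separate design question.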

Current web-ui Workaround

hermes-web-ui has implemented a server-side "smart clone": after calling hermes profile create <name> --clone, it immediately strips exclusive credentials from the new profile's .env and disables the corresponding platform nodes in config.yaml, backing up both files first. See hermes-web-ui PR #283.

However, this is an upper-layer patch — the CLI's default behavior is unchanged, so command-line users and other integrations (e.g., custom scripts) will still hit this issue. A fix at the hermes-agent level would be ideal.

Related

  • web-ui implementation: packages/server/src/services/hermes/profile-credentials.ts
  • Exclusive platform list (source: grep -l _acquire_platform_lock gateway/platforms/*.py): telegram, discord, slack, whatsapp, signal, weixin, feishu

Labels: P2 (Medium: degraded but workaround exists), area/config, comp/cli, comp/gateway, type/bug
