research(auth+media): auth profiles and media understanding architecture #376

@alexey-pelykh

Context

The auth profiles system (auth-profiles.json, provider-auth.ts, onboarding credential collection) and the media understanding subsystem were inherited from OpenClaw's model-provider architecture. RemoteClaw is middleware: it spawns CLI agents and does not make model API calls itself.

With runtimeEnv (#375) handling CLI credential injection via config, and CLIs handling multimodal input natively, both systems are redundant.
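
To illustrate why config-driven injection makes a separate credential store unnecessary, here is a minimal sketch of merging a `runtimeEnv` map into the environment of a spawned CLI. The `RuntimeEnv` shape and the `buildChildEnv` helper are assumptions made for illustration; the actual schema is whatever #375 defines.

```typescript
// Hypothetical shape: the real runtimeEnv schema is defined in #375.
type RuntimeEnv = Record<string, string>;

/**
 * Merge config-supplied variables over the parent environment.
 * Entries from runtimeEnv win; undefined parent entries are dropped.
 */
function buildChildEnv(
  parentEnv: Record<string, string | undefined>,
  runtimeEnv: RuntimeEnv,
): Record<string, string> {
  const merged: Record<string, string> = {};
  for (const [key, value] of Object.entries(parentEnv)) {
    if (value !== undefined) merged[key] = value;
  }
  for (const [key, value] of Object.entries(runtimeEnv)) {
    merged[key] = value;
  }
  return merged;
}
```

The result could be passed as the `env` option of Node's `child_process.spawn`, so the CLI reads its credential from its own environment and RemoteClaw never has to resolve a provider key.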

Prerequisite spike

Before gutting media understanding, verify that CLI agents can receive media (images, audio, video) via their NDJSON/streaming-JSON interfaces:

  • Claude CLI: Can images be passed via --print streaming JSON? Base64? File paths?
  • Gemini CLI: Multimodal input via --prompt or stdin?
  • Codex CLI: --image flag works with JSON output mode? Multiple images?
  • Audio/video: Do any CLIs accept audio or video input? What happens when a user sends a voice message on WhatsApp?

The spike determines whether media passthrough is straightforward or needs a thin adapter layer (e.g., saving media to temp files and passing paths).
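
If the spike shows that file paths are the common denominator, the thin adapter could look roughly like the sketch below. It stages inbound attachments in a temporary directory and returns their paths plus a cleanup callback. The `InboundMedia` and `StagedMedia` types, the `stageMedia` name, and the `remoteclaw-media-` prefix are all hypothetical; how each CLI consumes the paths is exactly what the spike still has to establish.

```typescript
import { mkdtempSync, writeFileSync, rmSync } from "node:fs";
import { tmpdir } from "node:os";
import { basename, join } from "node:path";

// Hypothetical attachment shape; not RemoteClaw's actual type.
interface InboundMedia {
  fileName: string;
  data: Uint8Array;
}

interface StagedMedia {
  paths: string[];
  cleanup: () => void;
}

/** Write each attachment into a fresh temp directory and return the file paths. */
function stageMedia(items: InboundMedia[]): StagedMedia {
  const dir = mkdtempSync(join(tmpdir(), "remoteclaw-media-"));
  const paths = items.map((item, index) => {
    // basename() drops any directory components from the sender-supplied name;
    // the index prefix keeps duplicate names from colliding.
    const target = join(dir, `${index}-${basename(item.fileName)}`);
    writeFileSync(target, item.data);
    return target;
  });
  // Remove the whole directory once the agent run has finished.
  return { paths, cleanup: () => rmSync(dir, { recursive: true, force: true }) };
}
```

A caller would stage the media, hand `paths` to whichever CLI argument or prompt format the spike identifies, and call `cleanup()` after the agent process exits.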

What to remove

Media understanding subsystem

  • src/media-understanding/ — entire directory (runner, entries, providers, attachments, types, resolve, defaults, errors, output-extract)
  • Media understanding config in schema (tools.mediaUnderstanding)
  • Related test files

Auth profile infrastructure

  • src/agents/auth-profiles/ — entire directory (store, types, constants, profiles, oauth, paths, display, doctor, session-override)
  • src/agents/auth-profiles.ts — barrel re-export
  • src/agents/auth-health.ts — auth health summary
  • src/agents/provider-auth.ts — provider auth resolution (resolveApiKeyForProvider, resolveEnvApiKey, resolveModelAuthMode, resolveModelAuthLabel, etc.)
  • src/agents/api-key-rotation.ts — API key rotation (only used by media understanding)

Onboarding credential collection

  • src/commands/onboard-auth.credentials.ts — writeOAuthCredentials, setAnthropicApiKey, setGeminiApiKey, etc.
  • src/wizard/onboarding.ts — promptRuntimeCredential() flow (API key / auth token prompts)
  • src/commands/onboard-auth.config-*.ts — provider-specific config helpers
  • src/commands/configure.gateway-auth.ts — gateway auth profile upserts

Auth profile consumers

  • src/commands/doctor-auth.ts — deprecated profile detection + health display
  • src/commands/agent.ts — ensureAuthProfileStore(), clearSessionAuthProfileOverride()
  • src/auto-reply/reply/directive-handling.auth.ts — /auth directive handler
  • src/auto-reply/reply/commands-status.ts — resolveModelAuthLabel() display
  • src/auto-reply/status.ts — resolveModelAuthMode() status display
  • src/auto-reply/reply/agent-runner.ts — auth mode checks
  • src/auto-reply/reply/get-reply-run.ts — session auth profile override
  • src/cron/isolated-agent/run.ts — session auth profile override
  • src/infra/provider-usage.auth.ts — provider usage auth resolution
  • src/commands/agents.commands.add.ts — upsertAuthProfile() during agent add
  • src/commands/channels/list.ts — loadAuthProfileStore()
  • src/plugins/types.ts — AuthProfileCredential type import

Config/schema fields

  • auth.profiles in config schema
  • Session entry fields: authProfileOverride, providerOverride, modelOverride
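
Previously persisted session entries may still carry the removed fields. One way to drop them on load is sketched below. The three field names come from the list above; the `stripRemovedSessionFields` helper and the untyped entry shape are assumptions, not existing RemoteClaw code.

```typescript
// Field names are taken from the issue; the helper itself is hypothetical.
const REMOVED_SESSION_FIELDS = [
  "authProfileOverride",
  "providerOverride",
  "modelOverride",
] as const;

/** Return a copy of a session entry without the removed override fields. */
function stripRemovedSessionFields(
  entry: Record<string, unknown>,
): Record<string, unknown> {
  const cleaned: Record<string, unknown> = { ...entry };
  for (const field of REMOVED_SESSION_FIELDS) {
    delete cleaned[field];
  }
  return cleaned;
}
```

Working on a copy leaves the caller's object untouched, which keeps the helper safe to run on every load until old entries have aged out.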

What replaces it

Related
