gut(media): remove media-understanding and link-understanding pipelines #425

@alexey-pelykh

Context

Part of #415. Both applyMediaUnderstanding() and applyLinkUnderstanding() pipelines must be gutted.

Architectural decision: Middleware routes media and links to CLI runtimes — it does NOT process them itself. Text extraction, image vision, video vision, and URL content fetching are the CLI agent's job, not ours. The multimodal contract (#385) and runtime implementations (#397, #396) ensure the path from channel → CLI exists. STT/TTS is the only exception because it's channel-level infrastructure.
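The routing-only contract described above could look roughly like the following sketch. Every name here (`InboundAttachment`, `RuntimeInput`, `routeToRuntime`, `Transcriber`) is hypothetical and chosen for illustration; the actual multimodal contract lives in #385 and is not reproduced in this issue.

```typescript
// Hypothetical shapes: none of these names come from the codebase.
type AttachmentKind = "image" | "video" | "audio" | "document";

interface InboundAttachment {
  kind: AttachmentKind;
  path: string; // local path or URL as received from the channel
  mimeType: string;
}

interface InboundMessage {
  text: string;
  attachments: InboundAttachment[];
}

interface RuntimeInput {
  text: string; // links stay in the text; nothing fetches them here
  attachments: InboundAttachment[]; // forwarded untouched to the CLI runtime
  transcript?: string; // only ever set by channel-level STT
}

// Hypothetical STT hook; a real one would call into src/stt/.
type Transcriber = (audio: InboundAttachment) => string | undefined;

function routeToRuntime(msg: InboundMessage, transcribe: Transcriber): RuntimeInput {
  // STT is the single exception: audio may be transcribed before hand-off.
  const audio = msg.attachments.filter((a) => a.kind === "audio")[0];
  const transcript = audio ? transcribe(audio) : undefined;

  const input: RuntimeInput = {
    text: msg.text,
    attachments: msg.attachments.slice(), // no vision, no text extraction
  };
  if (transcript !== undefined) {
    input.transcript = transcript;
  }
  return input;
}
```

The point of the sketch is what is absent: there is no image description, no video analysis, and no URL fetch between the channel and the runtime.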

Blast radius

src/media-understanding/ — 49 files, ~5.5k lines

Consumers to decouple/remove:

| Consumer | Imports | Action |
| --- | --- | --- |
| src/auto-reply/reply/get-reply.ts | applyMediaUnderstanding | Remove call |
| src/auto-reply/templating.ts | Types | Remove media-understanding types |
| src/auto-reply/status.ts | MediaUnderstandingDecision | Remove media decision display |
| src/auto-reply/reply/commands-status.ts | MediaUnderstandingDecision | Remove |
| src/auto-reply/reply/commands-info.ts | Media understanding refs | Remove |
| src/auto-reply/reply/get-reply-inline-actions.ts | Media understanding refs | Remove |
| src/auto-reply/reply/get-reply-directives-apply.ts | Media understanding refs | Remove |
| src/auto-reply/media-note.ts + test | Media decision types | Remove or simplify |
| src/discord/voice/manager.ts | runCapability from runner | Switch to src/stt/ directly |
| src/stt/preflight.ts | isAudioAttachment, runCapability, types | Relocate needed utilities into src/stt/ |
| src/stt/providers/shared.ts | Shared provider utilities | Relocate into src/stt/providers/ |
| src/stt/providers/google/audio.ts | generateGeminiInlineDataText | Relocate into src/stt/ |
| src/stt/providers/audio.test-helpers.ts | Test helpers | Relocate into src/stt/ |
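For the relocation rows, the goal is that src/stt/ owns small, dependency-free helpers instead of importing them from media-understanding. As an illustration only, a relocated `isAudioAttachment` might look like the sketch below. The issue does not show the real signature or logic, so the `AttachmentLike` shape, the extension list, and the MIME-first rule are all assumptions.

```typescript
// Hypothetical stand-in for the relocated helper; the real signature in
// media-understanding is not shown in this issue.
interface AttachmentLike {
  mime?: string;
  path?: string;
}

// Assumed list of audio extensions, used only when no MIME type is present.
const AUDIO_EXTENSIONS = [".mp3", ".wav", ".ogg", ".oga", ".opus", ".m4a", ".flac"];

function isAudioAttachment(attachment: AttachmentLike): boolean {
  const mime = (attachment.mime ?? "").trim().toLowerCase();
  if (mime !== "") {
    // A declared MIME type wins over the file extension.
    return mime.indexOf("audio/") === 0;
  }
  const path = (attachment.path ?? "").toLowerCase();
  const dot = path.lastIndexOf(".");
  return dot >= 0 && AUDIO_EXTENSIONS.indexOf(path.slice(dot)) !== -1;
}
```

Keeping the helper free of imports from the runner is what lets src/media-understanding/ be deleted later without touching STT again.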

Config schema to remove: MediaUnderstandingScopeSchema, MediaUnderstandingCapabilitiesSchema, MediaUnderstandingAttachmentsSchema, MediaUnderstandingModelSchema, ToolsMediaUnderstandingSchema in src/config/zod-schema.core.ts. Types in src/config/types.tools.ts.

src/link-understanding/ — 6 files, ~333 lines

Consumers to remove:

| Consumer | Imports | Action |
| --- | --- | --- |
| src/auto-reply/reply/get-reply.ts | applyLinkUnderstanding | Remove call |
| src/auto-reply/reply/get-reply.reset-hooks-fallback.test.ts | Mock | Remove mock |
| src/auto-reply/templating.ts | LinkUnderstanding context field | Remove field |

Config schema to remove: LinkToolsConfig, LinkModelConfig types, tools.links config field, schema labels/help entries.

Note: link-understanding imports shared utilities from media-understanding (CLI_OUTPUT_MAX_BUFFER, resolveTimeoutMs, resolveMediaUnderstandingScope) — both modules die together, no relocation needed for these.

Work order

  1. Relocate STT shared utilities — move what src/stt/ actually needs out of media-understanding/ into src/stt/
  2. Switch discord/voice — use src/stt/ directly instead of media-understanding runner
  3. Remove both pipeline calls from get-reply.ts (applyMediaUnderstanding, applyLinkUnderstanding)
  4. Remove auto-reply consumers — media decision types, status display, media note, link understanding context field
  5. Remove config schema: tools.media (image/video/audio vision config), tools.links
  6. Delete src/media-understanding/ and src/link-understanding/ entirely
  7. Verify — all tests pass, STT still works, voice messages work, no orphan imports
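The "no orphan imports" check in step 7 can be partly automated. The sketch below is one possible approach, not part of the repository: `findOrphanImports` and `REMOVED_MODULES` are hypothetical names, and the regex covers only the common single-line `from`, `import()`, `require()`, and side-effect `import` forms.

```typescript
// Directory names that must no longer appear in any import specifier.
const REMOVED_MODULES = ["media-understanding", "link-understanding"];

interface OrphanImport {
  file: string;
  line: number; // 1-based line number within the source text
  specifier: string;
}

// Matches the quoted specifier after `from`, `import(`, `require(`, or a bare `import`.
const SPECIFIER_SOURCE =
  "(?:from\\s+|import\\s*\\(\\s*|require\\s*\\(\\s*|import\\s+)[\"']([^\"']+)[\"']";

function findOrphanImports(file: string, source: string): OrphanImport[] {
  const hits: OrphanImport[] = [];
  source.split("\n").forEach((text, index) => {
    const re = new RegExp(SPECIFIER_SOURCE, "g"); // fresh regex per line, no shared lastIndex
    let match: RegExpExecArray | null;
    while ((match = re.exec(text)) !== null) {
      const specifier = match[1];
      // Compare whole path segments so "my-media-understanding" is not flagged.
      const segments = specifier.split("/");
      if (REMOVED_MODULES.some((name) => segments.indexOf(name) !== -1)) {
        hits.push({ file, line: index + 1, specifier });
      }
    }
  });
  return hits;
}
```

In practice this would be run over every file under src/ after step 6, for example by reading each file with Node's `fs` module and failing the check when any hit is returned. The type checker catches the same breakage at build time, but a scan like this also flags string-based mocks in tests.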

Depends on

Does NOT depend on

Related
