[AgentProtocol] Implement tool-controlled display for useAgentSession #25046

@mbleigh

Description

Objective: Give tools direct control over their visual representation by adding support for "display" properties to the new AgentProtocol / useAgentSession interfaces. This must not alter existing behavior for useGeminiStream, since useAgentSession is experimental and still in active development.

At a high level, we will first refactor the tool execution interface to better support tool-controlled display, then add primitives for tool-controlled display to the events in AgentProtocol, then wire up the UI to consume the new tool-controlled display when it is provided, and finally migrate tools to use the new tool-controlled display ONLY when useAgentSession is enabled.

Implementation Plan

To ensure visual parity with the legacy UI while standardizing the Agent Protocol, we will follow an incremental, commit-per-step approach. Each step MUST be a separate commit with all tests passing.

1. Atomic Signature Refactor: ExecuteOptions

  • Define ExecuteOptions in packages/core/src/tools/tools.ts and refactor ToolInvocation.execute() to accept this object instead of positional arguments.
  • Update all tool implementations and mock invocations in tests to match the new signature.
  • Goal: Zero behavioral change; just a cleaner interface for future expansion.
  • Commit: refactor(core): use ExecuteOptions for tool execution
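A rough sketch of what step 1 might look like. The issue does not list the fields of ExecuteOptions, so `signal` and `updateOutput`, the `ToolResult` shape, and the `EchoInvocation` class are all assumptions made for illustration, not the real definitions in packages/core/src/tools/tools.ts.

```typescript
// Hypothetical sketch: field names below are assumptions, not the real interface.
interface ToolResult {
  output: string;
}

interface ExecuteOptions {
  signal: AbortSignal; // assumed: cancellation signal
  updateOutput?: (chunk: string) => void; // assumed: streaming output callback
}

interface ToolInvocation {
  // Before (assumed shape): execute(signal, updateOutput?)
  // After: a single options object that can grow without breaking callers.
  execute(options: ExecuteOptions): Promise<ToolResult>;
}

// Illustrative invocation that echoes its input back as output.
class EchoInvocation implements ToolInvocation {
  private readonly text: string;

  constructor(text: string) {
    this.text = text;
  }

  async execute({ signal, updateOutput }: ExecuteOptions): Promise<ToolResult> {
    if (signal.aborted) {
      return { output: "cancelled" };
    }
    updateOutput?.(this.text);
    return { output: this.text };
  }
}

// Call sites pass named fields instead of positional arguments.
const chunks: string[] = [];
const pending = new EchoInvocation("hello").execute({
  signal: new AbortController().signal,
  updateOutput: (chunk) => chunks.push(chunk),
});
```

With an options object, a later step could add a new field to ExecuteOptions without touching every tool implementation and test mock again.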

2. Unified Display Data Model

  • Define the simplified ToolDisplay interface in packages/core/src/agent/types.ts:
    export interface ToolDisplay {
      name: string;             // "ReadFile"
      description?: string;      // "src/index.ts" (Gray text)
      resultSummary?: string;    // "Read 150 lines" (Purple text + arrow)
      result?: DisplayContent | DisplayDiff | ContentPart[]; // Rich payload
    }
  • Update ToolRequest, ToolUpdate, and ToolResponse events to include a nested display?: ToolDisplay property.
  • Commit: feat(agent): define unified ToolDisplay protocol property
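To make the nesting concrete, here is a minimal sketch of events carrying the display property. Only ToolDisplay and the `display?` field come from this issue. The `type` and `requestId` fields, and the bodies of DisplayContent, DisplayDiff, and ContentPart, are placeholders, since their real definitions are not given here.

```typescript
// Placeholder payload types: named in the issue but not defined there.
type DisplayContent = { kind: "content"; text: string };
type DisplayDiff = { kind: "diff"; before: string; after: string };
type ContentPart = { text: string };

// As proposed in step 2.
interface ToolDisplay {
  name: string;
  description?: string;
  resultSummary?: string;
  result?: DisplayContent | DisplayDiff | ContentPart[];
}

// Hypothetical event shapes; only `display?` is taken from the issue.
interface ToolRequest {
  type: "tool_request";
  requestId: string;
  display?: ToolDisplay;
}

interface ToolUpdate {
  type: "tool_update";
  requestId: string;
  display?: ToolDisplay;
}

interface ToolResponse {
  type: "tool_response";
  requestId: string;
  display?: ToolDisplay;
}

// A request announces identity; the response adds the summary and payload.
const request: ToolRequest = {
  type: "tool_request",
  requestId: "call-1",
  display: { name: "ReadFile", description: "src/index.ts" },
};

const response: ToolResponse = {
  type: "tool_response",
  requestId: "call-1",
  display: {
    name: "ReadFile",
    description: "src/index.ts",
    resultSummary: "Read 150 lines",
    result: { kind: "content", text: "..." },
  },
};
```

Because display is optional on every event, events that omit it keep their current meaning.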

3. Centralized Display Utility Logic

  • Create packages/core/src/agent/tool-display-utils.ts to manage the population of the display property.
  • Refactor event-translator.ts and legacy-agent-session.ts to use these utilities for populating metadata from tool requests and results.
  • Use invocation.getDescription() as the default fallback for display.description.
  • Commit: feat(core): centralize tool display population logic
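One possible shape for such a utility, shown as a sketch. The function name `populateToolDisplay` and the `DescribedInvocation` interface are hypothetical; only the fallback to invocation.getDescription() is taken from the plan above.

```typescript
interface ToolDisplay {
  name: string;
  description?: string;
  resultSummary?: string;
}

// Minimal slice of an invocation; only getDescription() is named in the issue.
interface DescribedInvocation {
  getDescription(): string;
}

// Hypothetical helper: fills in whatever the tool did not provide itself.
function populateToolDisplay(
  toolName: string,
  invocation: DescribedInvocation,
  provided?: Partial<ToolDisplay>,
): ToolDisplay {
  return {
    ...provided,
    name: provided?.name ?? toolName,
    // Default fallback described in step 3.
    description: provided?.description ?? invocation.getDescription(),
  };
}

const invocation: DescribedInvocation = {
  getDescription: () => "src/index.ts",
};

// No tool-provided display: name and description come from the defaults.
const fallback = populateToolDisplay("ReadFile", invocation);

// Tool-provided fields take precedence over the defaults.
const custom = populateToolDisplay("ReadFile", invocation, {
  description: "src/index.ts (lines 1-150)",
  resultSummary: "Read 150 lines",
});
```

Keeping this logic in one place would let event-translator.ts and legacy-agent-session.ts share the same defaults instead of duplicating them.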

4. Simplified UI Consumption (Dumb Terminal)

  • Update IndividualToolCallDisplay in packages/cli/src/ui/types.ts to include display?: ToolDisplay.
  • Refactor useAgentStream.ts to directly merge event.display into the state.
  • Update ToolMessage, ShellToolMessage, and DenseToolMessage to render based on the display object fields:
    • display.name (Bold name)
    • display.description (Gray text, no arrow)
    • display.resultSummary (Purple text, with →)
  • Commit: refactor(cli): consume simplified ToolDisplay property
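The merge-then-render flow in step 4 could look roughly like this. It is a sketch only: `mergeDisplay`, `toSegments`, the `Segment` type, and the `callId` field are hypothetical, and the real components render through the CLI's UI layer rather than returning plain segments. A shallow merge is assumed, so fields from later events override earlier ones and omitted fields are kept.

```typescript
interface ToolDisplay {
  name: string;
  description?: string;
  resultSummary?: string;
}

// Only `display` comes from the issue; `callId` is an assumed field.
interface IndividualToolCallDisplay {
  callId: string;
  display?: ToolDisplay;
}

type Style = "bold" | "gray" | "purple";

interface Segment {
  text: string;
  style: Style;
}

// Shallow-merge an event's display into the stored tool call state.
function mergeDisplay(
  call: IndividualToolCallDisplay,
  incoming?: ToolDisplay,
): IndividualToolCallDisplay {
  if (!incoming) {
    return call;
  }
  return { ...call, display: { ...call.display, ...incoming } };
}

// Map display fields to styled segments, mirroring the list in step 4.
function toSegments(display: ToolDisplay): Segment[] {
  const segments: Segment[] = [{ text: display.name, style: "bold" }];
  if (display.description) {
    segments.push({ text: display.description, style: "gray" });
  }
  if (display.resultSummary) {
    segments.push({ text: `→ ${display.resultSummary}`, style: "purple" });
  }
  return segments;
}

let call: IndividualToolCallDisplay = { callId: "call-1" };
call = mergeDisplay(call, { name: "ReadFile", description: "src/index.ts" });
call = mergeDisplay(call, { name: "ReadFile", resultSummary: "Read 150 lines" });

const segments = call.display ? toSegments(call.display) : [];
const line = segments.map((segment) => segment.text).join(" ");
```

The UI stays "dumb" in the sense that it makes no tool-specific decisions; it only styles whichever fields are present.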

5. Incremental Tool Migration

  • Migrate core tools one by one, ensuring high-fidelity visual parity:
    • ReadFile/WriteFile (Identity/Description focus)
    • Shell (Execution state focus)
    • LS/Grep (ResultSummary/Stats focus)
  • Commit: feat(tools): migrate [ToolName] to structured visual metadata
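As an illustration of a single migration, here is a sketch for an LS-style tool. How a tool learns that useAgentSession is enabled is not specified in this issue, so the `agentSessionEnabled` parameter is an assumption, as are the `lsDisplay` helper and the "Listed N items" wording.

```typescript
interface ToolDisplay {
  name: string;
  description?: string;
  resultSummary?: string;
}

// Hypothetical helper: builds display metadata for a directory listing.
function lsDisplay(
  dir: string,
  entries: string[],
  agentSessionEnabled: boolean, // assumed gate; the real mechanism may differ
): ToolDisplay | undefined {
  if (!agentSessionEnabled) {
    // Legacy path: no display metadata, so existing rendering is unchanged.
    return undefined;
  }
  const noun = entries.length === 1 ? "item" : "items";
  return {
    name: "LS",
    description: dir,
    resultSummary: `Listed ${entries.length} ${noun}`,
  };
}

const entries = ["index.ts", "types.ts", "utils.ts"];
const legacyDisplay = lsDisplay("src", entries, false);
const agentDisplay = lsDisplay("src", entries, true);
```

Returning undefined on the legacy path is one way to keep useGeminiStream output identical while a tool is being migrated.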

Acceptance Criteria

  • useGeminiStream (legacy) rendering remains 100% identical.
  • useAgentStream rendering matches legacy visual styling exactly (gray filenames, purple arrows for counts).
  • No extra vertical space/margins appear for elided or background tool calls.
  • Each step above corresponds to a separate verified commit.

Metadata

Labels

area/agent: Issues related to Core Agent, Tools, Memory, Sub-Agents, Hooks, Agent Quality
