Context
RemoteClaw's middleware architecture uses CLI subprocesses to interact with AI agents. The abstract base class CLIRuntimeBase (in src/middleware/cli-runtime-base.ts) handles subprocess spawning, NDJSON parsing, watchdog timers, abort signal propagation, and stdin prompt delivery. Concrete runtimes extend it and implement three abstract methods.
This issue implements the Claude CLI runtime — the first and primary concrete runtime.
Architecture
AgentRuntime (interface, src/middleware/types.ts)
└── CLIRuntimeBase (abstract, src/middleware/cli-runtime-base.ts)
    └── ClaudeCliRuntime ← THIS ISSUE
CLIRuntimeBase requires subclasses to implement:
/** Construct CLI-specific command-line arguments. */
protected abstract buildArgs(params: AgentExecuteParams): string[];
/** Parse a single NDJSON line into an AgentEvent (or null to skip). */
protected abstract extractEvent(line: string): AgentEvent | null;
/** Construct provider-specific environment variables. */
protected abstract buildEnv(params: AgentExecuteParams): Record<string, string>;
Dependencies
src/middleware/types.ts — AgentRuntime, AgentExecuteParams, AgentEvent, AgentRunResult, etc.
src/middleware/cli-runtime-base.ts — CLIRuntimeBase abstract class
Both exist on main.
Specification
File: src/middleware/runtimes/claude.ts
Create ClaudeCliRuntime extending CLIRuntimeBase.
Constructor
constructor() {
  super("claude"); // CLI binary name
}
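As a rough outline, the constructor and the three abstract methods could fit together as sketched below. The `CLIRuntimeBase`, `AgentExecuteParams`, and `AgentEvent` declarations here are simplified stand-ins so the sketch compiles on its own; the real definitions on main may differ, and the method bodies are placeholders rather than the intended implementation.

```typescript
// Stand-in declarations (assumptions). The real ones live in
// src/middleware/types.ts and src/middleware/cli-runtime-base.ts.
interface AgentExecuteParams {
  prompt: string;
  sessionId?: string;
}
type AgentEvent = { type: "text"; text: string };

abstract class CLIRuntimeBase {
  // Assumption: the base constructor takes the CLI binary name.
  constructor(protected readonly binary: string) {}
  protected abstract buildArgs(params: AgentExecuteParams): string[];
  protected abstract extractEvent(line: string): AgentEvent | null;
  protected abstract buildEnv(params: AgentExecuteParams): Record<string, string>;
}

class ClaudeCliRuntime extends CLIRuntimeBase {
  constructor() {
    super("claude"); // CLI binary name
  }

  // Placeholder: only the always-present flags plus the prompt.
  protected buildArgs(params: AgentExecuteParams): string[] {
    return ["-p", "--output-format", "stream-json", "--verbose", params.prompt];
  }

  // Placeholder: skips every line until the event mapping is implemented.
  protected extractEvent(_line: string): AgentEvent | null {
    return null;
  }

  // No hardcoded environment overrides; auth comes from the caller.
  protected buildEnv(_params: AgentExecuteParams): Record<string, string> {
    return {};
  }
}
```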
buildArgs(params: AgentExecuteParams): string[]
Build the Claude CLI argument list:
| Flag | Value | When |
|---|---|---|
| -p | (none) | Always; enables pipe/print mode |
| --output-format | stream-json | Always; NDJSON streaming output |
| --verbose | (none) | Always; enables usage/cost reporting in output |
| --resume | params.sessionId | When params.sessionId is provided |
| --mcp-config | <JSON string> | When params.mcpServers has entries |
| (positional) | params.prompt | When prompt length ≤ 10,000 chars (stdin threshold is handled by CLIRuntimeBase) |
MCP config format (Claude JSON format, passed as inline string):
{
  "mcpServers": {
    "<server-name>": {
      "command": "<command>",
      "args": ["<arg1>", "<arg2>"],
      "env": { "<KEY>": "<VALUE>" }
    }
  }
}
The --mcp-config flag accepts JSON strings directly (no temp file needed). Pass the serialized JSON as a CLI argument: --mcp-config '{"mcpServers": {...}}'. This eliminates temp file lifecycle management entirely.
Important: The prompt is the final positional argument. It reaches the CLI that way only when it fits within CLI argument limits; CLIRuntimeBase handles the >10KB stdin delivery case. buildArgs() should therefore always include the prompt as the final positional argument, and the base class writes it to stdin in addition when the threshold is exceeded.
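A minimal sketch of the argument construction described above. It assumes `params.mcpServers` is a record keyed by server name whose values already match the Claude JSON shape; the actual `McpServerConfig` and `AgentExecuteParams` types in src/middleware/types.ts may differ, and `buildClaudeArgs` is a hypothetical free function standing in for the `buildArgs()` method.

```typescript
// Assumed shapes for illustration only.
interface McpServerConfig {
  command: string;
  args?: string[];
  env?: Record<string, string>;
}
interface AgentExecuteParams {
  prompt: string;
  sessionId?: string;
  mcpServers?: Record<string, McpServerConfig>;
}

function buildClaudeArgs(params: AgentExecuteParams): string[] {
  // Flags that are always present.
  const args = ["-p", "--output-format", "stream-json", "--verbose"];

  // Resume an existing session when an ID is supplied.
  if (params.sessionId) {
    args.push("--resume", params.sessionId);
  }

  // Inline MCP config as a JSON string; omitted when no servers are configured.
  const servers = params.mcpServers ?? {};
  if (Object.keys(servers).length > 0) {
    args.push("--mcp-config", JSON.stringify({ mcpServers: servers }));
  }

  // The prompt is always the final positional argument; the base class
  // is expected to own the stdin threshold.
  args.push(params.prompt);
  return args;
}
```

If CLIRuntimeBase spawns the subprocess with an argument array rather than through a shell (an assumption worth checking), the JSON string needs no additional quoting.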
extractEvent(line: string): AgentEvent | null
Parse a single NDJSON line from Claude's stream-json output into an AgentEvent.
Claude stream-json format (headless docs, SDK streaming docs):
Each NDJSON line is one of:
- A stream_event envelope wrapping a standard Claude API streaming event
- A final line (result) emitted after all streaming events
stream_event envelope structure:
{
  "type": "stream_event",
  "uuid": "<UUID>",
  "session_id": "<session-id>",
  "parent_tool_use_id": "<tool-use-id> | null",
  "event": { /* Standard Claude API RawMessageStreamEvent */ }
}
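The two line kinds can be modelled as a small discriminated union. The sketch below is one possible way to classify a raw line before mapping it; `classifyLine`, `StreamEventLine`, and `ResultLine` are hypothetical names, and the `result` shape is left open because its fields still need empirical verification.

```typescript
interface StreamEventLine {
  type: "stream_event";
  uuid: string;
  session_id: string;
  parent_tool_use_id: string | null;
  event: { type: string; [key: string]: unknown };
}

// Fields beyond `type` are deliberately untyped until real output is captured.
interface ResultLine {
  type: "result";
  [key: string]: unknown;
}

type ClaudeLine = StreamEventLine | ResultLine;

function classifyLine(line: string): ClaudeLine | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(line);
  } catch {
    // Defensive only: the spec says CLIRuntimeBase skips malformed JSON.
    return null;
  }
  if (typeof parsed !== "object" || parsed === null) return null;

  const obj = parsed as Record<string, unknown>;
  if (obj["type"] === "result") {
    return obj as unknown as ResultLine;
  }
  const inner = obj["event"];
  if (obj["type"] === "stream_event" && typeof inner === "object" && inner !== null) {
    return obj as unknown as StreamEventLine;
  }
  return null; // unknown line type: skip
}
```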
Event mapping (stream_event lines — where line.type === "stream_event"):
| Inner event.type | Condition | Maps To | Notes |
|---|---|---|---|
| message_start | - | Skip | Extract session_id from envelope |
| content_block_start | content_block.type === "text" | Skip | Text content arrives via deltas |
| content_block_start | content_block.type === "tool_use" | Buffer | Store name, id from content_block; init input accumulator |
| content_block_delta | delta.type === "text_delta" | AgentTextEvent | { type: "text", text: delta.text } |
| content_block_delta | delta.type === "input_json_delta" | Accumulate | Append delta.partial_json to buffered tool input |
| content_block_delta | delta.type === "thinking_delta" | Skip | Extended thinking content, not user-facing |
| content_block_stop | Tool buffered | AgentToolUseEvent | Emit with JSON.parse(accumulated_input); clear buffer |
| content_block_stop | No tool buffered | Skip | End of text block |
| message_delta | - | Skip | Extract delta.stop_reason and usage for result metadata |
| message_stop | - | Skip | - |
| ping | - | Skip | Keepalive |
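The mapping table can be read as a small state machine. The sketch below shows one way to express it; `StreamEventMapper` is a hypothetical helper, and the `tool_use` event fields (`toolName`, `toolId`, `input`) are assumed names, since the real `AgentToolUseEvent` shape is defined in src/middleware/types.ts.

```typescript
type AgentEvent =
  | { type: "text"; text: string }
  | { type: "tool_use"; toolName: string; toolId: string; input: unknown }; // assumed shape

interface InnerEvent {
  type: string;
  content_block?: { type: string; id?: string; name?: string };
  delta?: { type?: string; text?: string; partial_json?: string; stop_reason?: string | null };
}

class StreamEventMapper {
  sessionId: string | undefined;
  accumulatedText = "";
  lastStopReason: string | undefined;
  private toolBuffer: { name: string; id: string; input: string } | null = null;

  handle(envelope: { session_id?: string; event: InnerEvent }): AgentEvent | null {
    // Capture the session ID from the first envelope that carries one.
    if (this.sessionId === undefined && envelope.session_id) {
      this.sessionId = envelope.session_id;
    }
    const ev = envelope.event;
    switch (ev.type) {
      case "content_block_start":
        if (ev.content_block?.type === "tool_use") {
          this.toolBuffer = { name: ev.content_block.name ?? "", id: ev.content_block.id ?? "", input: "" };
        }
        return null;
      case "content_block_delta":
        if (ev.delta?.type === "text_delta" && typeof ev.delta.text === "string") {
          this.accumulatedText += ev.delta.text;
          return { type: "text", text: ev.delta.text };
        }
        if (ev.delta?.type === "input_json_delta" && this.toolBuffer) {
          this.toolBuffer.input += ev.delta.partial_json ?? "";
        }
        return null; // thinking_delta and anything else is skipped
      case "content_block_stop": {
        const tool = this.toolBuffer;
        if (!tool) return null; // end of a text block
        this.toolBuffer = null;
        // Defensive guard: treat an empty accumulator as an empty object.
        const input: unknown = JSON.parse(tool.input === "" ? "{}" : tool.input);
        return { type: "tool_use", toolName: tool.name, toolId: tool.id, input };
      }
      case "message_delta": {
        const stop = ev.delta?.stop_reason;
        if (stop) this.lastStopReason = stop;
        return null;
      }
      default:
        return null; // message_start, message_stop, ping, unknown types
    }
  }
}
```

A fresh mapper per execute() call would be one simple way to satisfy the per-call reset of this state.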
Event mapping (final result line — where line.type === "result"):
The result line is emitted after all stream_event lines. It contains cost, usage, session ID, and run metadata. Map to AgentDoneEvent with populated AgentRunResult.
Note on tool results: The Claude CLI handles tool execution internally. AgentToolResultEvent mapping requires empirical verification — the CLI may or may not emit observable events for tool results. If it does, they likely appear as part of the interleaved conversation flow (the next response's stream_event lines will have parent_tool_use_id set). Initial implementation may omit AgentToolResultEvent and add it when the exact format is confirmed.
Stateful fields (instance-level, reset per execute() call):
currentSessionId: string | undefined — from first stream_event.session_id envelope
accumulatedText: string — concatenated text deltas for AgentRunResult.text
toolBuffer: { name: string; id: string; input: string } | null — in-progress tool_use block
lastUsage: AgentUsage | undefined — from message_delta event's usage field
lastStopReason: string | undefined — from message_delta event's delta.stop_reason
Session ID tracking: Every stream_event envelope contains a session_id field. Capture it from the first event. Also available on the final result line. Store as instance state for inclusion in AgentRunResult.
Usage extraction (from message_delta inner event and/or result line):
// message_delta event contains usage in its top-level usage field:
// { output_tokens: number }
// The result line contains cumulative usage:
{
input_tokens: number;
output_tokens: number;
cache_read_input_tokens?: number;
cache_creation_input_tokens?: number;
}
Map to AgentUsage:
inputTokens ← input_tokens
outputTokens ← output_tokens
cacheReadTokens ← cache_read_input_tokens
cacheWriteTokens ← cache_creation_input_tokens
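The field mapping above can be written as a small pure function. This is a sketch only: the `AgentUsage` shape (in particular which fields are optional) and the zero defaults for missing token counts are assumptions, not taken from src/middleware/types.ts.

```typescript
// Assumed target shape; the real AgentUsage may differ.
interface AgentUsage {
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens?: number;
  cacheWriteTokens?: number;
}

// Raw usage as it appears in Claude output (all fields treated as optional,
// since message_delta carries only output_tokens).
interface ClaudeUsage {
  input_tokens?: number;
  output_tokens?: number;
  cache_read_input_tokens?: number;
  cache_creation_input_tokens?: number;
}

function mapUsage(raw: ClaudeUsage): AgentUsage {
  const usage: AgentUsage = {
    inputTokens: raw.input_tokens ?? 0, // assumption: default to 0 when absent
    outputTokens: raw.output_tokens ?? 0,
  };
  // Copy cache counters only when the CLI reported them.
  if (raw.cache_read_input_tokens !== undefined) {
    usage.cacheReadTokens = raw.cache_read_input_tokens;
  }
  if (raw.cache_creation_input_tokens !== undefined) {
    usage.cacheWriteTokens = raw.cache_creation_input_tokens;
  }
  return usage;
}
```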
Result metadata mapping (from result line → AgentRunResult):
text ← accumulated from all text_delta events
sessionId ← from stream_event.session_id envelope or result.session_id
usage ← from result line usage fields
totalCostUsd ← result.cost_usd
apiDurationMs ← result.duration_api_ms
numTurns ← result.num_turns
stopReason ← from message_delta delta.stop_reason or result.subtype
Note on field names: The exact field names on the result line (e.g., cost_usd vs costUsd, duration_api_ms vs apiDurationMs) require empirical verification via claude -p --output-format stream-json output. The names above are best-guess based on SDK type definitions; implementation should adapt to actual output.
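Until the field names are confirmed, one option is to look each value up under several candidate keys. The sketch below uses only the candidate names mentioned in this issue; `pickNumber`, `pickString`, `mapResultLine`, and the `RunMetadata` shape are hypothetical, and the candidate lists should be trimmed once real output has been captured.

```typescript
type Json = Record<string, unknown>;

// Return the first candidate key that holds a number, if any.
function pickNumber(obj: Json, ...keys: string[]): number | undefined {
  for (const key of keys) {
    const value = obj[key];
    if (typeof value === "number") return value;
  }
  return undefined;
}

// Return the first candidate key that holds a string, if any.
function pickString(obj: Json, ...keys: string[]): string | undefined {
  for (const key of keys) {
    const value = obj[key];
    if (typeof value === "string") return value;
  }
  return undefined;
}

// Subset of AgentRunResult covered by this sketch (assumed shape).
interface RunMetadata {
  sessionId: string | undefined;
  totalCostUsd: number | undefined;
  apiDurationMs: number | undefined;
  numTurns: number | undefined;
  stopReason: string | undefined;
}

// State collected from earlier stream_event lines.
interface StreamState {
  sessionId?: string;
  stopReason?: string;
}

function mapResultLine(result: Json, state: StreamState): RunMetadata {
  return {
    // Prefer values seen during streaming; fall back to the result line.
    sessionId: state.sessionId ?? pickString(result, "session_id"),
    totalCostUsd: pickNumber(result, "cost_usd", "costUsd"),
    apiDurationMs: pickNumber(result, "duration_api_ms", "apiDurationMs"),
    numTurns: pickNumber(result, "num_turns", "numTurns"),
    stopReason: state.stopReason ?? pickString(result, "subtype"),
  };
}
```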
Note: CLIRuntimeBase.execute() currently synthesizes a minimal AgentDoneEvent with empty fields after the subprocess exits. extractEvent() should emit its own AgentDoneEvent from the result line before the stream closes, which will be yielded to consumers. The base class's synthetic done event will follow — consumers should handle receiving the richer one first. Alternatively, refactor CLIRuntimeBase to skip its synthetic done event if a subclass already emitted one (preferred if straightforward).
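For the preferred option, the base class only needs to remember whether a done event has already passed through. A minimal sketch, assuming events carry a `type: "done"` discriminator; `DoneGate` is a hypothetical helper and not part of the current CLIRuntimeBase.

```typescript
// Simplified event union for illustration (assumed shapes).
type AgentEvent =
  | { type: "text"; text: string }
  | { type: "done"; result: { text: string; sessionId?: string } };

class DoneGate {
  private sawDone = false;

  /** Pass every event through here just before yielding it to consumers. */
  observe(event: AgentEvent): AgentEvent {
    if (event.type === "done") this.sawDone = true;
    return event;
  }

  /**
   * Call once after the subprocess exits. Returns the synthetic done event
   * only when the subclass never emitted one; otherwise returns null.
   */
  finalize(synthesize: () => AgentEvent): AgentEvent | null {
    return this.sawDone ? null : synthesize();
  }
}
```

In execute(), the base class would create one gate per run, route each extracted event through observe(), and yield the value from finalize() only when it is not null.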
buildEnv(params: AgentExecuteParams): Record<string, string>
Return environment variable overrides for the Claude subprocess. Currently returns an empty record {}.
Auth credentials (ANTHROPIC_API_KEY, CLAUDE_CODE_OAUTH_TOKEN) are passed through params.env by the caller, not hardcoded in the runtime. The runtime should not assume any particular auth mechanism.
File: src/middleware/runtimes/claude.test.ts
Unit tests verifying:
- Argument construction (6+ test cases):
  - Basic invocation: -p --output-format stream-json --verbose <prompt>
  - Session resume: adds --resume <session-id>
  - MCP config: serializes JSON, adds --mcp-config <json-string>
  - Short prompt: included as positional arg
  - All flags present: session + MCP + prompt combined
  - No session, no MCP: minimal args
- Event extraction (10+ test cases):
  - stream_event with message_start → skip (but session ID captured from envelope)
  - stream_event with content_block_delta / text_delta → AgentTextEvent
  - stream_event with content_block_start / tool_use → buffers tool name+id
  - stream_event with content_block_delta / input_json_delta → accumulates tool input
  - stream_event with content_block_stop (after tool_use) → AgentToolUseEvent with parsed input
  - stream_event with content_block_stop (after text) → skip
  - stream_event with message_delta → extracts stop_reason and usage
  - stream_event with thinking_delta → skip (returns null)
  - result line → AgentDoneEvent with full AgentRunResult (usage, cost, session, etc.)
  - ping → skip (returns null)
  - Unknown event type → skip (returns null)
  - Malformed JSON → handled by CLIRuntimeBase (skip at base level)
- Environment construction (2+ test cases):
  - Returns empty record (no hardcoded env vars)
  - Does not inject auth vars (caller responsibility)
- MCP config serialization (2+ test cases):
  - JSON string is correctly serialized from McpServerConfig entries
  - --mcp-config arg is omitted when no MCP servers configured
Acceptance Criteria
- src/middleware/runtimes/claude.ts exists and exports ClaudeCliRuntime
- Extends CLIRuntimeBase and implements all three abstract methods
- buildArgs() produces correct Claude CLI flags for all parameter combinations
- extractEvent() correctly maps Claude stream_event envelopes and result line to AgentEvent types
- extractEvent() is stateful: buffers tool_use blocks, accumulates text, tracks session ID
- Session ID captured from stream_event envelope and included in the done result
- [...] result event
- [...] --mcp-config
- pnpm build passes
- pnpm test passes
Reference
- The existing upstream CLI runner at src/agents/cli-runner.ts and src/agents/cli-backends.ts shows how Claude is invoked today (using --output-format json, collecting full output). Our implementation uses stream-json for streaming NDJSON events instead, which is the key architectural difference.
- Claude Code headless mode docs: defines stream-json output format and stream_event envelope
- Agent SDK streaming output docs: defines SDKPartialAssistantMessage type (stream_event with RawMessageStreamEvent inner events)
- Known limitation: When maxThinkingTokens is explicitly set (extended thinking mode), StreamEvent messages are not emitted; the SDK returns only the final AssistantMessage and ResultMessage.
- Empirical verification needed: Field-level names on the result line and exact tool result event format require capture of actual claude -p --output-format stream-json output during implementation.