RemoteClaw's middleware architecture uses CLI subprocesses to interact with AI agents. The abstract base class CLIRuntimeBase (in src/middleware/cli-runtime-base.ts) handles subprocess spawning, NDJSON parsing, watchdog timers, abort signal propagation, and stdin prompt delivery. Concrete runtimes extend it and implement three abstract methods.
This issue implements the Gemini CLI runtime — the second concrete runtime, targeting Google's gemini CLI from google-gemini/gemini-cli.
CLIRuntimeBase requires subclasses to implement:

    /** Construct CLI-specific command-line arguments. */
    protected abstract buildArgs(params: AgentExecuteParams): string[];

    /** Parse a single NDJSON line into an AgentEvent (or null to skip). */
    protected abstract extractEvent(line: string): AgentEvent | null;

    /** Construct provider-specific environment variables. */
    protected abstract buildEnv(params: AgentExecuteParams): Record<string, string>;

Additionally, subclasses may override:
get supportsStdinPrompt(): boolean — whether the CLI accepts prompts via stdin (default: true)
execute() — to wrap the base execution with per-call setup/teardown
Dependencies
src/middleware/types.ts — AgentRuntime, AgentExecuteParams, AgentEvent, AgentRunResult, etc.
src/middleware/cli-runtime-base.ts — CLIRuntimeBase abstract class
get supportsStdinPrompt(): boolean
Override to return false. The Gemini CLI does not support stdin prompt delivery, so prompts must be passed via the -p flag.
buildArgs(params: AgentExecuteParams): string[]
Build the Gemini CLI argument list:
| Flag | Value | When |
|---|---|---|
| --output-format | stream-json | Always: NDJSON streaming output |
| -p | params.prompt | Always: prompt delivery via flag |
| -r | params.sessionId | When params.sessionId is provided |
Note on prompt delivery: Unlike Claude (positional arg), Gemini requires the -p flag for prompt delivery. Since supportsStdinPrompt is false, CLIRuntimeBase will not attempt stdin delivery regardless of prompt length. The -p flag should always be included.
Note on missing flags: The --verbose flag (used by Claude) is not applicable to Gemini. The stream-json output format already includes all available metadata in the result event.
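As an illustration of the flag rules above, here is a minimal sketch of the argument construction. The `AgentExecuteParams` interface shown is a hypothetical two-field subset (the real type lives in src/middleware/types.ts), and `buildGeminiArgs` is a free function used only for illustration; in the runtime this logic would sit in the protected `buildArgs()` method.

```typescript
// Hypothetical subset of AgentExecuteParams; the real type lives in src/middleware/types.ts.
interface AgentExecuteParams {
  prompt: string;
  sessionId?: string;
}

// Builds the Gemini CLI argument list described in the flag table.
function buildGeminiArgs(params: AgentExecuteParams): string[] {
  // Always request NDJSON streaming output and pass the prompt via -p.
  const args = ["--output-format", "stream-json", "-p", params.prompt];
  // Resume an existing session only when a session ID was supplied.
  if (params.sessionId) {
    args.push("-r", params.sessionId);
  }
  return args;
}

// Example: yields ["--output-format", "stream-json", "-p", "hello", "-r", "abc"]
const exampleArgs = buildGeminiArgs({ prompt: "hello", sessionId: "abc" });
```

Passing the prompt as its own array element (rather than building a shell string) avoids quoting problems when the subprocess is spawned without a shell.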
execute() override — MCP config file lifecycle
The Gemini CLI reads MCP server configuration from a settings file (.gemini/settings.json in the working directory or ~/.gemini/settings.json globally). There is no --mcp-config CLI flag.
When params.mcpServers has entries, the runtime must manage a project-local settings file using a merge-restore pattern: merge the MCP servers into .gemini/settings.json before the run, then restore the previous state afterwards.
Implementation note: Check during implementation whether the Gemini CLI supports a --settings-dir or --config flag. If it does, a cleaner approach would be to create a temp directory with the settings file and point the flag there, avoiding any file collision concerns. The merge-restore pattern is the fallback.
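The merge and restore decisions can be kept as pure functions, separate from the file I/O, which makes them easy to unit test. The sketch below is one possible shape, not the required design: `McpServerConfig`, `mergeMcpSettings`, and `planTeardown` are hypothetical names, and letting runtime-provided servers override same-named existing entries is an assumption the spec does not state.

```typescript
// Hypothetical shape of one MCP server entry in .gemini/settings.json.
interface McpServerConfig {
  command: string;
  args?: string[];
  env?: Record<string, string>;
}

type McpServers = Record<string, McpServerConfig>;

// Returns the JSON text to write to .gemini/settings.json.
// `existing` is the current file content, or null when the file does not exist.
function mergeMcpSettings(existing: string | null, servers: McpServers): string {
  const settings: Record<string, unknown> = existing ? JSON.parse(existing) : {};
  const current = (settings["mcpServers"] ?? {}) as McpServers;
  // Assumption: runtime-provided servers win over same-named existing entries.
  settings["mcpServers"] = { ...current, ...servers };
  return JSON.stringify(settings, null, 2);
}

// Describes what teardown must do so the pre-run state is restored.
type Teardown =
  | { kind: "restore"; content: string } // file existed: write the original back
  | { kind: "delete" }; // file was created by the runtime: remove it

function planTeardown(existing: string | null): Teardown {
  return existing !== null ? { kind: "restore", content: existing } : { kind: "delete" };
}
```

The caller would read the file (if present), write the merged text before spawning the CLI, and apply the planned teardown in a finally block.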
extractEvent(line: string): AgentEvent | null
Parse a single NDJSON line from Gemini's stream-json output into an AgentEvent.
Gemini stream-json format (verified from google-gemini/gemini-cli source: packages/core/src/output/types.ts, stream-json-formatter.ts):
Each NDJSON line is bare JSON (no envelope) with a type discriminator and timestamp base field. There are 6 output event types.
result events: extract stats for usage data; do not emit the event directly.
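The per-line mapping can be sketched as a pure function. This is an illustration under assumptions: the `AgentEvent` union below is a hypothetical stand-in for the types in src/middleware/types.ts, the Gemini field names (`content`, `tool_name`, `tool_id`, `parameters`, `output`, `status`, `severity`) follow this spec's event mapping and still need empirical verification, and silently skipping malformed lines is a choice made here rather than a stated requirement.

```typescript
// Hypothetical AgentEvent union; the real definitions live in src/middleware/types.ts.
type AgentEvent =
  | { type: "text"; text: string }
  | { type: "tool_use"; toolName: string; toolId: string; input: unknown }
  | { type: "tool_result"; toolId: string; output: unknown; isError: boolean }
  | { type: "error"; message: string; code?: string };

function mapGeminiLine(line: string): AgentEvent | null {
  let ev: any;
  try {
    ev = JSON.parse(line);
  } catch {
    return null; // Assumption: malformed lines are skipped rather than thrown.
  }
  if (ev === null || typeof ev !== "object") return null;
  switch (ev.type) {
    case "message":
      // Only assistant deltas become text events; final and user messages are skipped.
      return ev.delta === true && ev.role === "assistant"
        ? { type: "text", text: String(ev.content ?? "") }
        : null;
    case "tool_use":
      return { type: "tool_use", toolName: ev.tool_name, toolId: ev.tool_id, input: ev.parameters };
    case "tool_result":
      return { type: "tool_result", toolId: ev.tool_id, output: ev.output, isError: ev.status === "error" };
    case "error":
      return { type: "error", message: ev.message, code: ev.severity };
    default:
      // init and result feed instance state (session ID, stats) but are not emitted.
      return null;
  }
}
```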
Stateful fields (instance-level, reset per execute() call):
currentSessionId: string | undefined — from init event's session_id field
accumulatedText: string — concatenated text from message deltas for AgentRunResult.text
resultStats: GeminiResultStats | undefined — from result event's stats field
Session ID tracking: The init event contains a session_id field. Capture it into instance state. Include in AgentRunResult via done event enrichment.
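One way to keep the three stateful fields together is a small holder object with an explicit reset. The sketch below is hypothetical: `GeminiRunState` and `observe()` are illustrative names, and in the actual runtime these fields may simply be instance members updated inside `extractEvent()`.

```typescript
// Hypothetical stats shape, mirroring the result.stats fields in this spec.
interface GeminiResultStats {
  total_tokens: number;
  input_tokens: number;
  output_tokens: number;
  cached: number;
  duration_ms: number;
  tool_calls: number;
}

class GeminiRunState {
  currentSessionId: string | undefined = undefined;
  accumulatedText = "";
  resultStats: GeminiResultStats | undefined = undefined;

  // Called at the start of every execute() so state never leaks between runs.
  reset(): void {
    this.currentSessionId = undefined;
    this.accumulatedText = "";
    this.resultStats = undefined;
  }

  // Records state from one already-parsed NDJSON object.
  observe(ev: Record<string, unknown>): void {
    const kind = ev["type"];
    if (kind === "init") {
      const sid = ev["session_id"];
      if (typeof sid === "string") this.currentSessionId = sid;
    } else if (kind === "message" && ev["delta"] === true && ev["role"] === "assistant") {
      // Only assistant deltas are concatenated; the final full message is ignored.
      this.accumulatedText += String(ev["content"] ?? "");
    } else if (kind === "result") {
      const stats = ev["stats"];
      if (stats !== null && typeof stats === "object") {
        this.resultStats = stats as GeminiResultStats;
      }
    }
  }
}
```

Ignoring the non-delta message avoids doubling the text when the CLI also emits the final assembled message.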
Usage extraction (from result event's stats field):
    // result.stats structure:
    {
      total_tokens: number;
      input_tokens: number;
      output_tokens: number;
      cached: number;       // cache read tokens
      duration_ms: number;  // API duration
      tool_calls: number;   // number of tool invocations (analogous to "turns")
    }
Map to AgentUsage:
inputTokens ← stats.input_tokens
outputTokens ← stats.output_tokens
cacheReadTokens ← stats.cached (when > 0)
Additional result metadata:
apiDurationMs ← stats.duration_ms
numTurns ← stats.tool_calls
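The field mapping above is small enough to express as a single function. In this sketch `AgentUsage` is a hypothetical three-field subset of the real type, and `mapStats` is an illustrative helper name.

```typescript
// Mirrors the result.stats structure given in this spec.
interface GeminiResultStats {
  total_tokens: number;
  input_tokens: number;
  output_tokens: number;
  cached: number;
  duration_ms: number;
  tool_calls: number;
}

// Hypothetical subset of AgentUsage; the real type lives in src/middleware/types.ts.
interface AgentUsage {
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens?: number;
}

function mapStats(stats: GeminiResultStats): { usage: AgentUsage; apiDurationMs: number; numTurns: number } {
  const usage: AgentUsage = {
    inputTokens: stats.input_tokens,
    outputTokens: stats.output_tokens,
  };
  // cacheReadTokens is only set when the CLI reported cached tokens (> 0).
  if (stats.cached > 0) usage.cacheReadTokens = stats.cached;
  return { usage, apiDurationMs: stats.duration_ms, numTurns: stats.tool_calls };
}
```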
Note on tool_result events: The Gemini CLI source defines a tool_result output type, but channel adapter behavior with this event type should be tested during integration. The formatter maps internal events to output types; actual emission of tool_result may vary depending on tool execution patterns.
buildEnv(params: AgentExecuteParams): Record<string, string>
Return environment variable overrides for the Gemini subprocess. Currently returns an empty record {}.
Auth credentials (GEMINI_API_KEY) are passed through params.env by the caller, not hardcoded in the runtime. The runtime should not assume any particular auth mechanism.
Done event enrichment
Same pattern as ClaudeCliRuntime: override execute() to intercept the done event from CLIRuntimeBase and enrich AgentRunResult with accumulated state.
Result metadata mapping (from accumulated state → AgentRunResult):
text ← accumulated from all message delta events
sessionId ← from init event's session_id
usage ← from result.stats (see usage extraction above)
apiDurationMs ← from result.stats.duration_ms
numTurns ← from result.stats.tool_calls
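The enrichment step can be sketched as a function applied to each event that the base class yields. Everything about the event shapes here is an assumption: the `{ type: "done", result }` envelope, the `AgentRunResult` fields, and the `AccumulatedState` holder are hypothetical stand-ins, and the real `execute()` override would apply the same idea inside whatever iteration mechanism `CLIRuntimeBase` exposes.

```typescript
// Hypothetical subsets of the real middleware types.
interface AgentUsage { inputTokens: number; outputTokens: number; cacheReadTokens?: number; }
interface AgentRunResult {
  text: string;
  sessionId?: string;
  usage?: AgentUsage;
  apiDurationMs?: number;
  numTurns?: number;
}
type RunEvent = { type: "done"; result: AgentRunResult } | { type: "text"; text: string };

// State collected by extractEvent() during the run (illustrative shape).
interface AccumulatedState {
  text: string;
  sessionId?: string;
  stats?: { input_tokens: number; output_tokens: number; cached: number; duration_ms: number; tool_calls: number };
}

function enrichDoneEvent(event: RunEvent, state: AccumulatedState): RunEvent {
  if (event.type !== "done") return event; // pass other events through untouched
  const result: AgentRunResult = { ...event.result, text: state.text };
  if (state.sessionId !== undefined) result.sessionId = state.sessionId;
  if (state.stats) {
    const usage: AgentUsage = {
      inputTokens: state.stats.input_tokens,
      outputTokens: state.stats.output_tokens,
    };
    if (state.stats.cached > 0) usage.cacheReadTokens = state.stats.cached;
    result.usage = usage;
    result.apiDurationMs = state.stats.duration_ms;
    result.numTurns = state.stats.tool_calls;
  }
  return { type: "done", result };
}
```

Returning a new event object instead of mutating the one from the base class keeps the base result available for debugging.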
File: src/middleware/runtimes/gemini.test.ts
Unit tests following the same pattern as claude.test.ts (testable subclass exposing protected methods).
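The testable-subclass pattern mentioned above can be illustrated as follows. This is a sketch, not the contents of claude.test.ts: `RuntimeBaseStandIn` is a hypothetical stand-in for `CLIRuntimeBase` reduced to one abstract method, and `GeminiRuntimeSketch` and `TestableGeminiRuntime` are illustrative names.

```typescript
interface SketchParams {
  prompt: string;
  sessionId?: string;
}

// Stand-in for CLIRuntimeBase, reduced to the one protected method used here (assumption).
abstract class RuntimeBaseStandIn {
  protected abstract buildArgs(params: SketchParams): string[];
}

class GeminiRuntimeSketch extends RuntimeBaseStandIn {
  protected buildArgs(params: SketchParams): string[] {
    const args = ["--output-format", "stream-json", "-p", params.prompt];
    if (params.sessionId) args.push("-r", params.sessionId);
    return args;
  }
}

// Test-only subclass that re-exposes the protected method through a public wrapper.
class TestableGeminiRuntime extends GeminiRuntimeSketch {
  public testBuildArgs(params: SketchParams): string[] {
    return this.buildArgs(params);
  }
}

// A test can now call the protected logic without spawning a subprocess.
const testableRuntime = new TestableGeminiRuntime();
const testArgs = testableRuntime.testBuildArgs({ prompt: "hi" });
```

The wrapper keeps the production class's visibility unchanged while giving tests direct access to argument and event logic.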
Internal GeminiEventType has 18 values but the formatter maps/aggregates to the 6 output types — thought, citation, retry, loop_detected etc. are NOT emitted as separate output events.
Empirical verification needed: Exact field names on message events (e.g., content vs text) and tool_result emission patterns require capture of actual gemini --output-format stream-json output during implementation.
Additional dependency: src/middleware/runtimes/claude.ts, the reference implementation (same pattern). All dependencies exist on main.
Specification
File: src/middleware/runtimes/gemini.ts
Create GeminiCliRuntime extending CLIRuntimeBase.
MCP config file management (internal helper class or methods):
Setup:
Check whether .gemini/settings.json exists in params.workingDirectory.
If it exists: read it, merge the mcpServers key into it, write back.
If it does not exist: create the .gemini/ directory if needed, write { "mcpServers": {...} }.
Teardown (in finally block, always runs):
If the settings file existed before the run: restore its original content.
If the settings file was created by the runtime: delete it.
If the .gemini/ directory was created (was empty): rmdir it.
Gemini settings.json MCP format:

    {
      "mcpServers": {
        "<server-name>": {
          "command": "<command>",
          "args": ["<arg1>", "<arg2>"],
          "env": { "<KEY>": "<VALUE>" }
        }
      }
    }

Event mapping:
| Gemini type | Condition | AgentEvent | Mapping |
|---|---|---|---|
| init | any | (skip) | Capture session_id as currentSessionId |
| message | delta === true AND role === "assistant" | AgentTextEvent | { type: "text", text: content } |
| message | delta === false OR role !== "assistant" | (skip) | None |
| tool_use | any | AgentToolUseEvent | { type: "tool_use", toolName: tool_name, toolId: tool_id, input: parameters } |
| tool_result | any | AgentToolResultEvent | { type: "tool_result", toolId: tool_id, output: output, isError: status === "error" } |
| error | any | AgentErrorEvent | { type: "error", message: message, code: severity } |
| result | any | (skip) | Extract stats for usage data; do not emit directly |
Test cases (src/middleware/runtimes/gemini.test.ts)
Argument construction (5+ test cases):
Base arguments: --output-format stream-json -p <prompt>
With a session ID: adds -r <session-id>
Without a session ID: no -r flag
Prompt is passed via the -p flag (not positional)
Event extraction (10+ test cases):
init → skip (but session ID captured)
message with delta: true, role: "assistant" → AgentTextEvent
message with delta: false → skip (final message)
message with role: "user" → skip
tool_use → AgentToolUseEvent with toolName, toolId, input
tool_result with status: "success" → AgentToolResultEvent with isError: false
tool_result with status: "error" → AgentToolResultEvent with isError: true
error → AgentErrorEvent with message and code from severity
result → stores stats, returns null
… null)
Environment construction (2+ test cases):
MCP config file management (4+ test cases):
Settings file is written when mcpServers has entries (mock filesystem or check args)
No settings file is written when mcpServers is empty/undefined
Written file uses the expected format ({ "mcpServers": {...} })
supportsStdinPrompt (1 test case):
Returns false
Done event enrichment (3+ test cases):
Maps duration_ms to apiDurationMs, tool_calls to numTurns
Acceptance Criteria
src/middleware/runtimes/gemini.ts exists and exports GeminiCliRuntime
Extends CLIRuntimeBase and implements all three abstract methods
supportsStdinPrompt returns false
buildArgs() produces correct Gemini CLI flags (--output-format stream-json -p <prompt>)
buildArgs() adds -r <session-id> when sessionId is provided
extractEvent() correctly maps all 6 Gemini event types to AgentEvent types
extractEvent() is stateful: accumulates text, tracks session ID from init, stores result stats
Session ID is captured from the init event and included in the done result
Usage is extracted from the result.stats field
MCP config is written to .gemini/settings.json in the working directory when mcpServers has entries
Existing .gemini/settings.json is preserved (merge-restore pattern)
pnpm build passes
pnpm test passes
Reference
src/middleware/runtimes/claude.ts: reference implementation following the same pattern
src/middleware/runtimes/claude.test.ts: reference test file with testable subclass pattern
google-gemini/gemini-cli source:
packages/core/src/output/types.ts: output event type definitions
packages/core/src/output/stream-json-formatter.ts: NDJSON formatter mapping internal events to output types
session_id added in PR feat(agents): detect and recover from consecutive empty model responses (0 output tokens) openclaw/openclaw#14504 (Dec 2025). Token cached/input breakdown added in PR fix: session file locks not released after write (#15000) openclaw/openclaw#15021.