Tool execution framework hard 90s timeout ignores _timeout_seconds parameter #1356

@petabridge-netclaw

Problem

Tool calls are cut off at 90s whenever the LLM does not explicitly include _timeout_seconds in the tool call arguments. In that case the 90s default from SessionConfig.ToolExecutionTimeout acts as a hard cap, even though the rest of the timeout plumbing is in place:

  1. The _timeout_seconds parameter is documented in every tool's schema
  2. ToolCallMeta.ExtractFrom() correctly extracts the value
  3. ToolCallMetaExtractor.ComputeEffectiveTimeout() correctly clamps it
  4. ShellTool.ExecuteAsync() correctly reads context.RequestedTimeoutSeconds

When _timeout_seconds IS passed: The pipeline overrides the timeout correctly (verified via code path tracing).

When _timeout_seconds is NOT passed: The 90s default from SessionConfig.ToolExecutionTimeout kicks in, but the LLM has no mechanism to inject this parameter into tool calls automatically. The LLM only includes _timeout_seconds when it explicitly decides to.
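
To make the two cases concrete, here is a minimal sketch of pulling the hint out of a tool call's JSON arguments. It is not the real ToolCallMeta.ExtractFrom(); the ExtractTimeoutHint helper and the "command" argument name are assumptions made for illustration, and only the _timeout_seconds key comes from this issue.

```csharp
using System;
using System.Text.Json;

public static class TimeoutHintDemo
{
    // Hypothetical stand-in for the extraction step described in this issue.
    // Returns the hint in seconds, or null when the LLM did not include it.
    public static int? ExtractTimeoutHint(string argumentsJson)
    {
        using JsonDocument doc = JsonDocument.Parse(argumentsJson);
        JsonElement root = doc.RootElement;

        if (root.ValueKind == JsonValueKind.Object
            && root.TryGetProperty("_timeout_seconds", out JsonElement value)
            && value.ValueKind == JsonValueKind.Number
            && value.TryGetInt32(out int seconds))
        {
            return seconds;
        }

        // No usable hint: this is the case where the session default applies.
        return null;
    }
}
```

Called with {"command":"dotnet build","_timeout_seconds":600} the helper returns 600; called with {"command":"dotnet build"} it returns null, which corresponds to the "NOT passed" case above.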

Root Cause

Two issues compound:

1. Session config default is too conservative

SessionConfig.cs line 58:

public TimeSpan ToolExecutionTimeout { get; init; } = TimeSpan.FromSeconds(90);

This is described as a "Per-tool-call inactivity budget" but 90s is too short for legitimate shell operations (builds, migrations, long-running commands). The comment says it's a "watchdog" for tool liveness, but 90s catches legitimate work that should be allowed to complete.
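
Since the property is init-only with a default, a host that constructs the config itself could presumably raise the budget at construction time. The sketch below uses a stand-in record, SessionConfigSketch, that copies only the quoted property. How the real SessionConfig is created or bound from configuration is not shown in this issue, so treat this as an assumption.

```csharp
using System;

// Stand-in for illustration; copies only the property quoted from SessionConfig.cs.
public sealed record SessionConfigSketch
{
    public TimeSpan ToolExecutionTimeout { get; init; } = TimeSpan.FromSeconds(90);
}

public static class SessionConfigExample
{
    // Default budget: 90 seconds, as quoted above.
    public static SessionConfigSketch Default() => new SessionConfigSketch();

    // A more generous budget, set through the init accessor at construction time.
    public static SessionConfigSketch Generous() =>
        new SessionConfigSketch { ToolExecutionTimeout = TimeSpan.FromSeconds(300) };
}
```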

2. Pipeline doesn't respect tool-configured defaults

The pipeline passes timeout (the session config's 90s) as the floor for ComputeEffectiveTimeout(), but tools like ShellTool have their own ShellTimeoutSeconds default (60s). When the LLM doesn't pass _timeout_seconds, the pipeline uses the session config's 90s as both floor AND effective timeout — ignoring the tool's own configured default.

Evidence

Traced the full chain in netclaw-dev/netclaw:

LLM tool call (no _timeout_seconds)
  → ToolCallMeta.ExtractFrom() → meta = null
  → ComputeEffectiveTimeout() SKIPPED (meta is null)
  → context.RequestedTimeoutSeconds = 90 (from SessionConfig)
  → ShellTool reads 90, so child process timeout = max(60, 90) = 90
  → Framework kills at 90s regardless of what the LLM "needs"
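
The trace above can be modelled in a few lines. This is a simplified sketch of the no-hint path only, using the 90s and 60s values quoted in this issue; the NoHintTrace class and its members are hypothetical names, not code from the repository.

```csharp
using System;

public static class NoHintTrace
{
    // Values quoted in this issue.
    public static readonly TimeSpan SessionToolExecutionTimeout = TimeSpan.FromSeconds(90);
    public const int ShellTimeoutSeconds = 60;

    // Models the no-hint path: the context carries the session default,
    // and the shell tool takes the larger of that and its own default.
    public static int ChildProcessTimeoutSeconds()
    {
        int requestedTimeoutSeconds = (int)SessionToolExecutionTimeout.TotalSeconds; // 90
        return Math.Max(ShellTimeoutSeconds, requestedTimeoutSeconds);               // max(60, 90)
    }

    // True when a command of the given length would outlive its budget.
    public static bool WouldBeKilled(TimeSpan commandDuration) =>
        commandDuration.TotalSeconds > ChildProcessTimeoutSeconds();
}
```

Under this model a five-minute build is killed (WouldBeKilled returns true), while a 30-second command completes.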

Impact

  • Long-running shell commands (builds, migrations, backups) always get killed at 90s
  • No way to reliably use _timeout_seconds because the LLM doesn't know it's needed
  • The parameter exists in the schema but has no effect unless the LLM explicitly includes it

Proposed Fix

In SessionToolExecutionPipeline.ExecuteSingleToolAsync, when meta?.TimeoutHintSeconds is null:

  • Use Math.Max(shellTimeoutSeconds, (int)timeout.TotalSeconds) as the floor
  • This ensures tools like ShellTool (60s default) aren't silently capped at 90s by the session config
  • Consider increasing the session config default from 90s to a more generous value (e.g., 300s)
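
A minimal sketch of the proposed fallback, assuming it only runs when no _timeout_seconds hint was supplied. ProposedFallback and FallbackTimeout are hypothetical names, and the real signature of ExecuteSingleToolAsync is not shown in this issue.

```csharp
using System;

public static class ProposedFallback
{
    // Fallback used only when the tool call carried no timeout hint.
    // Takes the larger of the tool's own default and the session default.
    public static TimeSpan FallbackTimeout(int toolDefaultSeconds, TimeSpan sessionTimeout)
    {
        int floorSeconds = Math.Max(toolDefaultSeconds, (int)sessionTimeout.TotalSeconds);
        return TimeSpan.FromSeconds(floorSeconds);
    }
}
```

FallbackTimeout(60, TimeSpan.FromSeconds(90)) yields 90s, FallbackTimeout(60, TimeSpan.FromSeconds(300)) yields 300s, and FallbackTimeout(600, TimeSpan.FromSeconds(90)) yields 600s. With the defaults quoted in this issue, the floor alone therefore leaves the limit at 90s. The outcome changes only once the session default is raised or a tool's default exceeds 90s.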

Fix Applied

A fix was applied in SessionToolExecutionPipeline.cs that ensures when _timeout_seconds is not passed, the pipeline falls back to the larger of shellTimeoutSeconds (60s) or the session config default — never below the tool's own configured floor.
