Feature Request: Add Hook for Tool Execution Failures (OnToolError / ToolUseFailure) #4831

@coygeek

Description

Labels: feature-request, hooks, enhancement

Is your feature request related to a problem? Please describe.

The current hooks system in Claude Code provides powerful PreToolUse and PostToolUse events. These are excellent for validating inputs before a tool runs and for post-processing after a tool succeeds.

However, there is a significant gap in the tool lifecycle: there is no dedicated hook that triggers specifically when a tool call fails. According to the Hooks Reference documentation, PostToolUse only runs "immediately after a tool completes successfully." This prevents developers from implementing deterministic, programmatic error handling, monitoring, and automated recovery logic.

Currently, if a tool like Bash fails (e.g., a test script exits with a non-zero code or an npm install command fails due to a network error), the raw stderr is passed back to Claude. While Claude can often interpret this output, the process is non-deterministic: relying on the model to parse raw error logs is less reliable and less customizable than a dedicated hook for programmatic error handling and remediation.

Describe the solution you'd like

We propose the introduction of a new hook event, tentatively named OnToolError (or ToolUseFailure), to provide a structured mechanism for reacting to tool execution errors. This hook would trigger whenever a tool call fails, enabling robust runtime error handling, custom logging, and automated remediation workflows.

This new hook should align with the existing hook architecture and follow the established patterns for input, output, and decision control.


Proposed OnToolError Hook Specification

  • Trigger: The hook would fire whenever a tool call fails for any reason, such as:

    • A Bash command returns a non-zero exit code.
    • A WebFetch or mcp__ tool encounters a network error or non-2xx HTTP status.
    • A tool execution times out.
    • An unhandled exception occurs during a tool's internal execution.
  • Matcher: It would use the same powerful regex and string matching as PreToolUse and PostToolUse to target specific tools (e.g., Bash, Write, mcp__*, etc.) or all tools using *.

  • Hook Input (stdin): The hook script would receive a JSON payload via stdin containing detailed context about the failure. This payload is designed to be extensible for various error types.

    {
      "session_id": "abc123",
      "transcript_path": "/path/to/transcript.jsonl",
      "cwd": "/path/to/project",
      "hook_event_name": "OnToolError",
      "tool_name": "Bash",
      "tool_input": { "command": "npm run test:unit" },
      "error": {
        "type": "ProcessError" | "NetworkError" | "TimeoutError" | "ToolInternalError",
        "message": "Command exited with a non-zero code.",
        "exit_code": 1,
        "http_status_code": null,
        "stdout": "...",
        "stderr": "Error: 1 test failed.",
        "details": {}
      }
    }
  • Hook Output (Decision Control): The hook would use the established exit code and advanced JSON output mechanisms to control the agent's execution flow.

    • Exit Code 0 (Log & Continue): The hook's stdout would be logged for debugging in transcript mode, but the agent would continue its normal error handling process (i.e., passing the raw error back to Claude). This is useful for passive logging, monitoring, or external reporting without interfering with the agent's flow.
    • Exit Code 2 (Block & Inform Claude): This would block the default error flow. The hook's stderr would be fed back to Claude as a curated, intelligent error message. This allows the hook to parse a complex error and provide a concise, actionable summary for the model to attempt a fix.
    • Advanced JSON Output: For more granular control, the hook could return a JSON object on stdout.
      {
        "decision": "block",
        "reason": "The unit tests failed because the 'react-dom' dependency is missing. You should try running `npm install react-dom` to fix this."
      }
      When decision is "block", the default error flow is prevented. The reason string is then fed to Claude as a curated, high-level instruction, which is more reliable than asking the model to parse raw stderr. This allows the hook to function as an intelligent error pre-processor for the agent.
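As a sketch of how registration could look, the new event would slot into the same `hooks` configuration shape used by `PreToolUse` and `PostToolUse` in `.claude/settings.json`. The `OnToolError` key and the script path here are hypothetical, pending the feature's actual design:

```json
{
  "hooks": {
    "OnToolError": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "python3 .claude/hooks/on_tool_error.py"
          }
        ]
      }
    ]
  }
}
```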

Describe alternatives you've considered

  1. Using PostToolUse: This is not a viable alternative, as the documentation explicitly states it only runs on successful tool completion. It cannot be used to catch failures.
  2. Using PreToolUse: This is also not viable, as it's impossible to predict most runtime failures (e.g., network errors, timeouts, command-specific errors) before the tool is executed.
  3. Relying on Claude to parse raw stderr: This is the current behavior. While often effective, it can be unreliable for complex errors or when specific, structured information needs to be extracted from failure logs. An OnToolError hook allows for deterministic parsing and feedback generation, significantly improving the reliability and resilience of agentic loops.

Use Cases & Additional Context

This feature would unlock several powerful, enterprise-grade workflows:

  • Automated Test-Fix Loops: An OnToolError hook on Bash(npm test) could parse test failure logs, identify the exact failing test and error, and feed a concise summary back to Claude with a directive to fix that specific issue.
  • Intelligent Error Remediation: A hook could analyze a complex stack trace, identify the root cause (e.g., "missing dependency" vs. "syntax error" vs. "OOM error"), and provide a much more direct and actionable reason for Claude to act upon.
  • Custom Retry Logic: If an npm install command fails due to a known corporate proxy issue, a hook could inspect stderr and instruct Claude to retry after setting the appropriate proxy environment variables.
  • Automated Failure Reporting & Alerting: If a critical tool (like a deployment script) fails, a hook could capture the full error context and automatically create a detailed ticket in Jira/Linear or send a structured notification to a Slack channel or a monitoring service like Datadog.
  • Graceful Fallbacks for MCP Tools: If an mcp__ tool fails due to a transient network error, an OnToolError hook could implement retry-with-backoff logic or suggest an alternative tool to Claude.
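To make the retry use case concrete, here is a minimal sketch of the decision logic such a hook might apply to transient MCP failures. It assumes the proposed payload shape above; the `details.attempt` field and the backoff policy are illustrative inventions, not part of any existing API:

```python
# Hypothetical helper for an OnToolError hook: decide whether a failed
# mcp__ tool call looks transient and worth retrying.
TRANSIENT_TYPES = {"NetworkError", "TimeoutError"}

def build_decision(payload, max_attempts=3):
    """Return a hook decision dict, or None to fall through to default handling."""
    error = payload.get("error", {})
    tool = payload.get("tool_name", "")
    # Illustrative attempt counter; a real hook might track attempts in a
    # sidecar file keyed by session_id instead.
    attempt = error.get("details", {}).get("attempt", 1)

    if tool.startswith("mcp__") and error.get("type") in TRANSIENT_TYPES:
        if attempt < max_attempts:
            wait = 2 ** attempt  # exponential backoff, in seconds
            return {
                "decision": "block",
                "reason": (
                    f"The {tool} call failed with a transient {error['type']}. "
                    f"Wait about {wait}s and retry "
                    f"(attempt {attempt + 1} of {max_attempts})."
                ),
            }
    return None  # exit-0 path: log and continue

payload = {
    "tool_name": "mcp__github__search",
    "error": {"type": "NetworkError", "details": {"attempt": 1}},
}
decision = build_decision(payload)
```

A hook script would `print(json.dumps(decision))` when a decision is returned and exit 0 otherwise, matching the output contract described above.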

Example OnToolError Hook Script (Python)

This script, registered for Bash tool failures, parses npm test output and provides a targeted suggestion to Claude.

#!/usr/bin/env python3
import json
import re
import sys

hook_input = json.load(sys.stdin)
error_data = hook_input.get("error", {})
stderr = error_data.get("stderr", "")

# Check for a specific, recognizable failure: a missing Node module
missing_dependency_match = re.search(r"Cannot find module '(.+?)'", stderr)

if missing_dependency_match:
    dependency = missing_dependency_match.group(1)
    # Create a structured JSON response to guide Claude
    response = {
        "decision": "block",
        "reason": f"The tests failed due to a missing dependency: '{dependency}'. "
                  f"Suggest running `npm install {dependency}` to fix it."
    }
    print(json.dumps(response))
    # Exit 0: the structured stdout output takes precedence over the exit code
    sys.exit(0)

# For any other error, fall back to default behavior (log and continue)
sys.exit(0)

Security Considerations

The OnToolError hook introduces a new vector for data handling. The stdout and stderr from failed commands could contain sensitive information (e.g., stack traces with internal paths, failed API requests with partial keys). Hook scripts that send this data to external services (like logging or ticketing systems) must sanitize the output to prevent accidental data leakage.
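A minimal redaction pass before exporting failure context might look like the following. The patterns are illustrative starting points, not an exhaustive secret scanner:

```python
import re

# Illustrative credential patterns; a production hook should use a dedicated
# secret scanner and treat this list as a baseline, not a guarantee.
REDACTION_PATTERNS = [
    re.compile(r"(?i)(api[_-]?key|token|password|secret)\s*[=:]\s*\S+"),
    re.compile(r"Bearer\s+\S+"),
]

def sanitize(text: str) -> str:
    """Mask anything that looks like a credential before it leaves the machine."""
    for pattern in REDACTION_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text

clean = sanitize("npm ERR! fetch failed: Authorization: Bearer abc123secretvalue")
```

Applying this to `stdout`, `stderr`, and `error.message` before forwarding them to a ticketing or logging service reduces the risk of accidental leakage.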

By following the existing design patterns of the hooks system, the OnToolError hook would be a natural and powerful extension for developers looking to build more robust and resilient automations with Claude Code.
