Skip to content

fix: correct delegation tool trace error detection#13898

Closed
Y0914 wants to merge 1 commit into
NousResearch:mainfrom
Y0914:fix/delegation-tool-trace-error-detection
Closed

fix: correct delegation tool trace error detection#13898
Y0914 wants to merge 1 commit into
NousResearch:mainfrom
Y0914:fix/delegation-tool-trace-error-detection

Conversation

@Y0914

@Y0914 Y0914 commented Apr 22, 2026

Copy link
Copy Markdown

Bug Description

Delegation observability was misclassifying successful tool calls as errors in tool_trace.

Root Cause

tools/delegate_tool.py used a naive string heuristic that marked a tool result as an error whenever the returned content contained the word error near the start. That incorrectly flagged successful terminal JSON payloads like {\"error\": null, \"exit_code\": 0, ...}.

Fix

  • parse tool result JSON before classifying status
  • treat only real error states / non-empty errors / failed success flags / non-zero exit codes as errors
  • keep a conservative text fallback for non-JSON tool payloads
  • add regression tests covering error: null success payloads and non-zero exit codes

How to Verify

  1. Run source venv/bin/activate && python -m pytest tests/tools/test_delegate.py -q
  2. Call delegate_task with toolsets=[\"terminal\"] for a read-only command like hostname && pwd
  3. Confirm the returned tool_trace[0].status is ok instead of error

Test Plan

  • Added regression test for this bug
  • Existing tests still pass
  • Manual verification of the fix

Risk Assessment

Low — the change is narrowly scoped to delegation observability/status classification and is covered by targeted regression tests.

@Y0914

Y0914 commented Apr 22, 2026

Copy link
Copy Markdown
Author

Root cause summary:

  • delegate_tool.py was classifying tool-trace status with a naive string check.
  • If the tool payload contained the word error near the front, the trace entry was marked as error.
  • That broke for successful terminal JSON payloads such as {"output":"...","exit_code":0,"error":null}, because the literal key "error" was present even though the call succeeded.

What this PR changes:

  • Parse JSON tool payloads before classifying status.
  • Mark a trace entry as error only for real failure signals:
    • status in {error, failed, blocked, disabled}
    • non-empty error
    • success == false
    • non-zero exit_code
  • Keep a conservative text fallback for non-JSON payloads.

How I verified it:

  1. source venv/bin/activate && python -m pytest tests/tools/test_delegate.py -q
  2. Manual delegation repro with a read-only terminal subagent (hostname && pwd)
  3. Confirmed tool_trace[0].status is now ok after restart instead of the previous false error

@alt-glitch alt-glitch added type/bug Something isn't working P3 Low — cosmetic, nice to have tool/delegate Subagent delegation labels Apr 22, 2026
@Y0914 Y0914 closed this by deleting the head repository Apr 30, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

P3 Low — cosmetic, nice to have tool/delegate Subagent delegation type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants