
fix(delegate): tool_trace false-positive error detection for short outputs #26369

Closed
flooryyyy wants to merge 2 commits into NousResearch:main from flooryyyy:fix-delegate-tool-trace-error-detection


flooryyyy commented May 15, 2026

Contributor

Problem

The delegate_tool tool_trace marked successful tool calls as "status": "error" when the output was short.

Root cause: the error detection heuristic checked if the substring "error" appeared in the first 80 chars of the tool result:

is_error = bool(content and "error" in content[:80].lower())

For short JSON outputs like:

{"output":"test1\n","exit_code":0,"error":null}

The "error" key (from "error":null) lands within the first 80 chars → false positive flagged as error.

This only affected the TUI overlay display and tool_trace diagnostics — the tool itself executed successfully. Long outputs avoided the bug because the "error" key was pushed past the 80-char window.
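
The short-versus-long behaviour described above can be reproduced in isolation. This is a minimal sketch: `old_is_error` is a hypothetical wrapper around the one-line heuristic quoted in this PR, and the two sample payloads are made up for illustration.

```python
import json


def old_is_error(content: str) -> bool:
    # The substring heuristic quoted in the PR description, wrapped in a
    # hypothetical helper so it can be called directly.
    return bool(content and "error" in content[:80].lower())


# Short, successful result: the "error" key sits inside the first 80 chars.
short = json.dumps(
    {"output": "test1\n", "exit_code": 0, "error": None},
    separators=(",", ":"),
)

# Long, successful result: the "error" key is pushed past the 80-char window.
long_result = json.dumps(
    {"output": "x" * 200, "exit_code": 0, "error": None},
    separators=(",", ":"),
)

print(old_is_error(short))        # True  (false positive)
print(old_is_error(long_result))  # False
```

Both payloads describe a successful call, so the differing results come purely from where the `"error"` key falls relative to the 80-character slice.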

Fix

Replace the substring heuristic with a call to _looks_like_error_output() (already defined at line 272 in the same file), which:

  • Parses JSON and checks the error key is truthy (null / false / "" do not trigger)
  • Checks status field for error/failed/timeout
  • Checks first line for classic error markers (error:, traceback, exception:, failed:)
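
For readers without the file open, the three checks listed above might look roughly like the following. This is a sketch under assumptions, not the actual body of `_looks_like_error_output()`: the name `looks_like_error_sketch`, the order of the checks, and the choice to skip the first-line check for JSON objects are all hypothetical.

```python
import json

# Status values and first-line markers taken from the bullet list above.
_ERROR_STATUSES = {"error", "failed", "timeout"}
_FIRST_LINE_MARKERS = ("error:", "traceback", "exception:", "failed:")


def looks_like_error_sketch(content: str) -> bool:
    """Hypothetical stand-in for the structured check described in the PR."""
    if not content:
        return False

    try:
        data = json.loads(content)
    except (ValueError, TypeError):
        data = None

    if isinstance(data, dict):
        # A truthy "error" value counts; null / false / "" do not.
        if data.get("error"):
            return True
        status = data.get("status")
        if isinstance(status, str) and status.lower() in _ERROR_STATUSES:
            return True
        # Assumption: a parsed JSON object with no error signal is a success.
        return False

    # Non-JSON (or non-object) output: look for markers on the first line only.
    stripped = content.strip()
    first_line = stripped.splitlines()[0].lower() if stripped else ""
    return any(marker in first_line for marker in _FIRST_LINE_MARKERS)


print(looks_like_error_sketch('{"output":"test1","exit_code":0,"error":null}'))
print(looks_like_error_sketch('{"output":"","exit_code":127,"error":"command not found"}'))
```

Because the JSON is parsed rather than scanned as text, the mere presence of an `"error"` key no longer matters; only its value does.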

Testing

Verified with the existing _looks_like_error_output function:

  • {"output":"test1","exit_code":0,"error":null} → False (correct)
  • {"output":"","exit_code":127,"error":"command not found"} → True (correct)
  • Empty string → False (correct)

Old heuristic: flagged the first case as True (wrong).

flooryyyy added 2 commits May 15, 2026 13:32
On NixOS, systemd units don't inherit the user's PATH, so system
packages (ffmpeg, curl, kill) are invisible. This breaks STT (ffmpeg
for ogg conversion) and ExecReload (/bin/kill doesn't exist).

- Prepend /run/current-system/sw/{bin,sbin} to PATH in both user and
  system unit templates (only when /etc/NIXOS exists)
- Add ~/.nix-profile/{bin,sbin} to user-local path candidates
- Use /usr/bin/env kill in ExecReload instead of hardcoded /bin/kill

The old heuristic checked if substring 'error' appeared in the first 80
chars of tool output JSON. For short outputs like:
  {"output":"test1","exit_code":0,"error":null}
the '"error"' key lands within 80 chars → flagged as error.

Reuse _looks_like_error_output() which parses JSON properly so
"error": null / false don't trigger, while real errors with
non-null error values or Traceback lines still do.
flooryyyy closed this May 15, 2026

flooryyyy commented May 15, 2026

Contributor Author

Accidentally included the previous PR's commit in this one; remaking.

alt-glitch added labels May 15, 2026: type/bug (Something isn't working), P3 (Low: cosmetic, nice to have), tool/delegate (Subagent delegation), comp/gateway (Gateway runner, session dispatch, delivery)
@alt-glitch

Collaborator

Related to #13898 (closed, not merged) and #16162 (closed as duplicate of #13898) — same _looks_like_error_output() heuristic bug. This PR reuses the existing function rather than re-implementing, which is cleaner. Also related to #5516 (open, broader scope).

@flooryyyy

Contributor Author

Already remade as #26374.

Not sure if I should keep it up, as the other ones are closed and #5516 has been open for over a month.
