fix(delegate): tool_trace false-positive error detection for short outputs #26369
Closed
flooryyyy wants to merge 2 commits into
On NixOS, systemd units don't inherit the user's PATH, so system
packages (ffmpeg, curl, kill) are invisible. This breaks STT (ffmpeg
for ogg conversion) and ExecReload (/bin/kill doesn't exist).
- Prepend /run/current-system/sw/{bin,sbin} to PATH in both user and
system unit templates (only when /etc/NIXOS exists)
- Add ~/.nix-profile/{bin,sbin} to user-local path candidates
- Use /usr/bin/env kill in ExecReload instead of hardcoded /bin/kill
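A minimal sketch of the PATH handling described in this commit message, for illustration only. The helper name `build_unit_path` and its parameters are hypothetical and not taken from the project; only the `/etc/NIXOS` marker and the `/run/current-system/sw/{bin,sbin}` directories come from the text above.

```python
import os

# Directories named in the commit message; prepended only on NixOS.
NIXOS_SYSTEM_DIRS = [
    "/run/current-system/sw/bin",
    "/run/current-system/sw/sbin",
]


def build_unit_path(base_path, is_nixos=None):
    """Return a PATH value for a unit template (hypothetical helper).

    base_path: colon-separated PATH the template would otherwise use.
    is_nixos:  override for tests; by default, detect via /etc/NIXOS.
    """
    if is_nixos is None:
        is_nixos = os.path.exists("/etc/NIXOS")

    entries = base_path.split(":") if base_path else []
    if is_nixos:
        # Put the NixOS system profile first so ffmpeg, curl, etc. resolve.
        entries = NIXOS_SYSTEM_DIRS + entries

    # Drop duplicates while keeping the first occurrence of each entry.
    return ":".join(dict.fromkeys(entries))
```

Passing `is_nixos` explicitly keeps the sketch testable on machines that are not running NixOS.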
The old heuristic checked if substring 'error' appeared in the first 80
chars of tool output JSON. For short outputs like:
{"output":"test1","exit_code":0,"error":null}
the '"error"' key lands within 80 chars → flagged as error.
Reuse _looks_like_error_output() which parses JSON properly so
"error": null / false don't trigger, while real errors with
non-null error values or Traceback lines still do.
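The false positive can be reproduced with a small reconstruction of the old heuristic. The function `old_heuristic` below is written from the description in this commit message, not copied from the project, and it assumes a case-sensitive substring match.

```python
import json


def old_heuristic(result: str) -> bool:
    # Reconstruction (assumption): substring match within the first 80 chars.
    return "error" in result[:80]


short_ok = '{"output":"test1","exit_code":0,"error":null}'
long_ok = '{"output":"' + "x" * 100 + '","exit_code":0,"error":null}'

# The short, successful output is flagged because the "error" key itself
# falls inside the 80-char window, even though its value is null.
short_flagged = old_heuristic(short_ok)
short_error_value = json.loads(short_ok)["error"]

# The long output escapes only because the key is pushed past the window.
long_flagged = old_heuristic(long_ok)
```

Here `short_flagged` is `True` while `short_error_value` is `None`, which is the mismatch the fix targets.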
Accidentally included the previous PR in this one; remaking.
Problem
The delegate_tool tool_trace marked successful tool calls as "status": "error" when the output was short.

Root cause: the error detection heuristic checked whether the substring "error" appeared in the first 80 chars of the tool result. For short JSON outputs like:

{"output":"test1\n","exit_code":0,"error":null}

the "error" key (from "error":null) lands within the first 80 chars, producing a false positive flagged as error.

This only affected the TUI overlay display and tool_trace diagnostics; the tool itself executed successfully. Long outputs avoided the bug because the "error" key was pushed past the 80-char window.

Fix
Replace the substring heuristic with a call to _looks_like_error_output() (already defined at line 272 in the same file), which:
- checks whether the error key is truthy (null/false/"" do not trigger)
- checks the status field for error/failed/timeout
- looks for error markers in the text (error:, traceback, exception:, failed:)

Testing
Verified with the existing _looks_like_error_output function:
- {"output":"test1","exit_code":0,"error":null} → False (correct)
- {"output":"","exit_code":127,"error":"command not found"} → True (correct)
- … → False (correct)

Old heuristic: flagged the first case as True (wrong).
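The checks described in the Fix section can be sketched roughly as follows. This is not the project's `_looks_like_error_output()`; the name `looks_like_error_sketch`, the order of the checks, and the line-prefix matching for non-JSON text are assumptions made for illustration.

```python
import json

ERROR_STATUSES = {"error", "failed", "timeout"}
ERROR_MARKERS = ("error:", "traceback", "exception:", "failed:")


def looks_like_error_sketch(result) -> bool:
    """Illustrative JSON-aware error check (hypothetical, not the real helper)."""
    try:
        data = json.loads(result)
    except (TypeError, ValueError):
        data = None

    if isinstance(data, dict):
        # A truthy "error" value counts; null, false and "" do not.
        if data.get("error"):
            return True
        status = data.get("status")
        return isinstance(status, str) and status.lower() in ERROR_STATUSES

    # Non-JSON output: assume markers are matched at the start of a line.
    text = result.lower() if isinstance(result, str) else ""
    return any(line.strip().startswith(ERROR_MARKERS) for line in text.splitlines())


ok = looks_like_error_sketch('{"output":"test1","exit_code":0,"error":null}')
bad = looks_like_error_sketch(
    '{"output":"","exit_code":127,"error":"command not found"}'
)
```

With this sketch, `ok` is `False` and `bad` is `True`, matching the first two cases listed under Testing.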