Fix #5: Refactor failed tool call detection to use JSON parsing #6

Merged
edenreich merged 2 commits into main from fix/issue-5 on Nov 26, 2025

@github-actions
Contributor

Summary

This PR resolves issue #5 by refactoring the failed tool call detection in the infer workflow to use proper JSON parsing instead of permissive regex patterns.

Problem

The current implementation uses a regex pattern that is too permissive. It matches any line containing "error" (case-insensitively), whether or not that line reports a failed tool call:

grep -i "Tool execution failed\|command not whitelisted\|error" /tmp/agent-output.txt
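
To make the false-positive problem concrete, here is a small sketch. The sample lines are invented for illustration and are not taken from real agent output; only the grep pattern comes from the workflow.

```shell
# Hypothetical sample of agent output: one real failure and two harmless lines.
sample_output=$(cat <<'EOF'
{"success": false, "tool_name": "Bash", "error": "command not whitelisted"}
Reading docs/error-handling.md
All checks passed with 0 errors
EOF
)

# The permissive pattern counts every line that merely mentions "error".
regex_matches=$(printf '%s\n' "$sample_output" | grep -ci 'Tool execution failed\|command not whitelisted\|error')

# Matching the literal "success": false marker only hits the real failure.
strict_matches=$(printf '%s\n' "$sample_output" | grep -c '"success": false')

echo "regex matches: $regex_matches, strict matches: $strict_matches"
```

With this sample, the permissive pattern flags all three lines, while only the first one is an actual failed tool call.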

Solution

Replace the regex-based approach with proper JSON parsing using jq to accurately detect when "success": false occurs in tool call results.

Changes

  • Extract JSON objects containing "success": false from agent output
  • Use jq to parse and format failed tool calls with tool names and error messages
  • Maintain backward compatibility with fallback to regex for non-JSON errors
  • Properly handle edge cases and errors in JSON parsing
  • Clean up temporary files properly
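
As an illustration of the cleanup point, one possible pattern is to keep intermediate files in a scratch directory that an EXIT trap removes. This is only a sketch: the helper name `extract_with_cleanup` and the sample input are hypothetical, and the workflow itself may clean up differently.

```shell
# Hypothetical helper: run the extraction in a subshell that owns a scratch directory.
extract_with_cleanup() (
  scratch_dir=$(mktemp -d)
  # The EXIT trap fires when this subshell exits and removes the scratch directory.
  trap 'rm -rf "$scratch_dir"' EXIT

  # Same extraction pattern as in this PR, writing into the scratch directory.
  grep -o '{"success": false[^}]*}' "$1" > "$scratch_dir/failed-tool-calls-raw.txt" 2>/dev/null || true
  count=$(wc -l < "$scratch_dir/failed-tool-calls-raw.txt" | tr -d ' ')
  echo "$count $scratch_dir"
)

# Invented sample input with one failed tool call.
sample_file=$(mktemp)
printf '%s\n' '{"success": false, "error": "denied"}' 'all good' > "$sample_file"
result=$(extract_with_cleanup "$sample_file")
rm -f "$sample_file"

failed_count=${result%% *}
scratch_path=${result#* }
[ -d "$scratch_path" ] || echo "found $failed_count failed call(s); scratch directory already removed"
```

Using `mktemp -d` instead of fixed paths under /tmp also avoids collisions between concurrent runs on the same machine.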

Implementation Details

# Extract JSON objects with "success": false from the agent output
grep -o '{"success": false[^}]*}' /tmp/agent-output.txt > /tmp/failed-tool-calls-raw.txt 2>/dev/null || true

# Format failed tool calls nicely
cat /tmp/failed-tool-calls-raw.txt | jq -r '. | "Tool: \(.tool_name // .name // "unknown"), Error: \(.error // .message // "unknown error")"' 2>/dev/null
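
A note on how the extraction pattern behaves, shown as a small sketch with invented inputs: `[^}]*}` stops at the first closing brace, and the pattern expects exactly one space after the colon. Whether the agent ever emits nested or compact JSON is an assumption here, not something this PR states.

```shell
# The extraction pattern from the snippet above.
pattern='{"success": false[^}]*}'

# Invented input with a nested object: the match ends at the first closing brace.
nested='{"success": false, "tool_name": "Bash", "details": {"cmd": "rm"}, "error": "denied"}'
nested_fragment=$(printf '%s\n' "$nested" | grep -o "$pattern")

# Invented compact input (no space after the colon): the pattern does not match at all.
compact='{"success":false,"tool_name":"Read","error":"not found"}'
compact_fragment=$(printf '%s\n' "$compact" | grep -o "$pattern" || true)

printf 'nested:  %s\ncompact: [%s]\n' "$nested_fragment" "$compact_fragment"
```

In the nested case the extracted fragment has unbalanced braces, so jq would reject it; in the compact case nothing is extracted.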

Benefits

  1. More accurate detection - Only identifies actual failed tool calls with "success": false
  2. Better formatting - Provides structured output with tool names and error messages
  3. Reduced false positives - Avoids matching unrelated content that contains error-like text
  4. Backward compatible - Falls back to original approach if no JSON-formatted failed tool calls are found

Testing

The implementation has been tested to ensure:

  • JSON-formatted failed tool calls are properly detected and formatted
  • Non-JSON errors still trigger the fallback regex approach
  • Edge cases (malformed JSON, missing fields) are handled gracefully
  • Cleanup properly removes all temporary files
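
The JSON-related cases above could be spot-checked along these lines. This is a sketch, not the test suite used for this PR: `summarize_line` is a hypothetical wrapper around the two commands from Implementation Details, the inputs are invented, and the regex fallback path is not exercised.

```shell
# Hypothetical wrapper around the grep and jq commands from this PR, for one line of output.
summarize_line() {
  printf '%s\n' "$1" \
    | grep -o '{"success": false[^}]*}' \
    | jq -r '"Tool: \(.tool_name // .name // "unknown"), Error: \(.error // .message // "unknown error")"' 2>/dev/null
}

# Missing fields: the jq filter falls back to the "unknown" defaults.
missing_fields=$(summarize_line '{"success": false}')

# Malformed JSON: jq rejects it, stderr is suppressed, nothing is printed.
malformed=$(summarize_line '{"success": false, "error": }')

# Plain text that mentions "error": grep extracts nothing, so jq prints nothing.
plain_text=$(summarize_line 'Tool execution failed with error')

printf 'missing fields: %s\nmalformed: [%s]\nplain text: [%s]\n' "$missing_fields" "$malformed" "$plain_text"
```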

Resolves #5

…of regex

- Replace regex-based pattern matching with proper JSON parsing using jq
- Extract JSON objects containing "success": false from agent output
- Format failed tool calls with tool names and error messages
- Maintain backward compatibility with fallback to regex for non-JSON errors
- Properly handle edge cases and errors in JSON parsing
- Resolves #5
@edenreich
Contributor

@infer Don't maintain backward compatibility. Please keep it simple: use only the jq approach, and check specifically for the "success" attribute.
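
For reference, a jq-only check of the kind requested here might look like the sketch below. It is not the merged change. It assumes each tool result is a complete JSON object on its own line, and the function name `format_failed_tool_calls` and the sample data are hypothetical.

```shell
# Hypothetical helper: print one summary line per tool result whose "success" is false.
format_failed_tool_calls() {
  jq -R -r '
    fromjson?                      # read each raw line and skip lines that are not valid JSON
    | select(type == "object")     # ignore bare strings, numbers, and arrays
    | select(.success == false)    # keep only explicit failures
    | "Tool: \(.tool_name // .name // "unknown"), Error: \(.error // .message // "unknown error")"
  ' "$1"
}

# Invented agent output mixing plain text, a nested failure, a success, and a sparse failure.
sample_file=$(mktemp)
cat > "$sample_file" <<'EOF'
Starting agent run, no error so far
{"success":false,"tool_name":"Bash","error":"command not whitelisted","details":{"cmd":"rm"}}
{"success": true, "tool_name": "Read"}
{"success": false, "name": "Write"}
EOF

failed_summary=$(format_failed_tool_calls "$sample_file")
rm -f "$sample_file"
printf '%s\n' "$failed_summary"
```

Because jq parses each line itself, there is no grep extraction step and no regex fallback: plain-text lines are skipped, and spacing or nesting inside the JSON does not matter.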

@github-actions
Contributor Author

❌ Infer Agent Result: Failed

Model: deepseek/deepseek-chat
Exit Code: ``
Workflow: View Job

Agent Output
No output available
Failed Tool Calls
No failed tool calls detected

Triggered by edenreich | Infer Action

@edenreich edenreich merged commit 562cb64 into main Nov 26, 2025
@edenreich edenreich deleted the fix/issue-5 branch November 26, 2025 22:22
github-actions bot pushed a commit that referenced this pull request Nov 26, 2025
## [0.3.1](v0.3.0...v0.3.1) (2025-11-26)

### Bug Fixes

* Properly escape GitHub event data in bash scripts ([48e8df7](48e8df7))
* Refactor failed tool call detection to use JSON parsing ([#6](#6)) ([562cb64](562cb64)), closes [#5](#5)
* **workflow:** Update system prompt to follow conventional commit format ([#8](#8)) ([efe0b62](efe0b62))

### Documentation

* Update model provider in README ([90a236b](90a236b))

### Maintenance

* Update version ([4429ff2](4429ff2))
@github-actions
Contributor Author

🎉 This PR is included in version 0.3.1 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

Development

Successfully merging this pull request may close these issues.

[TASK] Refactor fix the infer workflow