Refactor reflection system to use structured tool calls by v3g42 · Pull Request #17 · distrihub/distri

v3g42 · 2026-02-14T19:47:26Z

Summary

This PR refactors the reflection agent system to use structured tool calls instead of text pattern matching. The reflection agent now uses a dedicated reflect tool to report its analysis decision, making the system more robust and type-safe.

Key Changes

Reflection Configuration & Agent Definition

Added reflection_agent field to ReflectionConfig to allow custom reflection agents (defaults to built-in)
Replaced enable_reflection: bool with reflection: Option<ReflectionConfig> in StandardDefinition for more flexible configuration
Added validate_reflection_agent() method to ensure reflection agents have the "reflect" tool configured
Updated is_reflection_enabled() to use the new field and added a reflection_config() getter
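
As an illustration only, the configuration shape described above might look like the following sketch. The field and method names come from this description; the concrete types (for example the `builtin_tools: Vec<String>` representation and the `String` error) are assumptions and not the crate's actual definitions.

```rust
// Hypothetical sketch: type shapes are assumed, not copied from the distri source.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct ReflectionConfig {
    /// Name of a custom reflection agent; `None` means the built-in agent is used.
    pub reflection_agent: Option<String>,
}

#[derive(Debug, Clone, Default)]
pub struct StandardDefinition {
    pub name: String,
    /// Replaces the former `enable_reflection: bool`.
    pub reflection: Option<ReflectionConfig>,
    /// Names of the built-in tools this agent may call (assumed representation).
    pub builtin_tools: Vec<String>,
}

impl StandardDefinition {
    /// Reflection is enabled whenever a config is present.
    pub fn is_reflection_enabled(&self) -> bool {
        self.reflection.is_some()
    }

    /// Borrow the reflection config, if any.
    pub fn reflection_config(&self) -> Option<&ReflectionConfig> {
        self.reflection.as_ref()
    }

    /// When this definition is used as a reflection agent, it must list "reflect".
    pub fn validate_reflection_agent(&self) -> Result<(), String> {
        if self.builtin_tools.iter().any(|tool| tool == "reflect") {
            Ok(())
        } else {
            Err(format!(
                "reflection agent '{}' must configure the \"reflect\" tool",
                self.name
            ))
        }
    }
}
```

Using an `Option` rather than a `bool` lets the presence of the config double as the on/off switch while still carrying extra settings.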

Reflect Tool Implementation

Implemented new ReflectTool as a built-in tool with structured parameters:
- quality: enum (excellent/good/fair/poor)
- completeness: enum (complete/partial/incomplete)
- should_continue: boolean (retry decision)
- reason: optional explanation
Tool stores its structured result as the final result in ExecutorContext for downstream processing
Added to builtin tools list in get_builtin_tools()
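
A minimal sketch of how the structured parameters listed above could be modeled as types. The names `Quality`, `Completeness`, and `ReflectArgs` are hypothetical; only the parameter names and their allowed values are taken from this description.

```rust
// Hypothetical types: only the parameter names and values come from the PR text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Quality { Excellent, Good, Fair, Poor }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Completeness { Complete, Partial, Incomplete }

impl std::str::FromStr for Quality {
    type Err = String;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "excellent" => Ok(Quality::Excellent),
            "good" => Ok(Quality::Good),
            "fair" => Ok(Quality::Fair),
            "poor" => Ok(Quality::Poor),
            other => Err(format!("unknown quality: {}", other)),
        }
    }
}

impl std::str::FromStr for Completeness {
    type Err = String;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "complete" => Ok(Completeness::Complete),
            "partial" => Ok(Completeness::Partial),
            "incomplete" => Ok(Completeness::Incomplete),
            other => Err(format!("unknown completeness: {}", other)),
        }
    }
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ReflectArgs {
    pub quality: Quality,
    pub completeness: Completeness,
    /// Retry decision.
    pub should_continue: bool,
    /// Optional explanation.
    pub reason: Option<String>,
}

impl ReflectArgs {
    /// Build validated arguments from raw string values; unknown values are rejected.
    pub fn new(
        quality: &str,
        completeness: &str,
        should_continue: bool,
        reason: Option<&str>,
    ) -> Result<Self, String> {
        Ok(Self {
            quality: quality.parse()?,
            completeness: completeness.parse()?,
            should_continue,
            reason: reason.map(str::to_string),
        })
    }
}
```

Closed enums mean an out-of-range value fails at parse time instead of silently flowing into the retry decision.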

Reflection Agent Loop

Updated run_reflection_agent() to return ReflectionResult containing both text content and structured tool call result
Modified reflection decision logic to extract should_continue from the tool call output instead of text pattern matching ("Should Continue: YES/NO")
Removed reliance on string matching in reflection agent output
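
The decision step described above can be sketched roughly as follows. `ReflectOutput` and `should_retry` are hypothetical names, and falling back to "do not retry" when no tool call was recorded is an assumption, not confirmed behavior.

```rust
// Hypothetical sketch of reading the retry decision from the structured result.
#[derive(Debug, Clone)]
pub struct ReflectOutput {
    pub should_continue: bool,
    pub reason: Option<String>,
}

#[derive(Debug, Clone)]
pub struct ReflectionResult {
    /// Free-form text produced by the reflection agent.
    pub content: String,
    /// Structured output of the reflect tool call, if the agent made one.
    pub final_result: Option<ReflectOutput>,
}

/// Decide whether to retry using only the structured tool output.
/// Assumption: with no tool call recorded, the loop does not retry.
pub fn should_retry(result: &ReflectionResult) -> bool {
    result
        .final_result
        .as_ref()
        .map(|output| output.should_continue)
        .unwrap_or(false)
}
```

Note that the text content is never inspected, so a stray "Should Continue: YES" in the agent's prose no longer affects control flow.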

Reflection Agent Markdown

Updated reflection_agent.md to:
- Configure the "reflect" tool in its tools.builtin
- Instruct agent to use the reflect tool instead of text output
- Update decision rules to match tool parameter values

Execution Result Formatting

Added truncation logic to ExecutionResult::as_observation() to prevent token bloat:
- Text parts: max 1000 chars
- Data/ToolResult parts: max 500 chars
- Includes truncation indicator with total character count
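
A rough sketch of the truncation rule above, for illustration. The function name, the constant names, and the exact wording of the indicator are assumptions; the limits (1000 and 500) come from this description. Counting Unicode scalar values rather than bytes is also an assumption, chosen here so the cut never lands inside a multi-byte character.

```rust
// Hypothetical helper: names and indicator format are illustrative only.
pub const MAX_TEXT_CHARS: usize = 1000;
pub const MAX_DATA_CHARS: usize = 500;

/// Return `s` unchanged if it fits, otherwise its first `max_chars` characters
/// followed by an indicator that reports the original character count.
pub fn truncate_for_observation(s: &str, max_chars: usize) -> String {
    let total = s.chars().count();
    if total <= max_chars {
        return s.to_string();
    }
    let head: String = s.chars().take(max_chars).collect();
    format!("{}... [truncated, {} chars total]", head, total)
}
```

Only the observation string is shortened; the caller would keep storing the full value elsewhere.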

Cleanup & Removals

Removed unused BrowserHooksConfig enum and related field from StandardDefinition
Removed browser_hooks configuration option

Test Improvements

Added test_store_config() helper to create in-memory SQLite databases for tests
Updated tests to use in-memory stores instead of filesystem dependencies
Added API key environment variable checks to skip tests when credentials are unavailable
Fixed test isolation issues in prompt store and auth provider tests
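
The two test-isolation ideas above might be sketched like this. Both helper names are hypothetical, and the unique suffix is built from the process ID and a counter to stay dependency-free, whereas the commit message indicates the PR uses a UUID in the `file:{uuid}?mode=memory` URL.

```rust
// Hypothetical helpers: names are illustrative; the PR itself uses a UUID suffix.
static NEXT_DB_ID: std::sync::atomic::AtomicU64 = std::sync::atomic::AtomicU64::new(0);

/// Build a distinct in-memory SQLite URL per call so tests do not share state.
pub fn unique_memory_db_url() -> String {
    let id = NEXT_DB_ID.fetch_add(1, std::sync::atomic::Ordering::Relaxed);
    format!("file:test-{}-{}?mode=memory", std::process::id(), id)
}

/// True when the named environment variable is set to a non-empty value.
pub fn has_credential(var: &str) -> bool {
    std::env::var(var).map(|value| !value.is_empty()).unwrap_or(false)
}
```

A test that needs a live key could start with `if !has_credential("OPENAI_API_KEY") { return; }` so it is skipped rather than failed on machines without credentials.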

Implementation Details

The reflection agent now signals its decision through a tool call rather than text parsing, improving reliability
Custom reflection agents can be specified but must explicitly include the "reflect" tool
The built-in reflection agent automatically gets the reflect tool
Execution results are now truncated in observations to prevent context window bloat while preserving full data in storage

https://claude.ai/code/session_019JhhSoRA6gQZNvpWsGWxNN

…x all tests

- Add ReflectTool as a builtin tool for structured reflection decisions instead of parsing "Should Continue" from agent text output
- Update reflection_agent.md to use the reflect tool
- Add reflection_agent field to ReflectionConfig for custom reflection agents
- Add validation for reflection agent having reflect tool configured
- Remove unused browser_hooks from StandardDefinition and BrowserHooksConfig
- Update agent_loop reflect() to extract should_continue from tool call result
- Return ReflectionResult with structured final_result from reflection agent

Test fixes:

- Add env var guards for tests requiring OPENAI_API_KEY or TAVILY_API_KEY
- Use in-memory SQLite (file:{uuid}?mode=memory) for all test orchestrators
- Fix a2a_types serialize test with round-trip instead of exact string match
- Fix parse_agent_definition assertion to match actual max_iterations=30
- Fix provider_registry test to load providers before querying
- Fix live_llm_execute test to use LlmExecuteOptions builder pattern
- Fix code execution test tracing_subscriber double-init panic
- Add truncation to execution as_observation() for large tool results
- Rewrite prompt store tests to use HashMapPromptStore (no filesystem deps)
- Guard typescript plugin test for missing sample directory

v3g42 merged commit 0585d0e into main Mar 9, 2026

v3g42 deleted the claude/reflection-tool-call-CRAjG branch March 9, 2026 02:17
