refactor: decompose AgentLoop and improve code quality#699
refactor: decompose AgentLoop and improve code quality#699SaiBalusu-usf wants to merge 1 commit intosipeed:mainfrom
Conversation
- Extract ContextCompressor (compression, summarization, token estimation) from AgentLoop - Extract ToolExecutor (tool dispatch, result routing, context updates) from AgentLoop - Create named constants replacing magic numbers (retry limits, thresholds, timeouts) - Replace fragile string-matching error detection with structured IsContextWindowError() - Add contextWindowPatterns and FailoverContextWindow to error classifier - Refactor provider factory to table-driven registry approach - Add graceful shutdown with context cancellation and WaitGroup - Net reduction of ~143 lines from loop.go via component extraction
SaiBalusu-usf
left a comment
There was a problem hiding this comment.
Systematic improvements to reduce complexity, improve maintainability, and add robustness to the PicoClaw codebase. Net reduction of ~143 lines from loop.go.
nikolasdehor
left a comment
There was a problem hiding this comment.
Good direction - extracting ContextCompressor, ToolExecutor, and named constants from the monolithic AgentLoop. The table-driven provider registry and structured IsContextWindowError are solid improvements. However, there are behavioral changes mixed in that need attention:
Behavioral regression in token estimation:
The old code used len(m.Content) / 2 (bytes) for the oversized message guard, while the new code uses utf8.RuneCountInString(m.Content) * 2 / 5 (runes). For pure ASCII text, len / 2 = 0.5 chars/token vs runes * 2/5 = 0.4 chars/token - meaning the new code is MORE aggressive at filtering messages as "oversized" and will skip more messages during summarization. For CJK text (3 bytes/rune), the old code would estimate ~1.5x more tokens, while the new code estimates correctly. This is a behavioral change that should be called out explicitly and tested, not buried in a refactor.
Missing context in MaybeSummarize:
The original maybeSummarize used al.bus.PublishOutbound(bus.OutboundMessage{...}) but the extracted version in ContextCompressor.MaybeSummarize calls cc.bus.PublishOutbound(bus.OutboundMessage{...}) without a context - the original had context.WithTimeout protection in the goroutine. The new code has no timeout on the publish, meaning it could block indefinitely if the bus is congested.
ToolExecutor drops iteration from logs:
The original tool call log included "iteration": iteration in the structured fields. The extracted ExecuteToolCalls drops this field entirely. This makes debugging tool call sequences harder.
Error classifier false positive risk:
substr("invalidparameter") in contextWindowPatterns is too broad. Any API error containing "invalidparameter" (e.g., invalid temperature, invalid model name) would be misclassified as a context window error, triggering unnecessary compression. The original code had this same issue with strings.Contains(errMsg, "invalidparameter"), but since this PR claims to be "more precise than naive string matching and avoids false positives", this should be tightened (e.g., rxp("invalidparameter.*token") or similar).
Factory refactor doubles the code:
The provider factory went from 190 lines to 290 lines (+100). The table-driven approach is good in principle, but the modelInferenceRegistry duplicates logic that was already in the switch statement. Consider whether both standardProviderRegistry and modelInferenceRegistry are needed, or if they can be unified.
No tests for extracted components:
ContextCompressor and ToolExecutor are new public types but have no unit tests. Since they are now independently testable (which is a key benefit of extraction), tests should accompany this PR.
Graceful shutdown not wired:
cancelCtx and wg are added to AgentLoop but wg.Add() is never called anywhere, so wg.Wait() in Stop() returns immediately. The shutdown mechanism is incomplete.
|
|
Extract ContextCompressor (compression, summarization, token estimation) from AgentLoop
Extract ToolExecutor (tool dispatch, result routing, context updates) from AgentLoop
Create named constants replacing magic numbers (retry limits, thresholds, timeouts)
Replace fragile string-matching error detection with structured IsContextWindowError()
Add contextWindowPatterns and FailoverContextWindow to error classifier
Refactor provider factory to table-driven registry approach
Add graceful shutdown with context cancellation and WaitGroup
Net reduction of ~143 lines from loop.go via component extraction
📝 Description
🗣️ Type of Change
🤖 AI Code Generation
🔗 Related Issue
📚 Technical Context (Skip for Docs)
🧪 Test Environment
📸 Evidence (Optional)
Click to view Logs/Screenshots
☑️ Checklist