feat(ai): implement three-layer caching optimization (#91, #92, #93) #118
Merged
- Add SemanticCache with two-layer lookup (SHA256 + cosine similarity)
  * Exact match cache using SHA256 hash keys
  * Semantic match cache using vector similarity (default threshold: 0.95)
  * LRU eviction with TTL support (default 24 hours)
- Add ToolResultCache for tool execution results
  * Per-tool TTL configuration (schedule_query: 30s, memo_search: 5m)
  * Write operations not cached (TTL = 0)
  * Multi-tenant isolation by user ID
- Add profile-based token budget allocation (#93)
  * 7 preset profiles: memo_search, schedule_create, schedule_query, amazing, geek, evolution, default
  * Environment variable overrides (DIVINESENSE_BUDGET_*)
  * GEEK and EVOLUTION modes use zero budget (Claude Code CLI manages context)
- Add comprehensive unit tests for both cache layers
- Fix errcheck issues with //nolint:errcheck comments for safe type assertions
- Load budget profile overrides from environment on service initialization

Refs #91, #92, #93

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
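The two-layer lookup described in the first bullet can be sketched roughly as follows. This is a simplified illustration, not the code in ai/cache/semantic.go: the type layout, method signatures, and the `layer` return value are assumptions, embeddings are passed in by the caller, and LRU eviction, TTL, and locking are left out.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"math"
)

// cached pairs a stored embedding with its response.
type cached struct {
	embedding []float64
	response  string
}

// SemanticCache is a hypothetical, simplified stand-in for the PR's type.
type SemanticCache struct {
	threshold float64           // minimum cosine similarity for a semantic hit
	exact     map[string]string // layer 1: SHA256(text) -> response
	semantic  []cached          // layer 2: linear scan over embeddings
}

func NewSemanticCache(threshold float64) *SemanticCache {
	return &SemanticCache{threshold: threshold, exact: map[string]string{}}
}

// hashKey returns the hex-encoded SHA256 digest of the query text.
func hashKey(text string) string {
	sum := sha256.Sum256([]byte(text))
	return hex.EncodeToString(sum[:])
}

// cosine returns the cosine similarity of a and b, or 0 if it is undefined.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// Set stores the response in both layers.
func (c *SemanticCache) Set(text string, emb []float64, response string) {
	c.exact[hashKey(text)] = response
	c.semantic = append(c.semantic, cached{embedding: emb, response: response})
}

// Get tries the exact layer first, then the best semantic match.
// layer is "exact", "semantic", or "miss".
func (c *SemanticCache) Get(text string, emb []float64) (response, layer string) {
	if r, ok := c.exact[hashKey(text)]; ok {
		return r, "exact"
	}
	best, bestSim := -1, c.threshold
	for i, e := range c.semantic {
		if sim := cosine(emb, e.embedding); sim >= bestSim {
			best, bestSim = i, sim
		}
	}
	if best >= 0 {
		return c.semantic[best].response, "semantic"
	}
	return "", "miss"
}

func main() {
	c := NewSemanticCache(0.95) // default threshold from the commit message
	c.Set("what is on my schedule today", []float64{1, 0, 0}, "3 meetings")

	resp, layer := c.Get("today's schedule?", []float64{0.99, 0.05, 0})
	fmt.Println(layer, resp)
}
```

Checking the hash layer first keeps repeated identical queries cheap, and the similarity scan only runs when the exact lookup misses.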
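Overview
Implements a three-layer AI caching optimization architecture to speed up responses, reduce API costs, and improve token budget allocation.
Changes
New files
- ai/cache/semantic.go - semantic cache implementation
- ai/cache/semantic_test.go - semantic cache tests
- ai/agent/tools/cache.go - tool result cache implementation
- ai/agent/tools/cache_test.go - tool cache tests
- ai/context/budget_profiles.go - budget profiles
Modified files
- ai/context/budget.go - adds the AllocateForAgent() method
- ai/context/builder_impl.go - uses AgentType-based budget allocation
- .env.example - adds cache configuration notes and default values
Cache architecture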
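One part of this architecture, the tool result cache with per-tool TTLs and per-user isolation, might look roughly like the sketch below. The two TTL values come from the commit message; everything else is an assumption made for illustration, including the `schedule_add` write tool, the key layout, the method signatures, and the `advanceSeconds` clock helper. It is not the code in ai/agent/tools/cache.go.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// toolTTL maps tool names to result lifetimes. Tools that are missing from
// the map get the zero value, so they are never cached.
var toolTTL = map[string]time.Duration{
	"schedule_query": 30 * time.Second,
	"memo_search":    5 * time.Minute,
	"schedule_add":   0, // hypothetical write tool: TTL 0 means "do not cache"
}

type entry struct {
	value     string
	expiresAt time.Time
}

// ToolResultCache is an illustrative stand-in for the PR's type.
type ToolResultCache struct {
	mu      sync.Mutex
	entries map[string]entry
	skew    time.Duration // artificial clock offset, used only for demos/tests
}

func NewToolResultCache() *ToolResultCache {
	return &ToolResultCache{entries: map[string]entry{}}
}

func (c *ToolResultCache) now() time.Time { return time.Now().Add(c.skew) }

// advanceSeconds moves the cache's clock forward without sleeping.
func (c *ToolResultCache) advanceSeconds(n int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.skew += time.Duration(n) * time.Second
}

// key includes the user ID so one tenant can never read another's results.
func key(userID int64, tool, input string) string {
	return fmt.Sprintf("%d|%s|%s", userID, tool, input)
}

// Set stores a result unless the tool's TTL is zero.
func (c *ToolResultCache) Set(userID int64, tool, input, result string) {
	ttl := toolTTL[tool]
	if ttl <= 0 {
		return
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[key(userID, tool, input)] = entry{value: result, expiresAt: c.now().Add(ttl)}
}

// Get returns a cached result if it exists and has not expired.
func (c *ToolResultCache) Get(userID int64, tool, input string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	k := key(userID, tool, input)
	e, ok := c.entries[k]
	if !ok {
		return "", false
	}
	if !c.now().Before(e.expiresAt) {
		delete(c.entries, k) // lazily drop expired entries
		return "", false
	}
	return e.value, true
}

func main() {
	c := NewToolResultCache()
	c.Set(1, "schedule_query", "today", "3 meetings")
	_, sameUser := c.Get(1, "schedule_query", "today")
	_, otherUser := c.Get(2, "schedule_query", "today")
	fmt.Println("same user hit:", sameUser, "other user hit:", otherUser)
}
```

Treating a missing or zero TTL as "never cache" makes the safe behavior the default for any tool that has not been explicitly configured.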
Token budget configuration
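As a rough sketch of how profile-based allocation with environment overrides could work: the profile names and the `DIVINESENSE_BUDGET_` prefix come from the commit message, and the zero budgets for `geek` and `evolution` are stated there. The `Profile` fields, the non-zero percentages, the `_HISTORY` and `_RETRIEVAL` variable suffixes, and the `loadProfile` and `allocate` helpers are all hypothetical and do not reflect the real AllocateForAgent() signature. Only a subset of the seven profiles is shown.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// Profile gives each context section a percentage of the total token budget.
// Field names and non-zero values are illustrative assumptions.
type Profile struct {
	HistoryPct   int
	RetrievalPct int
}

var presets = map[string]Profile{
	"memo_search": {HistoryPct: 30, RetrievalPct: 50}, // assumed values
	"default":     {HistoryPct: 40, RetrievalPct: 40}, // assumed values
	"geek":        {},                                 // zero budget (per the commit message)
	"evolution":   {},                                 // zero budget (per the commit message)
}

// loadProfile returns the preset for agentType, falling back to "default",
// then applies any valid overrides found through getenv.
func loadProfile(agentType string, getenv func(string) string) Profile {
	p, ok := presets[agentType]
	if !ok {
		p = presets["default"]
	}
	// The suffix scheme after the prefix is an assumption of this sketch.
	prefix := "DIVINESENSE_BUDGET_" + strings.ToUpper(agentType) + "_"
	override := func(suffix string, dst *int) {
		v, err := strconv.Atoi(getenv(prefix + suffix))
		if err == nil && v >= 0 && v <= 100 {
			*dst = v
		}
	}
	override("HISTORY", &p.HistoryPct)
	override("RETRIEVAL", &p.RetrievalPct)
	return p
}

// allocate converts percentages into token counts for a given total.
func allocate(total int, p Profile) (history, retrieval int) {
	return total * p.HistoryPct / 100, total * p.RetrievalPct / 100
}

func main() {
	for _, agent := range []string{"memo_search", "geek", "default"} {
		h, r := allocate(4000, loadProfile(agent, os.Getenv))
		fmt.Printf("%s: history=%d retrieval=%d\n", agent, h, r)
	}
}
```

Passing `getenv` as a function keeps the override logic testable without touching the real process environment.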
Test plan
Screenshots / demo
Environment variable configuration example
Checklist
Resolves #91, #92, #93
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>