fix: implement caching for system prompt to avoid repeated context reprocessing (fixes #607) #617
Conversation
PR Review Comment: Issue: Multiple system messages are incompatible with the Anthropic API

This PR splits the message structure into multiple system messages to optimize the KV cache, but this creates a compatibility issue with the Anthropic API.

Reason: per the official Anthropic documentation, the Messages API does not support the "system" role inside the messages array; system content must be passed via the top-level system parameter.

Source: https://docs.anthropic.com/en/api/messages

The messages parameter only accepts "user" and "assistant" roles. Sending a message with role "system" returns:

{
  "error": {
    "message": "messages: Unexpected role \"system\". The Messages API accepts a top-level system parameter, not \"system\" as an input message role.",
    "type": "invalid_request_error"
  }
}

Impact: all these scenarios will fail with a 400 error when multiple system messages are sent.

Recommendation: handle format differences between APIs at the provider adapter layer, rather than enforcing a unified structure in the core message format.
nikolasdehor left a comment:
Good approach to the caching problem. The double-checked locking pattern is correct, the mtime-based invalidation is a pragmatic choice, and sorting tool names for deterministic output is an important detail for KV cache stability.
However, there are some issues worth discussing:
The KV cache benefit is overstated in the PR description.
The comment in buildDynamicContext already acknowledges this honestly:
"since dynamic context is merged into the single system message, the caching benefit is local only (avoids repeated file I/O / string building), not LLM-side KV cache reuse."
Since Current Time changes every minute and is concatenated into the same system message, the LLM prefix hash changes every call regardless. The real benefit here is avoiding repeated file I/O and string construction — which is still worthwhile, but the PR title/description should be updated to reflect this accurately, especially since issue #607 specifically mentions LLM-side KV cache reuse.
To actually achieve LLM-side KV cache benefits, you would need to split the static content into a separate system message (role: system, cache_control: ephemeral for Anthropic) from the dynamic one. Some providers support this, some do not. The PR comment explains why a single system message was chosen (Anthropic/Codex compatibility), which is a reasonable tradeoff.
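For reference, a minimal sketch of what that split could look like against the Anthropic Messages API, which accepts the top-level `system` parameter as an array of text blocks with optional `cache_control`. The struct and function names here are illustrative, not code from this PR, and the prompt strings are placeholders:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// cacheControl and systemBlock mirror the shape of Anthropic's top-level
// system content blocks.
type cacheControl struct {
	Type string `json:"type"` // "ephemeral"
}

type systemBlock struct {
	Type         string        `json:"type"` // always "text"
	Text         string        `json:"text"`
	CacheControl *cacheControl `json:"cache_control,omitempty"`
}

// buildSystemBlocks splits the prompt into a cacheable static block and an
// uncached dynamic block, so the static prefix can be reused server-side.
func buildSystemBlocks(static, dynamic string) []systemBlock {
	return []systemBlock{
		{Type: "text", Text: static, CacheControl: &cacheControl{Type: "ephemeral"}},
		{Type: "text", Text: dynamic},
	}
}

func main() {
	blocks := buildSystemBlocks(
		"identity + bootstrap + skills summary + memory", // stable across calls
		"Current Time: 2024-01-01T00:00Z",                // changes every call
	)
	out, _ := json.MarshalIndent(blocks, "", "  ")
	fmt.Println(string(out))
}
```

Providers that do not support block-level caching would need the adapter to re-join the two blocks, which is the compatibility cost the PR chose to avoid.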
Removed loadSkills() method — is it still called elsewhere?
The PR removes the loadSkills() method from context.go. I see buildSkillsSection() exists separately, but please confirm no other code references loadSkills() or this will break the build.
sourceFilesChanged() is called under RLock but does file I/O:
The os.Stat calls inside sourceFilesChanged() are technically fine under RLock (it only blocks writers, not other readers), but if the filesystem is slow (NFS, network mount), this could cause reader contention. For a local filesystem this is a non-issue.
Race window between mtime check and cache read:
Between sourceFilesChanged() returning false and reading cachedSystemPrompt, a file could be modified. This is acceptable — the cache is eventually consistent and will catch the change on the next call.
Test coverage is thorough. The mtime auto-invalidation tests, cache stability tests, single-system-message verification, and benchmark are all well-designed.
Sorted tool names — good catch. Non-deterministic map iteration causing different tool definition ordering is a subtle but real issue for KV cache (and also makes diffs/debugging harder). This change in registry.go is correct and independently valuable.
Overall: the code is well-written, the caching mechanism is correct, and the test suite is comprehensive. The main suggestion is to align the PR description with the actual benefit (local I/O avoidance, not LLM KV cache reuse) since the dynamic context prevents true prefix caching.
Force-pushed: 9b2500a → 3200f0d
Force-pushed: 3200f0d → a719938
pkg/agent/context.go (Outdated)

    }

    // Log system prompt summary for debugging (debug mode only)
    fullSystemPrompt := strings.Join(systemParts, "\n\n---\n\n")
The fullSystemPrompt always includes the dynamic part, which means the final system prompt changes on every call. Can this still hit the KV cache?
Force-pushed: a719938 → e48bf57
Avoid rebuilding the entire system prompt on every BuildMessages() call by caching the static portion (identity, bootstrap, skills summary, memory) and only recomputing it when workspace source files change.

Key changes:
- ContextBuilder caches the static prompt behind an RWMutex with double-checked locking. Source file changes are detected via cheap os.Stat mtime checks, so no explicit invalidation is needed.
- Track file existence at cache time (existedAtCache map) so that newly created or deleted bootstrap/memory files also trigger a rebuild — the old modifiedSince() silently returned false on os.IsNotExist.
- Walk the skills directory recursively with filepath.WalkDir to catch content-only edits at any nesting depth; directory mtime alone misses in-place file modifications on most filesystems.
- ToolRegistry.sortedToolNames() sorts tool names before iteration, ensuring deterministic tool definition order across calls — a prerequisite for LLM-side prefix/KV cache reuse.
- Merge all context (static + dynamic + summary) into a single system message for provider compatibility: the Anthropic adapter extracts messages[0] as the top-level system parameter, and Codex reads only the first system message as instructions.
- Fix a data race in BuildMessages() where cachedSystemPrompt was read without holding the lock in a debug log statement.
- Add tests: single system message invariant, mtime auto-invalidation, new-file creation detection, skill file content change, explicit InvalidateCache, cache stability, concurrent access (20 goroutines x 50 iterations, passes go test -race), and a benchmark.
Force-pushed: e48bf57 → 1f7cbd9
fix: implement caching for system prompt to avoid repeated context reprocessing (fixes sipeed#607)
Summary
- BuildSystemPromptWithCache() using a double-checked locking pattern
- Summary kept in a separate assistant message
- InvalidateCache() method for when workspace files are modified

Problem
Issue #607: the prompt context keeps changing, so the model has to reprocess thousands of tokens every turn. The system prompt was being rebuilt on every BuildMessages() call, and the summary was appended directly to it, making the KV cache ineffective.

Solution
Test plan
- TestSystemPromptCaching - verifies system prompt consistency across calls
- TestBuildMessagesWithSummary - confirms the summary doesn't modify the system prompt
- TestSystemPromptStructure - validates the summary is in a separate message
- TestCachedSystemPromptStability - ensures the cache works correctly
- TestSummaryAsSeparateMessage - confirms the KV-cache-friendly structure
- TestInvalidateCache - validates the cache invalidation mechanism
- BenchmarkBuildMessages - performance testing

References
Fixes #607