Problem
OpenClaw currently hardcodes the injection of MEMORY.md as a complete workspace file into every session's system prompt. As the file grows with project history, this creates significant token waste — injecting 6,500+ tokens of historical context even when only 10% is relevant to the current conversation.
Proposed Solution
Add a new configuration option:
{
  "agents": {
    "defaults": {
      "memoryInjection": "full"
    }
  }
}
Values
| Value | Behavior |
| --- | --- |
| full (default) | Current behavior: inject entire MEMORY.md |
| core-only | Inject MEMORY.md but respect a max size (e.g., memoryInjectionMaxChars) |
| recall-only | Don't inject MEMORY.md at all; rely entirely on memory_search for context retrieval |
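To make the three values concrete, here is a minimal sketch of how the selection could work. Every name in it (selectMemoryForPrompt, MemoryInjectionOptions, maxChars) is hypothetical and not taken from the OpenClaw codebase. It also assumes core-only truncates from the start of the file, which the proposal does not specify.

```typescript
// Hypothetical sketch: these types and functions are illustrative only,
// not part of OpenClaw's actual API.
type MemoryInjectionMode = "full" | "core-only" | "recall-only";

interface MemoryInjectionOptions {
  mode: MemoryInjectionMode;
  // Assumed companion limit for "core-only" (the proposal mentions
  // memoryInjectionMaxChars only as an example name).
  maxChars?: number;
}

/**
 * Returns the text to place in the system prompt,
 * or null when nothing should be injected.
 */
function selectMemoryForPrompt(
  memoryText: string,
  options: MemoryInjectionOptions
): string | null {
  switch (options.mode) {
    case "full":
      // Current behavior: inject the whole file.
      return memoryText;
    case "core-only": {
      // Keep at most maxChars characters from the start of the file.
      // With no limit set, this falls back to the full text.
      const limit = Math.max(0, options.maxChars ?? memoryText.length);
      return memoryText.slice(0, limit);
    }
    case "recall-only":
      // Skip injection; context would come from memory_search instead.
      return null;
  }
}
```

A real implementation would probably truncate on a section or line boundary rather than mid-sentence; the plain character cut here just keeps the sketch short.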
Why
- Token savings: Users with large MEMORY.md files (20KB+) waste ~6,500 tokens per session on static context
- Better context quality: Selective recall via memory_search provides more relevant context than full injection
- Works with existing infra: memory.backend: "qmd" already enables semantic search over MEMORY.md; this would complete the loop by making injection configurable
- Backward compatible: Default full preserves current behavior
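The token-savings point can be roughed out with some quick arithmetic. The sketch below is a back-of-the-envelope estimate, not a tokenizer. It takes the characters-per-token ratio from the figures above (about 20KB for about 6,500 tokens) and treats 20KB as 20,000 single-byte characters, which is an assumption. The function names are hypothetical.

```typescript
// Illustrative arithmetic only. The ratio is inferred from the issue's own
// numbers (20KB ≈ 6,500 tokens), assuming 20KB means 20,000 characters.
const CHARS_PER_TOKEN = 20000 / 6500;

/** Rough token estimate for a given number of characters. */
function estimateTokens(
  charCount: number,
  charsPerToken: number = CHARS_PER_TOKEN
): number {
  return Math.round(charCount / charsPerToken);
}

/**
 * Estimated tokens saved per session when only injectedChars characters
 * of a fileChars-character MEMORY.md are injected.
 */
function tokensSavedPerSession(fileChars: number, injectedChars: number): number {
  const injected = Math.min(fileChars, Math.max(0, injectedChars));
  return estimateTokens(fileChars) - estimateTokens(injected);
}
```

Under these assumptions, recall-only corresponds to tokensSavedPerSession(fileChars, 0), and core-only to passing whatever character limit is configured as injectedChars. Real savings depend on the model's tokenizer and the file's content.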
Current Workaround
We maintain a compact MEMORY.md (~80 lines) for injection and a separate MEMORY-ARCHIVE.md with full history indexed by qmd. The auto-recall protocol in AGENTS.md triggers memory_search before contextual responses. This works but requires manual file management.
Related
- memory.backend: "qmd" config already exists and works well
- agents.defaults.bootstrapMaxChars exists for truncation but doesn't allow skip
- The supermemory plugin solves this via cloud; a native local option would benefit self-hosted users
Environment
- OpenClaw v2026.2.19-2
- memory.backend: qmd
- 100+ files indexed, 416 vectors