A memory management toolkit for AI agents — Remember Me, Refine Me.
For the older version, please refer to the 0.2.x documentation.
🧠 ReMe is a memory management framework designed for AI agents, providing both file-based and vector-based memory systems.
It tackles two core problems of agent memory: limited context window (early information is truncated or lost in long conversations) and stateless sessions (new sessions cannot inherit history and always start from scratch).
ReMe gives agents real memory — old conversations are automatically compacted, important information is persistently stored, and relevant context is automatically recalled in future interactions.
What you can do with ReMe
- Personal assistant: Provide long-term memory for agents like CoPaw, remembering user preferences and conversation history.
- Coding assistant: Record code style preferences and project context, maintaining a consistent development experience across sessions.
- Customer service bot: Track user issue history and preference settings for personalized service.
- Task automation: Learn success/failure patterns from historical tasks to continuously optimize execution strategies.
- Knowledge Q&A: Build a searchable knowledge base with semantic search and exact matching support.
- Multi-turn dialogue: Automatically compress long conversations while retaining key information within limited context windows.
Memory as files, files as memory.
Treat memory as files — readable, editable, and copyable.
CoPaw integrates long-term memory and context management by inheriting from ReMeLight.
| Traditional memory system | File-based ReMe |
|---|---|
| 🗄️ Database storage | 📝 Markdown files |
| 🔒 Opaque | 👀 Always readable |
| ❌ Hard to modify | ✏️ Directly editable |
| 🚫 Hard to migrate | 📦 Copy to migrate |
```
working_dir/
├── MEMORY.md            # Long-term memory: persistent info such as user preferences
├── memory/
│   └── YYYY-MM-DD.md    # Daily journal: automatically written after each conversation
└── tool_result/         # Cache for long tool outputs (auto-managed; expired entries auto-cleaned)
    └── <uuid>.txt
```
ReMeLight is the core class of the file-based memory system. It provides full memory management capabilities for AI agents:
| Method | Function | Key components |
|---|---|---|
| `check_context` | 📊 Check context size | ContextChecker — checks whether context exceeds thresholds and splits messages |
| `compact_memory` | 📦 Compact history into summary | Compactor — ReActAgent that generates structured context summaries |
| `summary_memory` | 📝 Persist important memory to files | Summarizer — ReActAgent + file tools (read / write / edit) |
| `compact_tool_result` | ✂️ Compact long tool outputs | ToolResultCompactor — truncates long tool outputs and stores them in tool_result/ while keeping file references in messages |
| `memory_search` | 🔍 Semantic memory search | MemorySearch — hybrid retrieval with vectors + BM25 |
| `ReMeInMemoryMemory` | 🗂️ In-session memory class | ReMeInMemoryMemory — token-aware memory management with summary compression and state serialization |
| `pre_reasoning_hook` | 🔄 Pre-reasoning hook | compact_tool_result + check_context + compact_memory + summary_memory (async) |
| `start` | 🚀 Start memory system | Initialize file storage, file watcher, and embedding cache; clean up expired tool result files |
| `close` | 📕 Shutdown and cleanup | Clean up tool result files, stop the file watcher, and persist the embedding cache |
Install from source:

```shell
git clone https://github.com/agentscope-ai/ReMe.git
cd ReMe
pip install -e ".[light]"
```

Update to the latest version:

```shell
git pull
pip install -e ".[light]"
```

ReMeLight uses environment variables to configure the embedding model and storage backends:
| Variable | Description | Example |
|---|---|---|
| `LLM_API_KEY` | LLM API key | sk-xxx |
| `LLM_BASE_URL` | LLM base URL | https://dashscope.aliyuncs.com/compatible-mode/v1 |
| `EMBEDDING_API_KEY` | Embedding API key (optional) | sk-xxx |
| `EMBEDDING_BASE_URL` | Embedding base URL (optional) | https://dashscope.aliyuncs.com/compatible-mode/v1 |
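For example, these variables can be exported in your shell or placed in a `.env` file (all values below are placeholders; the `EMBEDDING_*` variables are only needed when vector search is enabled):

```shell
# Placeholder values — substitute your own key and provider URL
export LLM_API_KEY="sk-xxx"
export LLM_BASE_URL="https://dashscope.aliyuncs.com/compatible-mode/v1"
# Optional: embedding configuration for vector search
export EMBEDDING_API_KEY="sk-xxx"
export EMBEDDING_BASE_URL="https://dashscope.aliyuncs.com/compatible-mode/v1"
```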
```python
import asyncio

from reme.reme_light import ReMeLight
from reme.memory.file_based.reme_in_memory_memory import ReMeInMemoryMemory


async def main():
    # Initialize ReMeLight
    reme = ReMeLight(
        default_as_llm_config={"model_name": "qwen3.5-35b-a3b"},
        # default_embedding_model_config={"model_name": "text-embedding-v4"},
        default_file_store_config={"fts_enabled": True, "vector_enabled": False},
    )
    await reme.start()

    messages = [...]  # List of conversation messages

    # 1. Compact long tool outputs (prevent tool results from blowing up context)
    messages = await reme.compact_tool_result(messages)

    # 2. Compact conversation history into a structured summary
    summary = await reme.compact_memory(
        messages=messages,
        previous_summary="",
        max_input_length=128000,  # Model context window (tokens)
        compact_ratio=0.7,        # Trigger compaction when exceeding max_input_length * 0.7
        language="zh",            # Summary language (e.g., "zh" / "")
    )

    # 3. Submit summary task asynchronously (non-blocking, writes to memory/YYYY-MM-DD.md)
    reme.add_async_summary_task(messages=messages)

    # 4. Pre-reasoning hook (auto compact tool results + generate summaries)
    processed_messages, compressed_summary = await reme.pre_reasoning_hook(
        messages=messages,
        system_prompt="You are a helpful AI assistant.",
        compressed_summary="",
        max_input_length=128000,
        compact_ratio=0.7,
        memory_compact_reserve=10000,
        enable_tool_result_compact=True,
        tool_result_compact_keep_n=3,
    )

    # 5. Semantic memory search (vector + BM25 hybrid retrieval)
    result = await reme.memory_search(query="Python version preference", max_results=5)

    # 6. Create an in-session memory instance (manages context for one conversation)
    memory = ReMeInMemoryMemory()
    for msg in messages:
        await memory.add(msg)
    token_stats = await memory.estimate_tokens(max_input_length=128000)
    print(f"Current context usage: {token_stats['context_usage_ratio']:.1f}%")
    print(f"Message token count: {token_stats['messages_tokens']}")
    print(f"Estimated total tokens: {token_stats['estimated_tokens']}")

    # 7. Wait for background summary tasks to complete before shutdown
    summary_result = await reme.await_summary_tasks()

    # Shut down ReMeLight
    await reme.close()


if __name__ == "__main__":
    asyncio.run(main())
```

📂 Full example: test_reme_light.py
📋 Sample run log: test_reme_light_log.txt (223,838 tokens → 1,105 tokens, 99.5% compression)
CoPaw MemoryManager inherits ReMeLight and integrates its memory capabilities into the agent reasoning loop:
```mermaid
graph LR
    Agent[Agent] -->|Before each reasoning step| Hook[pre_reasoning_hook]
    Hook --> TC[compact_tool_result<br>Compact tool outputs]
    TC --> CC[check_context<br>Token counting]
    CC -->|Exceeds limit| CM[compact_memory<br>Generate summary]
    CC -->|Exceeds limit| SM[summary_memory<br>Async persistence]
    SM -->|ReAct + FileIO| Files[memory/*.md]
    Agent -->|Explicit call| Search[memory_search<br>Vector+BM25]
    Agent -->|In-session| InMem[ReMeInMemoryMemory<br>Token-aware memory]
    Files -.->|FileWatcher| Store[(FileStore<br>Vector+FTS index)]
    Search --> Store
```
ContextChecker uses token counting to determine whether the context exceeds thresholds and automatically splits messages into a "to compact" group and a "to keep" group.
```mermaid
graph LR
    M[messages] --> H[AsMsgHandler<br>Token counting]
    H --> C{total > threshold?}
    C -->|No| K[Return all messages]
    C -->|Yes| S[Keep from tail<br>reserve tokens]
    S --> CP[messages_to_compact<br>Earlier messages]
    S --> KP[messages_to_keep<br>Recent messages]
    S --> V{is_valid<br>Tool calls aligned?}
```
- Core logic: keep `reserve` tokens from the tail; mark the rest as messages to compact.
- Integrity guarantee: preserves complete user-assistant turns and tool_use/tool_result pairs without splitting them.
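The tail-reserve split can be sketched as follows (function and variable names here are illustrative, not the actual ContextChecker API):

```python
# Illustrative sketch of the tail-reserve split (names are assumptions,
# not the real ContextChecker interface).
def split_messages(messages, token_counts, threshold, reserve):
    """Keep ~`reserve` tokens from the tail; mark earlier messages for compaction."""
    total = sum(token_counts)
    if total <= threshold:
        return [], messages  # under the threshold: nothing to compact
    kept_tokens, cut = 0, len(messages)
    for i in range(len(messages) - 1, -1, -1):
        if kept_tokens + token_counts[i] > reserve:
            break
        kept_tokens += token_counts[i]
        cut = i
    # A real implementation would also shift `cut` so that user-assistant turns
    # and tool_use/tool_result pairs are never split across the boundary.
    return messages[:cut], messages[cut:]

to_compact, to_keep = split_messages(
    ["a", "b", "c", "d"], [50, 50, 30, 20], threshold=100, reserve=60
)
```

Here the first two messages (100 tokens) would be compacted, while the last two (50 tokens, within the 60-token reserve) are kept verbatim.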
Compactor uses a ReActAgent to compact conversation history into a *structured context summary*.
```mermaid
graph LR
    M[messages] --> H[AsMsgHandler<br>format_msgs_to_str]
    H --> A[ReActAgent<br>reme_compactor]
    P[previous_summary] -->|Incremental update| A
    A --> S[Structured summary<br>Goal/Progress/Decisions...]
```
Summary structure (context checkpoints):

| Field | Description |
|---|---|
| `## Goal` | User goals |
| `## Constraints` | Constraints and preferences |
| `## Progress` | Task progress |
| `## Key Decisions` | Key decisions |
| `## Next Steps` | Next-step plans |
| `## Critical Context` | Critical data such as file paths, function names, error messages, etc. |
- Incremental updates: when `previous_summary` is provided, new conversations are merged into the existing summary.
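A checkpoint in this structure might look like the following (all contents are hypothetical and purely illustrative):

```markdown
## Goal
Migrate the data pipeline from pandas to polars.

## Constraints
Python 3.10; no breaking changes to the public API.

## Progress
Loader module migrated; tests passing.

## Key Decisions
Kept the CSV reader on pandas for legacy encodings.

## Next Steps
Migrate the aggregation module.

## Critical Context
src/pipeline/loader.py, `load_frames()`, "UnicodeDecodeError: invalid start byte"
```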
Summarizer uses a ReAct + file tools pattern so that the AI can decide what to write and where to write it.
```mermaid
graph LR
    M[messages] --> A[ReActAgent<br>reme_summarizer]
    A -->|read| R[Read memory/YYYY-MM-DD.md]
    R --> T{Reason: how to merge?}
    T -->|write| W[Overwrite]
    T -->|edit| E[Edit in place]
    W --> F[memory/YYYY-MM-DD.md]
    E --> F
```
File tools (FileIO):
| Tool | Function |
|---|---|
| `read` | Read file content |
| `write` | Overwrite file |
| `edit` | Find-and-replace edit |
ToolResultCompactor addresses the problem of long tool outputs bloating the context.
```mermaid
graph LR
    M[messages] --> L{Iterate tool_result<br>len > threshold?}
    L -->|No| K[Keep as-is]
    L -->|Yes| T[truncate_text<br>Truncate to threshold]
    T --> S[Write full content<br>tool_result/uuid.txt]
    S --> R[Append file path reference<br>to message]
    R --> C[cleanup_expired_files<br>Delete expired files]
```
- Auto cleanup: expired files (older than `retention_days`) are deleted automatically during `start` / `close` / `compact_tool_result`.
MemorySearch provides vector + BM25 hybrid retrieval.
```mermaid
graph LR
    Q[query] --> E[Embedding<br>Vectorization]
    E --> V[vector_search<br>Semantic similarity]
    Q --> B[BM25<br>Keyword matching]
    V -->|" weight: 0.7 "| M[Deduplicate + weighted merge]
    B -->|" weight: 0.3 "| M
    M --> F[min_score filter]
    F --> R[Top-N results]
```
- Fusion mechanism: vector weight 0.7 + BM25 weight 0.3 — balancing semantic similarity and exact matches.
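The fusion step can be sketched as follows (weights taken from the docs: 0.7 vector, 0.3 BM25; the exact normalization and deduplication in MemorySearch may differ):

```python
# Sketch of weighted score fusion with the documented 0.7 / 0.3 weights.
# Score normalization here is an assumption: both inputs map doc_id -> [0, 1].
def fuse(vector_hits, bm25_hits, w_vec=0.7, w_bm25=0.3, min_score=0.0, top_n=5):
    """Merge vector and BM25 hits; deduplicate by doc_id via weighted sum."""
    scores = {}
    for doc_id, s in vector_hits.items():
        scores[doc_id] = scores.get(doc_id, 0.0) + w_vec * s
    for doc_id, s in bm25_hits.items():
        scores[doc_id] = scores.get(doc_id, 0.0) + w_bm25 * s
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [(d, s) for d, s in ranked if s >= min_score][:top_n]

# Doc "b" appears in both result lists and gets one merged score.
results = fuse({"a": 0.9, "b": 0.4}, {"b": 1.0, "c": 0.8})
```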
ReMeInMemoryMemory extends AgentScope's InMemoryMemory to provide token-aware memory management.
```mermaid
graph LR
    C[content] --> G[get_memory<br>exclude_mark=COMPRESSED]
    G --> F[Filter out compressed messages]
    F --> P{prepend_summary?}
    P -->|Yes| S[Prepend previous summary]
    S --> O[Output messages]
    P -->|No| O
```
| Function | Description |
|---|---|
| `get_memory` | Filter messages by mark and auto-append summary |
| `estimate_tokens` | Estimate token usage of the context |
| `state_dict` / `load_state_dict` | Serialize/deserialize state (session persistence) |
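A rough sketch of what `estimate_tokens` computes (the returned field names mirror the usage example above, but the ~4-characters-per-token heuristic is an assumption, not the real counting method):

```python
# Hypothetical sketch of token-aware usage estimation. The field names match
# the earlier example; the counting heuristic (~4 chars/token) is an assumption.
def estimate_tokens(messages, max_input_length,
                    count_tokens=lambda s: max(1, len(s) // 4)):
    """Estimate how much of the context window the messages consume."""
    messages_tokens = sum(count_tokens(m["content"]) for m in messages)
    estimated_tokens = messages_tokens  # a real system adds system prompt / summary
    return {
        "messages_tokens": messages_tokens,
        "estimated_tokens": estimated_tokens,
        "context_usage_ratio": 100.0 * estimated_tokens / max_input_length,
    }

stats = estimate_tokens([{"content": "hello world"}], max_input_length=1000)
```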
This is a unified entry point that wires all the above components together and automatically manages context before each reasoning step.
```mermaid
graph LR
    M[messages] --> TC[compact_tool_result<br>Compact long tool outputs]
    TC --> CC[check_context<br>Compute remaining space]
    CC --> D{messages_to_compact<br>Non-empty?}
    D -->|No| K[Return original messages + summary]
    D -->|Yes| V{is_valid?}
    V -->|No| K
    V -->|Yes| CM[compact_memory<br>Sync summary generation]
    V -->|Yes| SM[add_async_summary_task<br>Async persistence]
    CM --> R[Return messages_to_keep + new summary]
```
Execution flow:

1. `compact_tool_result` — compact long tool outputs.
2. `check_context` — check whether the context exceeds limits.
3. `compact_memory` — generate a compact summary (sync).
4. `summary_memory` — persist memory (async, in the background).
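This control flow can be sketched as follows (simplified and synchronous; the real hook is async and wires ReMeLight's own components, so every name below is illustrative):

```python
# Conceptual sketch of the pre-reasoning hook pipeline. All parameter names
# and stub behaviors are hypothetical — the real hook is async and uses
# ReMeLight's ContextChecker / Compactor / Summarizer internally.
def pre_reasoning_hook(messages, summary, *, split, compact, persist_async, shrink_tools):
    messages = shrink_tools(messages)           # 1. compact long tool outputs
    to_compact, to_keep = split(messages)       # 2. check context and split
    if not to_compact:                          # under the limit: no-op
        return messages, summary
    new_summary = compact(to_compact, summary)  # 3. generate summary (sync)
    persist_async(to_compact)                   # 4. persist memory (async in reality)
    return to_keep, new_summary

# Toy stubs to make the flow concrete (all hypothetical):
persisted = []
kept, summary = pre_reasoning_hook(
    ["m1", "m2", "m3"], "",
    split=lambda ms: (ms[:-1], ms[-1:]),  # pretend the limit is exceeded
    compact=lambda ms, prev: f"summary of {len(ms)} messages",
    persist_async=persisted.extend,
    shrink_tools=lambda ms: ms,
)
```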
The vector-based ReMe is the core class of the vector-based memory system. It manages three types of memories:
| Memory type | Use case |
|---|---|
| Personal memory | Records user preferences and habits |
| Procedural memory | Records task execution experience and patterns of success/failure |
| Tool memory | Records tool usage experience and parameter tuning |
| Method | Function | Description |
|---|---|---|
| `summarize_memory` | 🧠 Summarize | Automatically extract and store memories from conversations |
| `retrieve_memory` | 🔍 Retrieve | Retrieve related memories based on a query |
| `add_memory` | ➕ Add | Manually add memories into the vector store |
| `get_memory` | 📖 Get | Get a single memory by ID |
| `update_memory` | ✏️ Update | Update existing memory content or metadata |
| `delete_memory` | 🗑️ Delete | Delete a specific memory |
| `list_memory` | 📋 List | List memories with filtering and sorting |
Installation and environment configuration are the same as ReMeLight.
API keys are configured via environment variables and can be stored in a .env file at the project root.
```python
import asyncio

from reme import ReMe


async def main():
    # Initialize ReMe
    reme = ReMe(
        working_dir=".reme",
        default_llm_config={
            "backend": "openai",
            "model_name": "qwen3.5-plus",
        },
        default_embedding_model_config={
            "backend": "openai",
            "model_name": "text-embedding-v4",
            "dimensions": 1024,
        },
        default_vector_store_config={
            "backend": "local",  # Supports local/chroma/qdrant/elasticsearch
        },
    )
    await reme.start()

    messages = [
        {"role": "user", "content": "Help me write a Python script", "time_created": "2026-02-28 10:00:00"},
        {"role": "assistant", "content": "Sure, I'll help you with that.", "time_created": "2026-02-28 10:00:05"},
    ]

    # 1. Summarize memories from the conversation (automatically extract user preferences, task experience, etc.)
    result = await reme.summarize_memory(
        messages=messages,
        user_name="alice",  # Personal memory
        # task_name="code_writing",  # Procedural memory
    )
    print(f"Summary result: {result}")

    # 2. Retrieve related memories
    memories = await reme.retrieve_memory(
        query="Python programming",
        user_name="alice",
        # task_name="code_writing",
    )
    print(f"Retrieved memories: {memories}")

    # 3. Manually add a memory
    memory_node = await reme.add_memory(
        memory_content="The user prefers concise code style.",
        user_name="alice",
    )
    print(f"Added memory: {memory_node}")
    memory_id = memory_node.memory_id

    # 4. Get a single memory by ID
    fetched_memory = await reme.get_memory(memory_id=memory_id)
    print(f"Fetched memory: {fetched_memory}")

    # 5. Update memory content
    updated_memory = await reme.update_memory(
        memory_id=memory_id,
        user_name="alice",
        memory_content="The user prefers concise code with comments.",
    )
    print(f"Updated memory: {updated_memory}")

    # 6. List all memories for the user (supports filtering and sorting)
    all_memories = await reme.list_memory(
        user_name="alice",
        limit=10,
        sort_key="time_created",
        reverse=True,
    )
    print(f"User memory list: {all_memories}")

    # 7. Delete a specific memory
    await reme.delete_memory(memory_id=memory_id)
    print(f"Deleted memory: {memory_id}")

    # 8. Delete all memories (use with care)
    # await reme.delete_all()

    await reme.close()


if __name__ == "__main__":
    asyncio.run(main())
```

```mermaid
graph LR
    User[User / Agent] --> ReMe[Vector Based ReMe]
    ReMe --> Summarize[Summarize memories]
    ReMe --> Retrieve[Retrieve memories]
    ReMe --> CRUD[CRUD operations]
    Summarize --> PersonalSum[PersonalSummarizer]
    Summarize --> ProceduralSum[ProceduralSummarizer]
    Summarize --> ToolSum[ToolSummarizer]
    Retrieve --> PersonalRet[PersonalRetriever]
    Retrieve --> ProceduralRet[ProceduralRetriever]
    Retrieve --> ToolRet[ToolRetriever]
    PersonalSum --> VectorStore[Vector database]
    ProceduralSum --> VectorStore
    ToolSum --> VectorStore
    PersonalRet --> VectorStore
    ProceduralRet --> VectorStore
    ToolRet --> VectorStore
```
Coming soon...
Our procedural (task) memory paper is available on arXiv.
We evaluate ReMe on the Appworld environment using Qwen3-8B (non-thinking mode):
| Method | Avg@4 | Pass@4 |
|---|---|---|
| w/o ReMe | 0.1497 | 0.3285 |
| w/ ReMe | 0.1706 (+2.09%) | 0.3631 (+3.46%) |
Pass@K measures the probability that at least one of K generated candidates successfully completes the task (score=1). The current experiments use an internal AppWorld environment, which may differ slightly from the public version.
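For reference, Avg@K and Pass@K can be computed from per-task run scores as follows (illustrative; the actual evaluation harness may differ):

```python
# Sketch of Avg@K / Pass@K from a score matrix: one row per task,
# K runs per row, score 1 = the run completed the task.
def avg_at_k(scores_per_task):
    """Mean score over all runs of all tasks."""
    runs = [s for task in scores_per_task for s in task]
    return sum(runs) / len(runs)

def pass_at_k(scores_per_task):
    """Fraction of tasks where at least one of the K runs scores 1."""
    return sum(1 for task in scores_per_task if max(task) == 1) / len(scores_per_task)

# Three hypothetical tasks, K=4 runs each:
scores = [[0, 1, 0, 0], [0, 0, 0, 0], [1, 1, 1, 0]]
```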
For more details on how to reproduce the experiments, see quickstart.md.
We evaluate ReMe on the BFCL-V3 multi-turn-base task (random split 50 train / 150 val) using Qwen3-8B (thinking mode):
| Method | Avg@4 | Pass@4 |
|---|---|---|
| w/o ReMe | 0.4033 | 0.5955 |
| w/ ReMe | 0.4450 (+4.17%) | 0.6577 (+6.22%) |
For more details on how to reproduce the experiments, see quickstart.md.
- Star & Watch: Starring helps more agent developers discover ReMe; Watching keeps you up to date with new releases and features.
- Share your results: Share how ReMe empowers your agents in Issues or Discussions — we are happy to showcase great community use cases.
- Need a new feature? Open a feature request; we’ll evolve ReMe together with the community.
- Code contributions: All forms of contributions are welcome. Please see the contribution guide.
- Acknowledgements: We thank excellent open-source projects such as OpenClaw, Mem0, MemU, and CoPaw for their inspiration and support.
Thanks to all who have contributed to ReMe:
```bibtex
@software{AgentscopeReMe2025,
  title  = {AgentscopeReMe: Memory Management Kit for Agents},
  author = {ReMe Team},
  url    = {https://reme.agentscope.io},
  year   = {2025}
}
```

This project is open-sourced under the Apache License 2.0. See LICENSE for details.
ReMe stands for Remember Me and Refine Me, symbolizing our goal to help AI agents "remember" users and "refine" themselves through interactions. We hope ReMe is not just a cold memory module, but a partner that truly helps agents understand users, accumulate experience, and continuously evolve.
