One memory layer for agents, teams, and AI systems.
Memory Core is a self-hosted service that gives AI agents a shared, searchable, temporally-aware memory layer. It exposes a single REST API, a Python SDK, and an MCP server — so any agent, script, or tool can read and write memory through the same interface, against the same backend.
Memory Core sits between your agents and their memory backends. It handles:
- Storage — persist observations, decisions, learnings, and events
- Retrieval — semantic search across all stored memory
- Temporal tracking — facts are time-anchored; contradictions auto-invalidate older facts (via Graphiti)
- Routing — queries are dispatched to the right backend (fast vector search vs. temporal graph) based on query flags
- Access — via REST API, Python SDK, or MCP tools (Claude Code / Claude Desktop)
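The dispatch rule in the Routing bullet can be pictured with a toy sketch. This is a minimal illustration with assumed names (`SearchRequest`, `route`, and the backend labels), not the actual `QueryRouter` implementation:

```python
from dataclasses import dataclass

@dataclass
class SearchRequest:
    query: str
    include_temporal: bool = False  # the query flag that triggers graph routing

def route(req: SearchRequest) -> str:
    """Temporal queries go to the graph backend; everything else hits the primary vector store."""
    return "graphiti" if req.include_temporal else "mcp-memory-service"

print(route(SearchRequest("database decisions")))                        # primary backend
print(route(SearchRequest("decision timeline", include_temporal=True)))  # temporal graph
```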
| Capability | Description |
|---|---|
| Unified API | One endpoint regardless of backend — swap or add backends without changing agent code |
| Shared memory | All agents in a team read and write the same namespace |
| Temporal memory | Facts carry valid_at / invalid_at bounds; Graphiti auto-invalidates stale facts |
| MCP server | Claude Code and Claude Desktop can use memory as native tools |
| Python SDK | MemoryClient and AsyncMemoryClient for direct integration |
| Real integration | Currently integrated with evan_core — not a toy demo |
| Local-first | Temporal layer runs on local Ollama — no cloud API key required |
- `POST /api/v1/memories/` — store a memory unit (type, tags, episode grouping, TTL)
- `POST /api/v1/memories/search` — semantic search with optional temporal routing
- `GET /api/v1/memories/{id}` — retrieve by ID
- `DELETE /api/v1/memories/{id}` — delete from all backends
- `GET /health` — backend health across all registered adapters
- MCP tools: `memory_store`, `memory_search`, `memory_recall`, `memory_delete`, `memory_health`
- Python SDK: `MemoryClient`, `AsyncMemoryClient`
```
┌─────────────────────────────────────────────────────────────┐
│                           Clients                           │
│  HTTP API  │  Python SDK  │   MCP (Claude Code / Desktop)   │
└────────────┴──────────────┴─────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│            Memory Core — FastAPI unified service            │
│                         QueryRouter                         │
│   routes by: default → primary │ include_temporal → graph   │
└───────────────────┬─────────────────────┬───────────────────┘
                    │                     │
         ┌──────────▼─────────┐  ┌────────▼──────────────────┐
         │ mcp-memory-service │  │    Graphiti (optional)    │
         │ ChromaDB + SQLite  │  │ FalkorDB knowledge graph  │
         │  semantic search   │  │  temporal fact tracking   │
         │ (primary backend)  │  │  LLM: Anthropic/OpenAI/   │
         └────────────────────┘  │      Ollama (local)       │
                                 └───────────────────────────┘
```
MCP server (stdio / HTTP) wraps the REST API for Claude integration.
Python SDK wraps the REST API for direct code integration.
`evan_core` uses `EvanMemory` / `AsyncEvanMemory` (thin SDK wrappers).
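Because the SDK and MCP server are both thin layers over the REST API, any HTTP client can talk to Memory Core directly. A stdlib-only sketch using the endpoint paths from the API surface above (the helper names and request shape here are ours, not part of the project):

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # assumes a locally running Memory Core

def build_store_request(content: str, namespace: str = "default", **fields):
    """Assemble the URL and JSON body for POST /api/v1/memories/."""
    url = f"{BASE_URL}/api/v1/memories/"
    body = json.dumps({"content": content, "namespace": namespace, **fields})
    return url, body

def store_memory(content: str, namespace: str = "default", **fields) -> dict:
    """Send the store request; requires the API to be running."""
    url, body = build_store_request(content, namespace, **fields)
    req = urllib.request.Request(
        url,
        data=body.encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:  # raises on HTTP errors
        return json.loads(resp.read())
```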
- Python 3.11+
- Docker (for mcp-memory-service and optionally FalkorDB)
```bash
docker run -d --name mcp-memory \
  -p 8001:8001 \
  -e HOST=0.0.0.0 -e PORT=8001 \
  -e MEMORY_STORAGE_PATH=/data/memory.sqlite \
  -v mcp_memory_data:/data \
  doobidoo/mcp-memory-service:latest
```

Or use Docker Compose (minimal dev stack):

```bash
docker compose -f infra/docker-compose.dev.yml up -d
```

Install the Python dependencies:

```bash
pip install fastapi "uvicorn[standard]" pydantic pydantic-settings httpx fastmcp
```

For temporal memory (optional):

```bash
pip install "graphiti-core[falkordb]" falkordb
```

Start the API:

```bash
MCP_MEMORY_ENABLED=true \
MCP_MEMORY_URL=http://127.0.0.1:8001 \
PYTHONPATH=packages \
uvicorn memory_core_api.main:app --host 127.0.0.1 --port 8000
```

Verify:

```bash
curl http://127.0.0.1:8000/health
```

Expected:
```json
{
  "router": "ok",
  "adapters": {
    "mcp-memory-service": {"status": "ok"}
  },
  "primary": "mcp-memory-service"
}
```

| Variable | Default | Description |
|---|---|---|
| `MCP_MEMORY_ENABLED` | `true` | Enable mcp-memory-service backend |
| `MCP_MEMORY_URL` | `http://127.0.0.1:8001` | mcp-memory-service base URL |
| `MEMORY_CORE_LOG_LEVEL` | `info` | Log level |
| `GRAPHITI_ENABLED` | `false` | Enable Graphiti temporal layer |
| `FALKORDB_HOST` | `localhost` | FalkorDB host |
| `FALKORDB_PORT` | `6379` | FalkorDB port |
| `ANTHROPIC_API_KEY` | — | Anthropic key for Graphiti LLM |
| `OPENAI_API_KEY` | — | OpenAI key for Graphiti LLM |
| `GRAPHITI_USE_OLLAMA` | `false` | Use local Ollama instead of cloud LLM |
| `OLLAMA_BASE_URL` | `http://127.0.0.1:11434/v1` | Ollama API base URL |
| `OLLAMA_LLM_MODEL` | `qwen3:8b` | Ollama chat model |
| `OLLAMA_EMBED_MODEL` | `qwen3-embedding:latest` | Ollama embedding model |
| `OLLAMA_EMBED_DIM` | `4096` | Embedding dimension |
See `.env.example` for a complete template.
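For reference, a local-first Ollama configuration of the variables above might look like this (values taken from the defaults in the table; a sketch in the spirit of `.env.example`, not a copy of it):

```bash
# Primary semantic backend
MCP_MEMORY_ENABLED=true
MCP_MEMORY_URL=http://127.0.0.1:8001

# Temporal layer (optional, local-first)
GRAPHITI_ENABLED=true
GRAPHITI_USE_OLLAMA=true
FALKORDB_HOST=localhost
FALKORDB_PORT=6379
OLLAMA_BASE_URL=http://127.0.0.1:11434/v1
OLLAMA_LLM_MODEL=qwen3:8b
OLLAMA_EMBED_MODEL=qwen3-embedding:latest
OLLAMA_EMBED_DIM=4096

MEMORY_CORE_LOG_LEVEL=info
```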
Temporal memory requires FalkorDB and an LLM for entity extraction. Three options:
Option A — Ollama (local, no API key):
```bash
# Start FalkorDB
docker run -d --name falkordb -p 6379:6379 falkordb/falkordb:latest

# Pull Ollama models
ollama pull qwen3:8b
ollama pull qwen3-embedding:latest

# Start API with temporal enabled
NO_PROXY=127.0.0.1,localhost \
GRAPHITI_ENABLED=true \
GRAPHITI_USE_OLLAMA=true \
OLLAMA_BASE_URL=http://127.0.0.1:11434/v1 \
MCP_MEMORY_ENABLED=true MCP_MEMORY_URL=http://127.0.0.1:8001 \
PYTHONPATH=packages \
uvicorn memory_core_api.main:app --host 127.0.0.1 --port 8000
```

Note: `NO_PROXY=127.0.0.1,localhost` is required on macOS when a system proxy is configured; it prevents the OpenAI-compatible client from routing localhost traffic through the proxy.
Option B — Anthropic:
```bash
GRAPHITI_ENABLED=true ANTHROPIC_API_KEY=sk-ant-... \
MCP_MEMORY_ENABLED=true MCP_MEMORY_URL=http://127.0.0.1:8001 \
PYTHONPATH=packages \
uvicorn memory_core_api.main:app --host 127.0.0.1 --port 8000
```

Option C — OpenAI:

```bash
GRAPHITI_ENABLED=true OPENAI_API_KEY=sk-... \
MCP_MEMORY_ENABLED=true MCP_MEMORY_URL=http://127.0.0.1:8001 \
PYTHONPATH=packages \
uvicorn memory_core_api.main:app --host 127.0.0.1 --port 8000
```

```bash
# Store
curl -X POST http://127.0.0.1:8000/api/v1/memories/ \
  -H "Content-Type: application/json" \
  -d '{
    "content": "We chose FalkorDB for lower operational overhead.",
    "namespace": "my-project",
    "type": "decision",
    "tags": ["database", "architecture"]
  }'

# Search
curl -X POST http://127.0.0.1:8000/api/v1/memories/search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "database architecture",
    "namespace": "my-project",
    "top_k": 5
  }'

# Temporal search (requires Graphiti)
curl -X POST http://127.0.0.1:8000/api/v1/memories/search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "database decision timeline",
    "namespace": "my-project",
    "include_temporal": true
  }'

# Store with temporal dual-write (fire-and-forget to Graphiti)
curl -X POST http://127.0.0.1:8000/api/v1/memories/ \
  -H "Content-Type: application/json" \
  -d '{
    "content": "Alice promoted to VP Engineering in Q1 2026.",
    "namespace": "my-project",
    "type": "decision",
    "include_temporal": true
  }'
```

```python
from memory_core_sdk import MemoryClient

with MemoryClient(namespace="my-project") as mem:
    # Store
    m = mem.store(
        "We standardised on Python 3.11+ across all services.",
        type="decision",
        tags=["python", "standards"],
    )

    # Search
    results = mem.search("python version decision", top_k=5)
    for r in results.memories:
        print(r.content, r.score)

    # Temporal search (requires Graphiti backend)
    temporal = mem.search("python standards timeline", include_temporal=True)
```

Async:
```python
from memory_core_sdk import AsyncMemoryClient

async with AsyncMemoryClient(namespace="my-project") as mem:
    await mem.store("Async agent completed task.")
    results = await mem.search("agent task")
```

Copy `claude-mcp-config.example.json` to `claude-mcp-config.json`, fill in your paths, and add it to your Claude Code settings:
```json
{
  "mcpServers": {
    "memory-core": {
      "command": "python",
      "args": ["-m", "memory_core_mcp"],
      "cwd": "/path/to/memory-core",
      "env": {
        "PYTHONPATH": "/path/to/memory-core/packages",
        "MEMORY_CORE_URL": "http://127.0.0.1:8000",
        "MEMORY_CORE_NS": "default"
      }
    }
  }
}
```

Available tools: `memory_store`, `memory_search`, `memory_recall`, `memory_delete`, `memory_health`.
Memory Core is actively integrated with evan_core — a unified AI control kernel. The EvanMemory wrapper provides agent-scoped memory with shorthand methods aligned to the agent task lifecycle:
```python
from memory.evan_memory import EvanMemory

with EvanMemory(namespace="evan_core", agent_id="planner") as mem:
    # Pre-task context loading
    prior = mem.recall("API rate limit strategy", top_k=5)

    # Store learnings during execution
    mem.remember(
        "OpenAI API: 429 at 100 req/min — added exponential backoff.",
        type="learning",
        tags=["api", "rate-limit"],
        episode_id="task-2026-03-31",
    )

    # Record a temporal milestone
    mem.record_event(
        "Task completed. Centralised API clients in services/api_clients.py.",
        type="decision",
        temporal=True,
    )

    # Query the temporal timeline (via Graphiti)
    timeline = mem.temporal_search("API client architecture decisions")
```

84 tests passing.
| Suite | Tests | Notes |
|---|---|---|
| `tests/unit/` | 5 | Router logic, no external deps |
| `tests/integration/test_sdk.py` | 11 | Requires API + mcp-memory-service |
| `tests/integration/test_api_e2e.py` | 10 | Requires API + mcp-memory-service |
| `tests/integration/test_mcp_server.py` | 11 | Requires API + mcp-memory-service |
| `tests/integration/test_mcp_memory_adapter.py` | 12 | Requires mcp-memory-service |
| `tests/integration/test_graphiti_adapter.py` | 3+ | Requires FalkorDB (`requires_llm` for LLM tests) |
| `tests/integration/test_temporal_routing.py` | 17 | Requires API; Ollama tests marked `requires_ollama` |
| `tests/integration/test_evan_core_integration.py` | 14 | Requires API + `EVAN_CORE_PATH` |
Run unit tests (no external services):

```bash
PYTHONPATH=packages python -m pytest tests/unit/ -v
```

Run integration tests (requires a running API):

```bash
PYTHONPATH=packages python -m pytest tests/integration/ \
  -m "not requires_llm and not requires_ollama" -v
```

Current: v0.1.0 — usable, integration-tested, running in production internally.
Limitations:
- Temporal dual-write is fire-and-forget; there is no confirmation that Graphiti has finished processing before the response is returned
- `GraphitiAdapter.delete()` is not implemented (use Graphiti's graph maintenance APIs directly)
- `GraphitiAdapter.get()` (retrieve by ID) is not implemented; falls back to the primary backend
- FalkorDB `group_id` (namespace) must not contain hyphens — they are auto-sanitized to underscores internally
- The Cognee adapter is present but not verified end-to-end in this release
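The hyphen sanitisation noted above is easy to anticipate in client code if you want predictable `group_id`s. A one-line equivalent (illustrative; Memory Core applies this internally):

```python
def to_falkordb_group_id(namespace: str) -> str:
    """FalkorDB group_ids must not contain hyphens; mirror the internal hyphen-to-underscore rule."""
    return namespace.replace("-", "_")

print(to_falkordb_group_id("my-project"))  # my_project
```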
Next steps:
- More backend adapters (Qdrant-native, Mem0, custom)
- Recall quality improvements (re-ranking, hybrid dense/sparse)
- Knowledge graph ingestion from documents and conversations
- Namespace permission model
MIT — see LICENSE.