Skip to content

fix: clean up graph store data on Memory.delete()#4505

Merged
whysosaket merged 2 commits intomainfrom
fix/graph-cleanup-on-memory-delete
Mar 23, 2026
Merged

fix: clean up graph store data on Memory.delete()#4505
whysosaket merged 2 commits intomainfrom
fix/graph-cleanup-on-memory-delete

Conversation

@utkarsh240799
Copy link
Copy Markdown
Contributor

Linked Issue

Closes #3245

Description

When deleting a memory via Memory.delete(memory_id), the graph store (Neo4j, Memgraph, Kuzu, Neptune, Apache AGE) was not cleaned up — only the vector store and history DB were updated. This led to orphaned graph nodes and relationships accumulating over time, causing data inconsistency, storage bloat, and stale query results.

Root cause: _delete_memory() only handled vector store deletion + history recording. No graph backend had a delete() method for single-memory cleanup.

Fix: Add a delete(data, filters) method to every graph backend that reuses the existing entity extraction pipeline (_retrieve_nodes_from_data_establish_nodes_relations_from_data_delete_entities) to identify and remove graph relationships associated with the deleted memory text. Hook this into Memory.delete() and AsyncMemory.delete().

Key design decisions

  • No new Cypher queries — reuses each backend's existing _delete_entities() method, preserving per-backend semantics (Neo4j soft-deletes via r.valid=false, all others hard-delete via DELETE r)
  • No double vector store fetchdelete() fetches the memory once and passes it to _delete_memory() via a new optional existing_memory parameter (backward-compatible default None)
  • Resilient — graph cleanup is wrapped in try/except at both the Memory level and graph backend level, so failures never block vector store deletion
  • delete_all() unchanged — continues to use bulk graph.delete_all(), no per-memory graph cleanup (avoids N×LLM calls)
  • No interference with add() pipeline_delete_memory() (called during implicit LLM-driven deletes in _add_to_vector_store) does NOT do graph cleanup, because the graph pipeline (_add_to_graph) runs in parallel and manages its own conflict resolution independently

Files changed

File Change
mem0/memory/graph_memory.py Add delete(data, filters) — Neo4j, soft-delete
mem0/memory/memgraph_memory.py Add delete(data, filters) — Memgraph, hard-delete
mem0/memory/kuzu_memory.py Add delete(data, filters) — Kuzu, hard-delete
mem0/graphs/neptune/base.py Add delete(data, filters) — Neptune (both DB and Analytics), hard-delete
mem0/memory/apache_age_memory.py Add delete(data, filters) — Apache AGE, hard-delete
mem0/memory/main.py Memory.delete() and AsyncMemory.delete() now perform graph cleanup; _delete_memory() accepts optional pre-fetched memory

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Refactor (no functional changes)
  • Documentation update

Breaking Changes

N/A

Test Coverage

  • I added/updated unit tests
  • I added/updated integration tests
  • I tested manually (describe below)
  • No tests needed (explain why)

Unit tests (tests/test_graph_delete.py — 15 tests, always run in CI)

Test What it verifies
test_delete_calls_graph_cleanup_when_graph_enabled graph.delete() called with correct text and filters
test_delete_skips_graph_when_not_enabled Zero behavior change when enable_graph=False
test_delete_continues_if_graph_cleanup_fails Graph exception doesn't block vector deletion
test_delete_skips_graph_when_no_user_id No graph call when memory lacks user_id
test_delete_skips_graph_when_no_memory_text No graph call when memory has no text data
test_delete_passes_all_filters_to_graph user_id, agent_id, run_id all forwarded
test_async_delete_calls_graph_cleanup Async variant works correctly
test_async_delete_continues_if_graph_cleanup_fails Async resilience on graph failure
test_delete_raises_for_nonexistent_memory_with_graph_enabled ValueError raised, graph untouched
test_async_delete_raises_for_nonexistent_memory_with_graph_enabled Async variant of above
test_delete_all_does_not_trigger_per_memory_graph_cleanup delete_all() uses bulk graph.delete_all() only
test_internal_delete_memory_does_not_trigger_graph_cleanup _delete_memory() does NOT call graph.delete() (safe for parallel add() pipeline)
test_graph_memory_delete_calls_internal_methods Neo4j MemoryGraph.delete() calls the correct internal pipeline
test_graph_memory_delete_skips_when_no_entities No-op when LLM extracts zero entities
test_graph_memory_delete_handles_exception Backend-level exception caught and logged

Kuzu e2e tests (tests/test_graph_delete_e2e.py — 14 tests, skipped if kuzu not installed)

Real Kuzu embedded database with mocked LLM/embedder:

Test What it verifies
test_add_creates_nodes_and_edges Baseline: add() creates real graph data
test_delete_removes_edges_created_by_add delete() removes relationships from real DB
test_delete_only_removes_matching_edges Selective: only targeted relationships removed
test_delete_with_different_user_id User isolation at graph level
test_delete_nonexistent_relationship_is_safe No-op on missing data
test_delete_with_llm_failure_does_not_raise LLM crash handled gracefully
test_delete_with_empty_entity_extraction Empty LLM response → graph untouched
test_delete_all_removes_everything_for_user Bulk delete baseline
test_add_delete_add_cycle Add → delete → re-add works
test_memory_delete_triggers_graph_cleanup Full Memory.delete() stack cleans graph
test_memory_delete_with_graph_preserves_other_users_data User isolation at Memory API level
test_memory_delete_graph_failure_still_deletes_vector Graph failure → vector deletion proceeds
test_memory_delete_all_uses_bulk_not_per_memory delete_all() uses graph.delete_all()
test_memory_delete_nonexistent_raises_without_graph_side_effects Non-existent → ValueError, graph untouched

Docker e2e tests (tests/test_graph_delete_docker.py — 23 tests, skipped if Docker DBs unavailable)

Tests against real database instances with mocked LLM/embedder. All auto-skip in CI.

Backend Docker image Tests Delete semantics verified
Neo4j neo4j:5.23 6 Soft-delete (r.valid=false) confirmed
Memgraph memgraph/memgraph 6 Hard-delete (DELETE r) confirmed
Apache AGE apache/age 6 Hard-delete (DELETE r) confirmed
Neptune Neo4j as OpenCypher proxy 5 Hard-delete + user_id string type assertion

Each backend tests: add, delete, selective delete, user isolation, delete_all, add-delete-add cycle.

Manual testing

All 23 Docker tests were run locally against real Neo4j 5.23, Memgraph, and Apache AGE containers (all passed). Neptune was tested via Neo4j since it uses the same OpenCypher query language.

Checklist

  • My code follows the project's style guidelines
  • I have performed a self-review of my code
  • I have added tests that prove my fix/feature works
  • New and existing tests pass locally
  • I have updated documentation if needed

🤖 Generated with Claude Code

When deleting a memory via Memory.delete(), the graph store (Neo4j,
Memgraph, Kuzu, Neptune, Apache AGE) was not cleaned up — only the
vector store and history DB were. This led to orphaned graph nodes and
relationships accumulating over time.

Add a delete(data, filters) method to every graph backend that reuses
the existing entity extraction pipeline (_retrieve_nodes_from_data →
_establish_nodes_relations_from_data → _delete_entities) to identify
and remove graph relationships associated with the deleted memory.
Hook this into Memory.delete() and AsyncMemory.delete(), with
try/except so graph failures never block vector store deletion.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@kartik-mem0
Copy link
Copy Markdown
Contributor

hey @utkarsh240799 please address the failing ci

…assertion

- Docker e2e tests now do a fast TCP socket check before attempting
  database connections, so they skip instantly in CI where no Docker
  containers are running (previously the driver-level timeout was too
  slow or the connection returned a Mock)
- Fix test_main.py::test_delete to expect the new existing_memory
  second argument passed to _delete_memory()

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@whysosaket whysosaket merged commit 5332741 into main Mar 23, 2026
7 checks passed
@whysosaket whysosaket deleted the fix/graph-cleanup-on-memory-delete branch March 23, 2026 13:57
jamebobob pushed a commit to jamebobob/mem0-vigil-recall that referenced this pull request Mar 29, 2026
Co-authored-by: utkarsh240799 <utkarsh240799@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Memory deletion does not clean up Neo4j graph data

3 participants