feat(session): add memory cold-storage archival via hotness scoring #620
Add MemoryArchiver that moves cold memories (below a configurable
hotness threshold) to an archive directory, reducing token consumption
from stale abstracts and overviews during retrieval.
- scan() queries vector index for L2 memories and computes hotness scores
- archive() moves cold memories to {parent}/_archive/ via viking_fs.mv()
- restore() recovers archived memories to their original location
- Respects min_age_days to avoid archiving recent memories
- Skips L0/L1 files (abstracts and overviews are never archived)
- Includes dry-run mode and scan_and_archive convenience method
- 30 unit tests covering scan, archive, restore, edge cases
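The path handling behind archive() and restore() can be sketched as two pure string functions. This is a minimal illustration, not the PR's code: the function names `to_archive_uri` and `to_original_uri` and the `viking://` example URI are assumptions made for the example, and the real implementation performs the move through viking_fs.mv().

```python
ARCHIVE_DIR = "_archive"  # directory name taken from the PR description


def to_archive_uri(uri: str) -> str:
    """Map {parent}/{name} to {parent}/_archive/{name} (hypothetical helper)."""
    parent, sep, name = uri.rpartition("/")
    if not sep or not name:
        raise ValueError(f"not a file URI: {uri!r}")
    if parent.split("/")[-1] == ARCHIVE_DIR:
        return uri  # already inside an archive directory
    return f"{parent}/{ARCHIVE_DIR}/{name}"


def to_original_uri(archived_uri: str) -> str:
    """Drop the _archive/ segment to recover the original location."""
    parent, sep, name = archived_uri.rpartition("/")
    grandparent, sep2, last_dir = parent.rpartition("/")
    if not sep or not sep2 or last_dir != ARCHIVE_DIR:
        raise ValueError(f"not an archived URI: {archived_uri!r}")
    return f"{grandparent}/{name}"


if __name__ == "__main__":
    original = "viking://user/memories/prefs/theme.md"  # illustrative URI only
    archived = to_archive_uri(original)
    print(archived)
    print(to_original_uri(archived) == original)
```

Because the archive directory sits directly under the original parent, the mapping is its own inverse and needs no separate record of where a memory came from.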
This contribution was developed with AI assistance (Claude Code).
codeCraft-Ritik
left a comment
Great work! The Python implementation is clean and easy to understand.
qin-ctx
left a comment
Thanks for the well-structured PR and thorough test coverage. Left one design suggestion on scan filtering efficiency.
    now = datetime.now(timezone.utc)
    candidates: List[ArchivalCandidate] = []
[Design] (non-blocking) The server-side filter here uses only Eq("level", 2); all other filtering (scope prefix, _archive exclusion, min age) is done client-side in Python. For large memory stores, this pulls far more records than necessary.
expr.py already provides filters that can push these checks to the server:
    from openviking.storage.expr import And, Eq, PathScope

    filter_expr = And(conds=[
        Eq("level", 2),
        PathScope(prefix=scope_uri),  # scope filtering
        # Could also add a TimeRange on updated_at for min_age_days,
        # and exclude _archive paths if the backend supports it.
    ])

Pushing scope, time range, and archive-exclusion filters to the server would reduce data transfer and client-side iteration significantly.
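For reference, the three client-side checks named in this comment (scope prefix, _archive exclusion, minimum age) amount to roughly the predicate below. It is a standalone sketch under assumptions: the names `min_age_cutoff` and `is_scan_candidate` are hypothetical, the URIs are illustrative, and none of it is taken from the PR or from expr.py. The cutoff timestamp is the value a server-side time-range filter on updated_at would need.

```python
from datetime import datetime, timedelta, timezone

ARCHIVE_SEGMENT = "/_archive/"


def min_age_cutoff(now: datetime, min_age_days: int) -> datetime:
    """Memories updated after this instant are too recent to archive."""
    return now - timedelta(days=min_age_days)


def is_scan_candidate(uri: str, updated_at: datetime,
                      scope_uri: str, cutoff: datetime) -> bool:
    """Hypothetical client-side equivalent of the pushed-down filters."""
    prefix = scope_uri.rstrip("/") + "/"
    in_scope = uri.startswith(prefix)            # scope prefix check
    not_archived = ARCHIVE_SEGMENT not in uri    # skip already-archived items
    old_enough = updated_at <= cutoff            # min_age_days check
    return in_scope and not_archived and old_enough


if __name__ == "__main__":
    now = datetime(2025, 1, 31, tzinfo=timezone.utc)
    cutoff = min_age_cutoff(now, min_age_days=7)
    old = datetime(2025, 1, 1, tzinfo=timezone.utc)
    print(is_scan_candidate("viking://user/memories/a.md", old,
                            "viking://user/memories", cutoff))
```

Each clause filters on a stored field (URI or timestamp) rather than on a computed score, which is what makes these checks suitable for evaluation by the index itself.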
Good catch on the filter push-down. The PathScope and Eq filters would reduce the data transferred significantly for large stores. I'll update to push scope prefix and _archive exclusion to the server side.
Does the archiving operation need to take into account possible links between file contents, and will it have any impact on future linking mechanisms? @qin-ctx
Good question. The current implementation archives memories independently based on hotness scoring - it doesn't trace cross-references between memories. If a future linking mechanism is added (e.g., memory-to-memory references), archived memories would need a resolution step during retrieval to check cold storage for linked items. For now, the archival is reversible -
Thanks for your contribution! The archiving idea is great. Looking forward to seeing further refinements on this design.
Summary
Add a `MemoryArchiver` that moves cold memories (below a configurable hotness threshold) to an archive directory, reducing token consumption from stale abstracts and overviews during retrieval. Non-destructive: archived memories can be restored.

Problem Statement
The `hotness_score()` function in `memory_lifecycle.py` and `active_count` tracking in `session.commit()` are already in place, but memories never get cleaned up. Over time, stale memories accumulate, wasting tokens when parent directories regenerate abstracts/overviews.

Evidence
- `active_count` (this PR builds on it)

Proposed Solution
`MemoryArchiver` provides three operations:
- `scan(scope_uri)` - Queries the vector index for all L2 memories under a scope, computes `hotness_score()` for each, returns those below the threshold and older than `min_age_days`.
- `archive(candidates)` - Moves cold memories to `{parent}/_archive/` via `viking_fs.mv()`, which atomically updates the vector index. Supports `dry_run=True`.
- `restore(archived_uri)` - Moves an archived memory back to its original location by removing the `_archive/` path segment.
Key design decisions:

Changes
- `openviking/session/memory_archiver.py` (328 lines) - `MemoryArchiver` class with scan/archive/restore
- `tests/unit/session/test_memory_archiver.py` (423 lines) - 30 unit tests
- `openviking/session/__init__.py` - Export `MemoryArchiver`, `ArchivalCandidate`, `ArchivalResult`

Testing
All 30 unit tests pass. Tests cover:

Future Work
This PR provides the core archival mechanism. Follow-up work could include:
- `ov memory archive` / `ov memory restore`
- `session.commit()` for auto-archival

The `MemoryArchiver` builds on `hotness_score()` in `memory_lifecycle.py` and the `active_count` tracking wired into `session.commit()`. It uses `viking_fs.mv()` to stay within the filesystem paradigm rather than introducing a new deletion mechanism.

This contribution was developed with AI assistance (Claude Code).

Relates to #269, #578, #350
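
As a rough illustration of the threshold selection and dry-run behaviour described under Proposed Solution, the sketch below selects candidates whose score is below a threshold and either reports or performs the moves. Everything here is a stand-in: `ColdCandidate`, `select_cold`, `archive_candidates`, and the injected `mv` callable are hypothetical names, not the PR's `ArchivalCandidate`, `MemoryArchiver`, or `viking_fs.mv()` signatures, and scores are taken as given rather than computed by `hotness_score()`.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class ColdCandidate:
    """Hypothetical stand-in for a scanned memory with a precomputed score."""
    uri: str
    score: float


def select_cold(candidates: Iterable[ColdCandidate],
                threshold: float) -> List[ColdCandidate]:
    """Keep only candidates whose score is strictly below the threshold."""
    return [c for c in candidates if c.score < threshold]


def archive_candidates(cold: Iterable[ColdCandidate],
                       mv: Callable[[str, str], None],
                       dry_run: bool = False) -> List[Tuple[str, str]]:
    """Return the planned (source, target) moves; perform them unless dry_run."""
    planned: List[Tuple[str, str]] = []
    for c in cold:
        parent, _, name = c.uri.rpartition("/")
        target = f"{parent}/_archive/{name}"
        planned.append((c.uri, target))
        if not dry_run:
            mv(c.uri, target)  # injected mover; a real one would be the filesystem move
    return planned


if __name__ == "__main__":
    found = [ColdCandidate("viking://user/memories/a.md", 0.05),
             ColdCandidate("viking://user/memories/b.md", 0.80)]
    moves: List[Tuple[str, str]] = []
    plan = archive_candidates(select_cold(found, threshold=0.1),
                              mv=lambda src, dst: moves.append((src, dst)),
                              dry_run=True)
    print(plan, moves)
```

Returning the planned moves in both modes lets a caller inspect exactly what a real run would do before committing to it.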