
feat: time-decay scoring for search results — prioritize recent memories (#331) #337

Open
matrix9neonebuchadnezzar2199-sketch wants to merge 2 commits into MemPalace:develop from matrix9neonebuchadnezzar2199-sketch:feat/time-decay-scoring


@matrix9neonebuchadnezzar2199-sketch


Closes #331

Summary

Add time-decay scoring so recent memories rank higher than older ones in search results.

Problem

mempalace_search ranks results purely by ChromaDB vector similarity. A tech-stack decision from six months ago and a discussion from yesterday are treated with equal weight. AI agents retrieving outdated decisions as top results can act on stale context without realizing it.

Solution

Decay formula

decay = 0.5 ^ (age_days / half_life_days)
final_score = similarity * decay

With the default 90-day half-life, yesterday's memory keeps ~99% of its score, a 90-day-old memory drops to 50%, and a 180-day-old memory drops to 25%.
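The percentages above can be checked with a few lines of Python. This is a minimal sketch of the formula only; the function name `decay_factor` is chosen here for illustration and is not taken from the PR.

```python
def decay_factor(age_days: float, half_life_days: float = 90.0) -> float:
    """Exponential decay: the factor halves every half_life_days."""
    return 0.5 ** (age_days / half_life_days)


# Reproduce the three data points quoted for the 90-day default.
for age in (1, 90, 180):
    print(f"{age:>3} days -> {decay_factor(age):.3f}")
# Prints:
#   1 days -> 0.992
#  90 days -> 0.500
# 180 days -> 0.250
```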

Config layer (config.py)

  • time_decay_half_life_days property (default: 90, set to 0 to disable)

Search layer (searcher.py)

  • _apply_time_decay(): standalone function that re-ranks hits by applying exponential decay to similarity scores using the filed_at metadata timestamp
  • search_memories() accepts time_decay=True (default) to enable/disable
  • Results include decay, original_similarity, and adjusted similarity fields
  • Missing or invalid filed_at timestamps receive zero penalty (decay=1.0)
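As a rough illustration of the behavior listed above, a re-ranking step could look like the sketch below. It is not the code from searcher.py: the function name `apply_time_decay_sketch`, the flat hit-dict shape, and the ISO-8601 format of `filed_at` are all assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone


def apply_time_decay_sketch(hits, half_life_days=90, now=None):
    """Re-rank hits by similarity * 0.5 ** (age_days / half_life_days)."""
    if half_life_days <= 0:
        return hits  # zero or negative half-life disables decay
    now = now or datetime.now(timezone.utc)
    rescored = []
    for hit in hits:
        decay = 1.0  # missing or invalid timestamps get no penalty
        try:
            filed_at = datetime.fromisoformat(hit["filed_at"])
            if filed_at.tzinfo is None:
                # Assumption: naive timestamps are treated as UTC.
                filed_at = filed_at.replace(tzinfo=timezone.utc)
            age_days = max((now - filed_at).total_seconds() / 86400, 0.0)
            decay = 0.5 ** (age_days / half_life_days)
        except (KeyError, TypeError, ValueError):
            pass
        rescored.append({
            **hit,
            "original_similarity": hit["similarity"],
            "decay": decay,
            "similarity": hit["similarity"] * decay,
        })
    return sorted(rescored, key=lambda h: h["similarity"], reverse=True)


# Illustrative data: an old but very similar hit, a recent one, an undated one.
now = datetime(2026, 4, 1, tzinfo=timezone.utc)
hits = [
    {"text": "old decision", "similarity": 0.90,
     "filed_at": (now - timedelta(days=180)).isoformat()},
    {"text": "yesterday's discussion", "similarity": 0.80,
     "filed_at": (now - timedelta(days=1)).isoformat()},
    {"text": "undated note", "similarity": 0.50},
]
ranked = apply_time_decay_sketch(hits, half_life_days=90, now=now)
for hit in ranked:
    print(hit["text"], round(hit["similarity"], 3), round(hit["decay"], 3))
```

Passing `now` explicitly keeps the example deterministic; in a real call it would default to the current UTC time.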

MCP layer (mcp_server.py)

  • mempalace_search: new time_decay boolean parameter (default true)
  • Agents can pass time_decay=false for historical research queries
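One way to picture the new parameter is as a boolean property in the tool's input schema with a default of true. The fragment below is a hypothetical sketch, not the schema from mcp_server.py: the `query` property, the description text, and the helper `resolve_time_decay` are assumptions for illustration.

```python
# Hypothetical input-schema fragment (JSON Schema expressed as a Python dict).
SEARCH_INPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},  # assumed pre-existing parameter
        "time_decay": {
            "type": "boolean",
            "default": True,
            "description": "Down-weight older memories; set false for historical research.",
        },
    },
    "required": ["query"],
}


def resolve_time_decay(arguments: dict) -> bool:
    """Return the caller's time_decay flag, defaulting to True when omitted."""
    return bool(arguments.get("time_decay", True))


print(resolve_time_decay({"query": "auth design"}))                       # True
print(resolve_time_decay({"query": "auth design", "time_decay": False}))  # False
```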

Non-breaking

  • Default behavior changes ranking order but returns the same data
  • time_decay=false restores exact previous behavior
  • No schema migration needed — uses existing filed_at metadata

Testing

  • 10 new tests in tests/test_time_decay.py covering:
    • Recent vs old ranking reversal
    • Half-life precision (decay ≈ 0.5 at exactly half_life_days)
    • Zero/negative half-life disables decay
    • Missing and invalid filed_at handling
    • original_similarity preservation
    • Sort order correctness
    • Empty results
  • All existing tests pass (99 passed, 2 pre-existing Windows-only failures unrelated to this change)
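For readers who want a feel for the cases listed, here is a sketch of what two of them might look like in pytest style. These are not the tests from tests/test_time_decay.py; `decay_for` is a stand-in helper defined only for this example.

```python
import math


def decay_for(age_days: float, half_life_days: float) -> float:
    """Stand-in decay helper: non-positive half-life means no decay."""
    if half_life_days <= 0:
        return 1.0
    return 0.5 ** (age_days / half_life_days)


def test_half_life_precision():
    # At exactly one half-life the factor should be 0.5.
    assert math.isclose(decay_for(90, 90), 0.5)
    assert math.isclose(decay_for(30, 30), 0.5)


def test_zero_or_negative_half_life_disables_decay():
    assert decay_for(365, 0) == 1.0
    assert decay_for(365, -30) == 1.0


def test_older_memories_decay_more():
    assert decay_for(180, 90) < decay_for(90, 90) < decay_for(1, 90)
```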

Related: #332 (soft-archive wings — the other half of time-aware memory management)

@web3guru888 web3guru888 left a comment


Solid implementation of #331. A few observations from our integration perspective:

What works well:

  • Exponential decay with configurable half-life is the right model. 90-day default is reasonable.
  • _apply_time_decay() preserving original_similarity and decay fields is clean — lets consumers decompose the score.
  • Graceful fallback for missing/invalid filed_at (age=0, no penalty) avoids breaking old memories that lack timestamps.
  • time_decay: bool parameter on tool_search is a good escape hatch.

Integration considerations:

  • We use search_memories() directly in our OODA pipeline. With time_decay=True as the new default, we'll get decay applied automatically on next upgrade. For our Orient (breadth) phase, this could actually be beneficial — cross-domain discovery benefits from recency bias. For Evaluate (precision), we may want time_decay=False to preserve pure similarity ranking. The boolean toggle gives us that control.
  • This composes well with the RetrievalProfile concept from #335 — decay becomes a per-profile parameter alongside k, wing filters, and similarity thresholds.
  • Our tiered dedup (check_duplicate with hardcoded thresholds) is unaffected since decay only applies to search_memories(), not the raw ChromaDB similarity in dedup.
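The per-phase usage described above could be expressed as small profile objects. This is a sketch under assumptions: `RetrievalProfileSketch` and its fields are hypothetical (the actual RetrievalProfile from #335 is not shown in this thread), the `k` values are made up, and only the `time_decay` keyword is confirmed by this PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RetrievalProfileSketch:
    """Hypothetical per-phase retrieval settings."""
    k: int                      # number of results wanted (illustrative)
    time_decay: bool            # whether recency weighting is applied
    wing: Optional[str] = None  # optional wing filter (illustrative)
    min_similarity: float = 0.0

    def search_kwargs(self) -> dict:
        # Only time_decay is known to be accepted by search_memories().
        return {"time_decay": self.time_decay}


ORIENT = RetrievalProfileSketch(k=20, time_decay=True)     # breadth, recency bias
EVALUATE = RetrievalProfileSketch(k=5, time_decay=False)   # pure similarity

print(ORIENT.search_kwargs())    # {'time_decay': True}
print(EVALUATE.search_kwargs())  # {'time_decay': False}
```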

One suggestion: The time_decay field in the response dict is a bool, but callers might want to know the half-life that was applied (especially if it came from config). Consider adding half_life_days to the response metadata alongside time_decay: true.

- Addresses @web3guru888 suggestion to expose the applied half-life
- Response now includes half_life_days (int) when time_decay is true, None when false

Made-with: Cursor
@matrix9neonebuchadnezzar2199-sketch

Author

@web3guru888 Good suggestion — implemented in dd4aa38.

search_memories() response now includes half_life_days alongside time_decay:

  • time_decay=True → "half_life_days": 90 (or whatever the configured value is)
  • time_decay=False → "half_life_days": null

This way your OODA pipeline can inspect which half-life was actually applied without needing to read the config separately. Useful for logging/debugging when different profiles use different settings.
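A minimal sketch of how these two fields might be assembled and serialized follows. The helper `decay_metadata` is hypothetical; it only illustrates that a Python `None` becomes `null` once the response is serialized to JSON.

```python
import json


def decay_metadata(time_decay: bool, configured_half_life_days: int) -> dict:
    """Report the half-life only when decay was actually applied."""
    return {
        "time_decay": time_decay,
        "half_life_days": configured_half_life_days if time_decay else None,
    }


print(json.dumps(decay_metadata(True, 90)))
# {"time_decay": true, "half_life_days": 90}
print(json.dumps(decay_metadata(False, 90)))
# {"time_decay": false, "half_life_days": null}
```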

@web3guru888


Perfect — having half_life_days in the response metadata is exactly what we need for OODA phase logging.

Our Orient pass runs time_decay=True, half_life_days=90 and our Evaluate pass runs time_decay=False. Having the actual applied value in the response means we can log which profile was active without a separate config lookup. Clean separation of concerns.

Will wire this into our session telemetry immediately.

@bensig bensig changed the base branch from main to develop April 11, 2026 22:22
@igorls igorls added the area/mcp (MCP server and tools), area/search (Search and retrieval), and enhancement (New feature or request) labels Apr 14, 2026
jphein referenced this pull request in techempower-org/mempalace Apr 19, 2026
Scanned all 233 open upstream PRs today against our open PRs and
fork-ahead / planned-work items. Findings merged into README:

- P2 (decay) and P3 Tier-0 (LLM rerank): both covered by MemPalace#1032
  (@zackchiutw, MERGEABLE, 2026-04-19 — Weibull decay + 4-stage
  rerank pipeline). Older simpler version at #337. Dropped as
  fork work; watching MemPalace#1032.
- P7 (alternative storage): formally out of scope. RFC 001 MemPalace#743
  (@igorls) defines the plugin contract; four backend PRs already
  in flight (MemPalace#700, MemPalace#381 Qdrant; MemPalace#574, MemPalace#575 LanceDB). Fork consumes,
  does not rebuild.
- P0 (multi-label tags): still fork/upstream candidate. MemPalace#1033
  (@zackchiutw) ships adjacent privacy-tag + progressive disclosure
  but not the full multi-label scheme.
- Merged MemPalace#1023 section acknowledges complementary MemPalace#976 (felipetruman)
  which adds broader mine_global_lock() + HNSW num_threads pin.

Gives future-us a map so we don't re-file MemPalace#1036-style duplicates.
@igorls

igorls commented May 8, 2026

Member

Hi, thanks for the contribution.

This PR has merge conflicts with develop, and the branch has not been updated in over 7 days, which puts it before our most recent release. The conflicts are likely against work that landed in that release.

Could you rebase onto develop so we can take another look?

If this change is no longer relevant, feel free to close the PR.

(This message is part of a periodic backlog pass, sent to all open PRs that match this state.)

@igorls igorls added the needs-rebase label (PR has merge conflicts with develop and needs rebase) May 8, 2026


3 participants