Problem
The palace is great at organizing what you know — facts, events, preferences, advice across wings and rooms. But it doesn't track what happened when a memory was used. If a closet or drawer gets retrieved during a conversation and the user says "no, that's wrong" or "yes, exactly", that signal disappears.
Over time this means retrieval can't distinguish between memories that consistently help and ones that consistently mislead.
Proposal
Add an episode layer that records retrieval-and-outcome pairs:
situation → what memories were retrieved → what action/answer resulted → how the user responded
Each episode carries a utility score (0.0–1.0) that adjusts based on feedback:
    FEEDBACK_DELTAS = {
        "confirmed": +0.2,   # user said this was right
        "corrected": -0.1,   # user tweaked it; still partially useful
        "rejected":  -0.3,   # user said no
        "undone":    -0.4,   # action was taken and user had to reverse it
        "ignored":   -0.05,  # no engagement; weak signal
    }

    def clamp(x, lo, hi):
        return max(lo, min(hi, x))

    def update_utility(old_utility, feedback):
        delta = FEEDBACK_DELTAS[feedback]
        return clamp(old_utility + delta, 0.0, 1.0)
The key asymmetry: undone is penalized more heavily than rejected (-0.4 vs -0.3). If the system acted on a memory and the user had to clean up, that's worse than the user catching it before anything happened. This distinction matters for agentic use cases where specialist agents take actions.
What an episode captures
- Situation: domain, topic, one-line summary
- Context snapshot: time of day, day of week, which preferences/patterns were active — because the same query at 9am vs 11pm may have different right answers
- Retrieved memories: which drawers/closets were pulled
- Outcome: what happened as a result
- Feedback: user response + utility delta
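The fields above could map onto a small record type. This is only a sketch: the class names, field names, and the neutral starting utility of 0.5 are assumptions for illustration, not part of the proposal or the TypeScript port.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

FEEDBACK_DELTAS = {
    "confirmed": +0.2,
    "corrected": -0.1,
    "rejected": -0.3,
    "undone": -0.4,
    "ignored": -0.05,
}


@dataclass
class Situation:
    domain: str
    topic: str
    summary: str  # one-line description of what was asked


@dataclass
class Episode:
    situation: Situation
    context: dict              # e.g. {"hour": 9, "weekday": "Mon", "active_preferences": [...]}
    retrieved_ids: list[str]   # IDs of the drawers/closets that were pulled
    outcome: str               # what happened as a result
    feedback: Optional[str] = None
    utility: float = 0.5       # assumed neutral starting value (hypothetical)

    def apply_feedback(self, feedback: str) -> float:
        """Record the user's response and return the delta that was applied."""
        delta = FEEDBACK_DELTAS[feedback]
        self.feedback = feedback
        # Clamp to the 0.0-1.0 range described in the proposal.
        self.utility = max(0.0, min(1.0, self.utility + delta))
        return delta


episode = Episode(
    situation=Situation("calendar", "meetings", "user asked for a meeting slot"),
    context={"hour": 9, "weekday": "Mon"},
    retrieved_ids=["drawer-1"],
    outcome="proposed a 9am slot",
)
applied = episode.apply_feedback("undone")
```

Keeping the delta alongside the feedback label means the history can be replayed later if the weights change.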
How it helps retrieval
Episodes become a signal in retrieval ranking. When multiple drawers match a query, prefer ones linked to episodes with high utility. When a drawer is consistently linked to rejected episodes, demote it. A simple weighted score:
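    from datetime import timedelta

    age = now - created  # both are datetimes
    relevance = (
        0.3 * domain_match
        + 0.3 * topic_match
        + 0.2 * utility_score                # from episode history
        + 0.1 * (age < timedelta(days=7))    # created within the last 7 days
        + 0.1 * (age < timedelta(hours=24))  # extra boost within the last 24 hours
    )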
This biases toward recent, topic-matched, proven-useful memories.
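To make the weighting concrete, here is a rough sketch that scores two candidate drawers with the same domain and topic match but different episode histories. The function name, the tuple layout of the candidates, and the sample values are assumptions; the match inputs are taken to be floats in the 0.0 to 1.0 range.

```python
from datetime import datetime, timedelta


def relevance(domain_match, topic_match, utility_score, created, now):
    """Weighted score from the proposal; the two recency terms stack."""
    age = now - created
    return (
        0.3 * domain_match
        + 0.3 * topic_match
        + 0.2 * utility_score
        + 0.1 * (age < timedelta(days=7))
        + 0.1 * (age < timedelta(hours=24))
    )


now = datetime(2024, 1, 10, 12, 0)

# (drawer_id, domain_match, topic_match, utility_score, created) - made-up values
candidates = [
    ("drawer-a", 1.0, 1.0, 0.1, now - timedelta(days=30)),  # often rejected, old
    ("drawer-b", 1.0, 1.0, 0.9, now - timedelta(hours=2)),  # often confirmed, fresh
]

ranked = sorted(
    candidates,
    key=lambda c: relevance(c[1], c[2], c[3], c[4], now),
    reverse=True,
)
top_id = ranked[0][0]
```

Because a memory created in the last 24 hours also passes the 7-day check, it receives both recency terms, for a total recency weight of 0.2. The weights sum to 1.0, so the score stays in the 0.0 to 1.0 range when the inputs do.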
Where it fits in the palace
Episodes aren't drawers — they don't contain facts. They sit alongside the palace as a parallel store, linked to drawers by ID. Think of them as the palace's journal: "I used this memory in this situation and it went well/badly."
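One way to picture the parallel store is an index from drawer ID to the episodes that used that drawer. The sketch below is in-memory only, and the `EpisodeStore` name, its methods, the mean aggregation, and the 0.5 default for drawers with no history are all assumptions made for illustration.

```python
from collections import defaultdict


class EpisodeStore:
    """Hypothetical journal kept beside the palace; drawers are referenced by ID only."""

    def __init__(self):
        self._episodes = {}                   # episode_id -> record
        self._by_drawer = defaultdict(list)   # drawer_id -> [episode_id, ...]

    def record(self, episode_id, drawer_ids, utility):
        """Store an episode and link it to every drawer it retrieved."""
        self._episodes[episode_id] = {
            "drawer_ids": list(drawer_ids),
            "utility": utility,
        }
        for drawer_id in drawer_ids:
            self._by_drawer[drawer_id].append(episode_id)

    def utility_for(self, drawer_id, default=0.5):
        """Mean utility of the episodes linked to a drawer (assumed aggregation)."""
        episode_ids = self._by_drawer.get(drawer_id, [])
        if not episode_ids:
            return default  # no history yet: fall back to a neutral score
        total = sum(self._episodes[eid]["utility"] for eid in episode_ids)
        return total / len(episode_ids)


store = EpisodeStore()
store.record("ep-1", ["drawer-1", "drawer-2"], utility=0.9)
store.record("ep-2", ["drawer-1"], utility=0.3)
```

The value returned by `utility_for` is what would feed the `utility_score` term in ranking. The palace itself is never modified, since the link runs from episode to drawer ID and not the other way around.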
Reference
We built this in a TypeScript port. The core is ~200 lines and storage-agnostic. Happy to help with a Python version if this direction interests you.