Problem or Use Case
Hermes already has a strong memory architecture, including built-in curated memory (MEMORY.md / USER.md), SQLite-backed session recall with FTS5, and external providers like Holographic.
The problem is that the current auto-extraction path appears to rely heavily on English-oriented heuristics and regex patterns. That works as a lightweight baseline, but it does not generalize well to Korean and other non-English languages.
For multilingual users, the hard part is not storage but durable-candidate judgment: deciding whether something is a stable preference, a project fact, or just session-local context. In Korean, these signals are often expressed indirectly through discourse context, sentence endings, and paraphrased procedural language rather than patterns like “I prefer ...” or “I always ...”.
As a result, Hermes may miss durable facts in Korean or mixed-language chats, and expanding regex rules per language would increase maintenance cost and false positives.
Proposed Solution
Instead of letting heuristics directly decide persistence, Hermes could add an optional LLM-guided candidate extraction stage before persistence.
Suggested pipeline:
conversation history
→ LLM extracts structured memory candidates
→ policy filter checks durability / sensitivity / confidence / scope
→ route to the right store
  - built-in memory for compact, high-value stable facts
  - Holographic for deeper structured facts
The key principle would be: “The LLM proposes. Hermes decides.”
This keeps extraction separate from persistence and should improve multilingual memory quality without requiring large language-specific rule sets.
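As a rough illustration of the "policy filter, then route" step, here is a minimal Python sketch. Everything in it is an assumption made for discussion, not existing Hermes code: the `Candidate` fields, the `route` function, the 0.7 confidence threshold, and the 200-character cutoff for built-in memory are all hypothetical. Only the four destination names come from this proposal.

```python
from dataclasses import dataclass
from typing import Literal

# The four destinations named in this proposal.
Store = Literal["built_in_user", "built_in_memory", "holographic", "discard"]


@dataclass(frozen=True)
class Candidate:
    """Hypothetical shape of one LLM-proposed memory candidate."""
    text: str          # the fact, in whatever language the user wrote it
    kind: str          # e.g. "preference", "project_fact", "session_context"
    durable: bool      # LLM's judgment: stable beyond this session?
    sensitive: bool    # LLM's judgment: contains sensitive data?
    confidence: float  # 0.0 to 1.0
    scope: str         # "user" or "project"


def route(c: Candidate, min_confidence: float = 0.7,
          max_builtin_chars: int = 200) -> Store:
    """Deterministic policy gate: the LLM proposes, this function decides."""
    # Anything sensitive, session-local, or low-confidence is never persisted.
    if c.sensitive or not c.durable or c.confidence < min_confidence:
        return "discard"
    # Long or detailed facts go to the deeper structured store (assumed cutoff).
    if len(c.text) > max_builtin_chars:
        return "holographic"
    # Compact, high-value facts go to built-in memory, split by scope.
    return "built_in_user" if c.scope == "user" else "built_in_memory"
```

Because the gate is plain code rather than another model call, the persistence decision stays testable and language-independent: the LLM only has to judge durability and sensitivity, and the thresholds remain under Hermes's control.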
A small MVP could be:
- Add optional LLM candidate extraction at session end
- Require strict JSON output
- Add policy-gated persistence
- Route candidates into built_in_user, built_in_memory, holographic, or discard
- Keep the current heuristic path as fallback
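To make the "strict JSON output" and "heuristic fallback" items concrete, here is a minimal sketch, assuming the LLM is asked to return a JSON array of objects. The field names, `parse_candidates`, `extract_at_session_end`, and the two callables passed in are hypothetical placeholders, not existing Hermes APIs; only the four target names are taken from the list above.

```python
import json
from typing import Callable, Optional

ALLOWED_TARGETS = {"built_in_user", "built_in_memory", "holographic", "discard"}
# Hypothetical schema: field name -> required Python type after json.loads.
REQUIRED_FIELDS = {"text": str, "durable": bool, "sensitive": bool,
                   "confidence": float, "target": str}


def parse_candidates(raw: str) -> Optional[list[dict]]:
    """Return validated candidates, or None if the output breaks the format."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, list):
        return None
    for item in data:
        # Strict: exactly the expected keys, no extras and none missing.
        if not isinstance(item, dict) or set(item) != set(REQUIRED_FIELDS):
            return None
        for name, expected in REQUIRED_FIELDS.items():
            value = item[name]
            if expected is float:
                # Accept ints and floats, but not booleans, within [0, 1].
                if isinstance(value, bool) or not isinstance(value, (int, float)):
                    return None
                if not 0.0 <= value <= 1.0:
                    return None
            elif not isinstance(value, expected):
                return None
        if item["target"] not in ALLOWED_TARGETS:
            return None
    return data


def extract_at_session_end(history: list[str],
                           llm_extract: Callable[[list[str]], str],
                           heuristic_extract: Callable[[list[str]], list[dict]]
                           ) -> list[dict]:
    """Try the LLM path first; fall back to the existing heuristic path."""
    try:
        parsed = parse_candidates(llm_extract(history))
    except Exception:
        # Any provider or network failure falls through to the fallback.
        parsed = None
    return parsed if parsed is not None else heuristic_extract(history)
```

Rejecting the whole response on any schema violation is a deliberately conservative choice for an MVP: a malformed response costs one fallback to the heuristic path instead of risking a partially trusted write. The `target` field here is only the LLM's suggestion; the policy gate would still make the final decision.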
Additional Context
I originally planned to open this as a GitHub Discussion because the repository documentation mentions Discussions for design proposals and architecture discussions. However, the Discussions route currently appears unavailable from the repository UI, so I am opening it as an Issue instead.
If useful, I can also provide:
- Korean test conversations
- expected candidate extraction examples
- a longer RFC-style version
Alternatives Considered
I considered extending the current regex / heuristic extraction path, but that seems likely to increase maintenance cost and still perform poorly for Korean and other multilingual cases. I also considered keeping everything provider-specific, but the candidate-extraction pattern seems broadly useful across Hermes memory backends.
Feature Type
Performance / reliability
Scope
Medium (few files, < 300 lines)
Contribution