[Feature]: Improving Multilingual Memory Extraction in Hermes

### Problem or Use Case

Hermes already has a strong memory architecture, including built-in curated memory (`MEMORY.md` / `USER.md`), SQLite-backed session recall with FTS5, and external providers like Holographic.

The problem is that the current auto-extraction path appears to rely heavily on English-oriented heuristics and regex patterns. That works as a lightweight baseline, but it does not generalize well to Korean and other non-English languages.

For multilingual users, the hard part is not storage but durable-candidate judgment: deciding whether something is a stable preference, a project fact, or just session-local context. In Korean, these signals are often expressed indirectly through discourse context, sentence endings, and paraphrased procedural language rather than patterns like “I prefer ...” or “I always ...”.

As a result, Hermes may miss durable facts in Korean or mixed-language chats, and expanding regex rules per language would increase maintenance cost and false positives.
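To make the gap concrete, here is a minimal sketch. The pattern below is a hypothetical English-oriented rule written for illustration, not Hermes' actual heuristic. It catches an explicit "I prefer ..." statement but misses a Korean request that expresses the same durable preference.

```python
import re

# Hypothetical English-oriented rule, for illustration only.
# This is NOT taken from the Hermes codebase.
PREFERENCE_PATTERN = re.compile(r"\bI\s+(?:always|prefer|usually)\b", re.IGNORECASE)


def looks_like_preference(utterance: str) -> bool:
    """Return True when the utterance matches the English preference pattern."""
    return PREFERENCE_PATTERN.search(utterance) is not None


english = "I prefer short commit messages."
# Korean: "From now on, please keep commit messages short."
# The durable preference is carried by the request form, not by an "I prefer" phrase.
korean = "앞으로 커밋 메시지는 짧게 써 주세요."

print(looks_like_preference(english))  # True
print(looks_like_preference(korean))   # False
```

Adding a Korean-specific pattern would fix this one sentence, but each paraphrase would need another rule, which is the maintenance cost described above.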

### Proposed Solution

Instead of letting heuristics directly decide persistence, Hermes could add an optional LLM-guided candidate extraction stage before persistence.

Suggested pipeline:

1. Conversation history is the input.
2. The LLM extracts structured memory candidates.
3. A policy filter checks durability, sensitivity, confidence, and scope.
4. Surviving candidates are routed to the right store:
   - built-in memory for compact, high-value stable facts
   - Holographic for deeper structured facts
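The routing step could look roughly like the sketch below. The `Candidate` type, its fields, and the `route` function are hypothetical names introduced for illustration. The destination labels are the ones proposed in this issue, and the mapping rules are only one possible choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical shape of one memory candidate that passed the policy filter."""
    text: str
    kind: str         # assumed values: "user_preference", "project_fact", "session_context"
    structured: bool  # True when the fact has entity/relation detail worth a deeper store


def route(candidate: Candidate) -> str:
    """Pick a destination label for a candidate (assumed mapping, not Hermes behavior)."""
    if candidate.kind == "session_context":
        return "discard"          # session-local context is not persisted
    if candidate.structured:
        return "holographic"      # deeper structured facts
    if candidate.kind == "user_preference":
        return "built_in_user"    # compact stable facts about the user
    return "built_in_memory"      # compact stable project facts


print(route(Candidate("Prefers short commit messages", "user_preference", False)))
print(route(Candidate("Service A calls service B over gRPC", "project_fact", True)))
```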

The key principle would be: “The LLM proposes. Hermes decides.”

This keeps extraction separate from persistence and should improve multilingual memory quality without requiring large language-specific rule sets.
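A minimal sketch of the "LLM proposes, Hermes decides" split, assuming the LLM returns a self-assessed proposal and a deterministic gate makes the final call. The `Proposal` fields, the `accept` function, and the `MIN_CONFIDENCE` threshold are all hypothetical and would need tuning against real conversations.

```python
from dataclasses import dataclass

MIN_CONFIDENCE = 0.8  # assumed threshold, not a value from Hermes


@dataclass(frozen=True)
class Proposal:
    """Hypothetical LLM output for one candidate, before any persistence decision."""
    text: str
    durable: bool      # LLM's judgment: stable beyond this session?
    sensitive: bool    # LLM's judgment: contains secrets or private data?
    confidence: float  # LLM's self-reported confidence in [0, 1]
    scope: str         # assumed values: "user", "project", "session"


def accept(proposal: Proposal) -> bool:
    """Deterministic policy gate: the LLM only proposes, this function decides."""
    return (
        proposal.durable
        and not proposal.sensitive
        and proposal.confidence >= MIN_CONFIDENCE
        and proposal.scope != "session"
    )


proposals = [
    Proposal("Prefers answers in Korean", True, False, 0.93, "user"),
    Proposal("API key is sk-...", True, True, 0.99, "user"),
    Proposal("Currently on step 3 of the migration", False, False, 0.90, "session"),
]
kept = [p.text for p in proposals if accept(p)]
print(kept)  # ['Prefers answers in Korean']
```

Because the gate is plain code, it can be unit-tested and audited independently of whichever model produces the proposals.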

A small MVP could be:
1. Add optional LLM candidate extraction at session end
2. Require strict JSON output
3. Add policy-gated persistence
4. Route candidates into `built_in_user`, `built_in_memory`, `holographic`, or `discard`
5. Keep the current heuristic path as fallback
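Steps 2 and 5 could be combined as in the sketch below: the LLM output is accepted only if it parses as strict JSON with exactly the expected keys, and anything else falls back to the existing heuristic path. The schema (`text`, `kind`, `confidence`) and the function names are assumptions for illustration, and `heuristic_fallback` stands in for whatever the current extractor is.

```python
import json
from typing import Callable

# Assumed schema for one candidate; not an existing Hermes format.
REQUIRED_KEYS = {"text", "kind", "confidence"}
KINDS = {"user_preference", "project_fact", "session_context"}


def parse_candidates(raw: str) -> list[dict]:
    """Parse strict JSON; raise ValueError on any deviation from the schema."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of candidates")
    for item in data:
        if not isinstance(item, dict) or set(item) != REQUIRED_KEYS:
            raise ValueError("candidate must have exactly the required keys")
        if not isinstance(item["text"], str) or item["kind"] not in KINDS:
            raise ValueError("invalid text or kind")
        conf = item["confidence"]
        if isinstance(conf, bool) or not isinstance(conf, (int, float)):
            raise ValueError("confidence must be a number")
        if not 0.0 <= conf <= 1.0:
            raise ValueError("confidence must be within [0, 1]")
    return data


def extract_candidates(
    raw_llm_output: str,
    transcript: str,
    heuristic_fallback: Callable[[str], list[dict]],
) -> list[dict]:
    """Use the LLM output when it is valid; otherwise use the heuristic path."""
    try:
        return parse_candidates(raw_llm_output)
    except ValueError:
        return heuristic_fallback(transcript)


good = '[{"text": "Prefers short commit messages", "kind": "user_preference", "confidence": 0.9}]'
bad = "Sure! Here are the memories: ..."
print(len(extract_candidates(good, "transcript", lambda t: [])))  # 1
print(len(extract_candidates(bad, "transcript", lambda t: [])))   # 0
```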

### Additional Context

I originally planned to open this as a GitHub Discussion because the repository documentation mentions Discussions for design proposals and architecture discussions. However, the Discussions route currently appears unavailable from the repository UI, so I am opening it as an Issue instead.

If useful, I can also provide:
- Korean test conversations
- expected candidate extraction examples
- a longer RFC-style version

### Alternatives Considered

I considered extending the current regex / heuristic extraction path, but that seems likely to increase maintenance cost and still perform poorly for Korean and other multilingual cases. I also considered keeping everything provider-specific, but the candidate-extraction pattern seems broadly useful across Hermes memory backends.

### Feature Type

Performance / reliability

### Scope

Medium (few files, < 300 lines)

### Contribution

- [ ] I'd like to implement this myself and submit a PR


[Feature]: Improving Multilingual Memory Extraction in Hermes #9135
