
Fix/windows hook stdio utf8 #1280

Closed
yangshare wants to merge 4 commits into MemPalace:develop from yangshare:fix/windows-hook-stdio-utf8


@yangshare


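This fix extends the Windows UTF-8 stdin handling from #363/#400 to the Claude/Codex hook entry points, so that non-ASCII hook payloads are no longer corrupted by being decoded with the system ANSI codepage before json.load(sys.stdin) runs.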
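
A minimal sketch of the failure mode and the fix described above, not the PR's actual code. The helper name `load_hook_payload` and the sample payload are hypothetical; the example simulates stdin with an in-memory byte stream and uses cp1252 as a stand-in for the system ANSI codepage.

```python
import io
import json


def load_hook_payload(raw_stdin):
    """Parse a JSON hook payload from a binary stream, always decoding as UTF-8.

    `raw_stdin` is any binary file object, e.g. sys.stdin.buffer.
    (Hypothetical helper, for illustration only.)
    """
    text_stream = io.TextIOWrapper(raw_stdin, encoding="utf-8", errors="strict")
    return json.load(text_stream)


# A hook payload containing non-ASCII text, encoded as UTF-8 by the sender.
payload_bytes = json.dumps({"prompt": "记忆宫殿"}, ensure_ascii=False).encode("utf-8")

# What an ANSI-codepage text stream would produce: the JSON still parses,
# but the string content is silently corrupted (mojibake).
ansi_stream = io.TextIOWrapper(
    io.BytesIO(payload_bytes), encoding="cp1252", errors="replace"
)
corrupted = json.load(ansi_stream)

# Decoding the same bytes as UTF-8 round-trips the original text.
decoded = load_hook_payload(io.BytesIO(payload_bytes))
```

In a real hook entry point the call would look like `load_hook_payload(sys.stdin.buffer)`, assuming stdin has not been replaced by an object without a `.buffer` attribute.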

- Add mempalace.yaml to .gitignore
- Add entities.json to .gitignore
- Add explanatory comments for the MemPalace project files
- Resolve the problem described in issue MemPalace#185
yangshare added 2 commits May 1, 2026 11:21
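- Implemented read tools: status query, wing/room listing, taxonomy retrieval, semantic search, duplicate check
- Implemented write tools: drawer add/delete/update, knowledge graph operations, agent diary
- Integrated the ChromaDB backend and knowledge graph storage
- Added a write-ahead log (WAL) for auditing and rollback tracking
- Implemented vector search capacity detection and a disable mechanism to prevent crashes
- Added stdin/stdout protection to avoid corrupting the JSON-RPC protocol
- Implemented caching and filesystem change detection to keep data consistent
- Added the AAAK memory dialect specification and the palace protocol definition
- Implemented a complete ChromaCollection adapter class providing a standardized database operation interface
- Added HNSW index tuning configuration to prevent link_lists.bin from bloating during large-scale inserts
- Implemented HNSW segment health checks and quarantine, automatically detecting and isolating corrupted index segments
- Added safe pickle deserialization to prevent malicious files from executing arbitrary code
- Implemented a BLOB sequence ID to integer migration fix for the ChromaDB 0.6.x to 1.5.x upgrade
- Added vector search capacity status checks to prevent inconsistency between the HNSW index and the SQLite database
- Implemented client caching and filesystem freshness checks so that a new database state is detected after a rebuild
- Added a thread-safe HNSW configuration fix to prevent race conditions during concurrent writes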
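
The stdio protection item in the list above is not spelled out in this thread. One common way to keep stray prints from corrupting a JSON-RPC stream is sketched below, as an assumption rather than the commit's actual implementation. `run_with_stdout_guard` and its parameters are hypothetical names.

```python
import contextlib
import io
import json
import sys


def run_with_stdout_guard(handler, request, protocol_out=None, diagnostics=None):
    """Run `handler` with print() output diverted away from the protocol stream.

    Only the final JSON-RPC response is written to `protocol_out`; anything the
    handler (or a library it calls) prints goes to `diagnostics` instead.
    """
    if protocol_out is None:
        protocol_out = sys.stdout
    if diagnostics is None:
        diagnostics = sys.stderr

    # While the handler runs, sys.stdout points at the diagnostics stream.
    with contextlib.redirect_stdout(diagnostics):
        result = handler(request)

    response = {"jsonrpc": "2.0", "id": request.get("id"), "result": result}
    protocol_out.write(json.dumps(response) + "\n")
    protocol_out.flush()


def noisy_handler(request):
    print("debug noise")  # would otherwise land in the JSON-RPC stream
    return {"ok": True}


protocol_stream, diagnostic_stream = io.StringIO(), io.StringIO()
run_with_stdout_guard(noisy_handler, {"id": 1}, protocol_stream, diagnostic_stream)
```

After the call, `protocol_stream` holds exactly one JSON line and the debug print sits in `diagnostic_stream`.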
@igorls igorls added area/hooks Claude Code hook scripts (Stop, PreCompact, SessionStart) area/windows Windows-specific bugs and compatibility labels May 2, 2026
@igorls igorls added this to the v3.3.5 milestone May 2, 2026
@igorls

igorls commented May 2, 2026

Member

Thanks @yangshare — this is the strict superset of the Windows UTF-8 stdio fixes and we'd like to make it the canonical for mcp_server + hooks_cli (closing #1259 as a duplicate pointing here). Two requests before merge into v3.3.5:

1. Please split out the unrelated changes into a separate PR:

  • mempalace/backends/chroma.py HNSW sync_threshold adjustments
  • .gitignore edits

These are legitimate but don't belong in a stdio-encoding fix. A clean diff makes review and revert much safer.

2. After the split, rebase against current develop to pick up #1303/#1299/#1289 changes to _get_collection.

Note: The CLI / fact_checker reconfigure side is being landed via #1282 (no overlap with this PR — they're complementary). Together they close out #1241/#1242/#1122/#1296.

Once split + rebased, I'll authorize CI on the fork and we can merge.

@yangshare yangshare marked this pull request as draft May 3, 2026 03:24
@yangshare yangshare closed this May 3, 2026
mvalentsev added a commit to mvalentsev/mempalace that referenced this pull request May 3, 2026
The `python -m mempalace.fact_checker --stdin` entry point reads non-ASCII
text through the system ANSI codepage (cp1252/cp1251/cp950) on Windows,
which mojibakes characters before claim-extraction sees them. Reconfigure
stdin/stdout/stderr to UTF-8 with `errors="strict"`, wrapped in try/except
so a replaced stream (Jupyter, test harness) logs a warning rather than
crashing the CLI.

Mirrors the same fix shipped for `mcp_server.py:main()` (MemPalace#400) and
`hooks_cli.py:run_hook()` (MemPalace#1280) -- this is the third and last
stdin-reading entry point in the package.
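
The reconfigure-and-warn pattern in the commit message above can be sketched as follows. This is an illustration under assumptions, not the code from the referenced commit: the function name `reconfigure_stdio_utf8`, its `streams` parameter, and its return value are hypothetical. It relies on `io.TextIOWrapper.reconfigure`, available since Python 3.7.

```python
import io
import logging
import sys

logger = logging.getLogger(__name__)


def reconfigure_stdio_utf8(streams=None):
    """Switch text streams to UTF-8 with errors="strict".

    Returns the names of streams that could not be reconfigured, so callers
    can decide how to react. (Hypothetical helper, for illustration only.)
    """
    if streams is None:
        streams = {"stdin": sys.stdin, "stdout": sys.stdout, "stderr": sys.stderr}

    failed = []
    for name, stream in streams.items():
        try:
            stream.reconfigure(encoding="utf-8", errors="strict")
        except (AttributeError, ValueError, OSError) as exc:
            # Replaced streams (Jupyter, test harnesses) may lack reconfigure(),
            # and a stream that has already been read from refuses the change.
            logger.warning("could not reconfigure %s to UTF-8: %s", name, exc)
            failed.append(name)
    return failed


# Demonstration with stand-in streams: a real text wrapper and a replaced one.
wrapped = io.TextIOWrapper(io.BytesIO(), encoding="cp1252")
replaced = io.StringIO()  # has no reconfigure() method
failed_names = reconfigure_stdio_utf8({"stdin": wrapped, "stdout": replaced})
```

Calling `reconfigure_stdio_utf8()` with no arguments at the top of an entry point, before any read from stdin, would apply the same treatment to the process's own streams.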