Skip to content

feat: reference library with simplified picker and local-only Zotero#61

Merged
Zhang-Henry merged 3 commits intomainfrom
refactor/simplify-references-picker-remove-cloud-zotero
Mar 26, 2026
Merged

feat: reference library with simplified picker and local-only Zotero#61
Zhang-Henry merged 3 commits intomainfrom
refactor/simplify-references-picker-remove-cloud-zotero

Conversation

@Zhang-Henry
Copy link
Copy Markdown
Collaborator

@Zhang-Henry Zhang-Henry commented Mar 21, 2026

Summary

  • Add reference management: library panel, BibTeX import, local Zotero sync, collection browser, and chat reference picker
  • ReferencePicker only fetches project-scoped references — removed global fetch, dedup logic, server-side search, and "Project" badge
  • Removed cloud Zotero Web API: no more ZoteroWebClient, no credential lookups in routes, no Zotero credentials UI in Settings
  • i18n for en, zh-CN, ko

Add reference management system:
- Reference library panel with project linking, import, and search
- BibTeX file import and local Zotero desktop sync
- Reference picker for attaching references to chat messages
- Zotero collection browser for selective import

ReferencePicker only shows project-scoped references (no global fetch/dedup).
Zotero integration is local-only — removed cloud Web API fallback and
credentials UI from Settings, keeping the setup simple.

i18n: en, zh-CN, ko translations for all reference strings.
Copy link
Copy Markdown
Collaborator

@liuyixin-louis liuyixin-louis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

代码审查 — PR #61: 参考文献库 + 本地 Zotero 集成

对全部 39 个变更文件做了静态分析 + 启动服务器后做了实际 API 测试。


🔴 严重 (3) — 合并前必须修复

1. GET /:idGET /:id/pdf 缺少鉴权检查

  • server/routes/references.js:257,293 + server/database/db.js:867
  • getReference(id) 只按 ID 查询,没有 user_id 过滤。任何已登录用户都能读取其他用户的文献详情和下载其缓存的 PDF。
  • 修复建议:getReference() 中加 user_id 参数,或在路由里校验 ref.user_id === req.user.id

2. PDF 缓存路径遍历漏洞

  • server/routes/references.js:262,277
  • path.join(PDF_CACHE_DIR, \${ref.id}.pdf`) — 文献 ID 包含了 BibTeX citation key(db.js:802),而 citation key 是用户可控的。攻击者可以构造 ../../.ssh/authorized_keys这样的 key,通过fs.createReadStream任意读文件,或通过fs.writeFileSync` 任意写文件。
  • 修复建议:ref.id 做路径清洗(去除路径分隔符、用哈希替代、或校验格式)。

3. 项目文献关联接口缺少鉴权

  • server/routes/references.js:199-232
  • GET/POST/DELETE /project/:projectName/:id 没有验证当前用户是否拥有该项目。任何已登录用户都能查看、关联、取消关联任意项目的文献。
  • API 测试已确认:getProjectReferences() 返回结果时不做用户过滤。

🟠 高 (4) — 建议合并前修复

4. BibTeX 解析器不处理转义引号

  • server/utils/parsers/bibtex-parser.js:125-130
  • title = "The \"Best\" Approach" 会被截断为 The \。在真实 BibTeX 文件中很常见,会造成静默数据损坏。

5. cleanLatex 波浪号替换缺少全局标志

  • server/utils/parsers/bibtex-parser.js:45/~/ 应改为 /~/g
  • J.~K.~RowlingJ. K.~Rowling(只替换了第一个波浪号)。

6. bulkDeleteReferences 无数组大小上限

  • server/routes/references.js:241 + server/database/db.js:973
  • SQLite 的 SQLITE_MAX_VARIABLE_NUMBER 限制为 999。发送超过 999 个 ID 会导致服务器崩溃。路由没有对数组大小做校验。
  • 修复建议:if (ids.length > 500) return 400 或分批删除。

7. 没有 citation key 的 BibTeX 条目重复导入会产生重复记录

  • server/database/db.js:802 — 没有 citationKey 的条目使用 crypto.randomUUID() 作为 ID,ON CONFLICT(id) 永远不会触发 → 每次重新导入都会新增记录。

🟡 中 (8)

8. project_references 表缺少对 projects 的外键 → 项目删除后残留孤立行 (init.sql:144-150)
9. reference_tags 在重新同步时不清理旧标签 — 过时标签永久累积 (db.js:714-765)
10. onReferenceContext 在 ChatComposer 中是可选的 — 用户选了文献、picker 关闭了,但什么也没发生 (ChatComposer.tsx:274-282)
11. ZoteroBrowser 卸载时未清理 setTimeout(1200ms) → 卸载后的状态更新 (ZoteroBrowser.tsx:157-160)
12. collectionKey 查询参数未经清洗直接拼入 Zotero 本地 API URL (zotero-client.js:97-99)
13. Zotero 日期解析:parseInt("15 March 2024") → year=15 而非 2024 (zotero-client.js:42)
14. API 调用中 refId 缺少 URL 编码 (api.js:348-364)
15. 用户拖放非 .bib 文件到 ImportDialog 时没有错误提示 (ImportDialog.tsx:23-31)

🔵 低 (5)

16. ReferenceSearchBar 每次按键都触发 API 请求,没有防抖 (ReferenceSearchBar.tsx:51)
17. navigator.clipboard.writeText 没有 await — 复制失败也会显示"已复制" (ReferenceDetailModal.tsx:23-27)
18. PRAGMA foreign_keys = ON 仅按连接设置,不持久化 (init.sql:2)
19. BibTeX 上传成功后 file input 未重置 — 再次选择同一文件不会触发 onChange (ImportDialog.tsx)
20. Zotero 不可用时"立即同步"按钮没有禁用 (ImportDialog.tsx:94-105)


✅ 做得好的地方

  • SQL 注入:安全 — 全部使用参数化查询
  • BibTeX 导入:格式正确的文件 正常工作
  • 空/畸形 BibTeX:正确拒绝 并返回清晰错误
  • Zotero 错误处理:良好 — 连接拒绝、超时、API 禁用都有友好提示
  • i18n:完整 — en、zh-CN、ko 三语言所有 key 都有
  • 跨用户单条文献 DELETE:已保护(通过 deleteReference 的 user_id 检查返回 404)

测试方法

  • 注册测试用户,通过文件上传导入 BibTeX,测试搜索/分页
  • 在 BibTeX 字段中测试 XSS payload(原始存储但 React 渲染时会转义)
  • 在搜索参数中测试 SQL 注入(安全)
  • 测试跨用户访问单条文献和项目文献
  • Zotero 未运行时测试各 Zotero 接口(优雅错误)
  • 测试批量删除边界情况
  • 在项目名中尝试路径遍历(安全 — 作为数据库字符串使用,非文件系统路径)
  • 对全部 19 个变更文件做静态分析

Copy link
Copy Markdown
Collaborator

@liuyixin-louis liuyixin-louis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review — PR #61: Reference Library with Local-Only Zotero

🤖 This review was generated by Claude Code — static analysis of all changed files + live API testing against a running dev server.


🔴 CRITICAL (3) — Must fix before merge

1. Missing authorization on GET /:id and GET /:id/pdf

  • server/routes/references.js:257,293 + server/database/db.js:867
  • getReference(id) queries by ID only, no user_id check. Any authenticated user can read any other user's reference details and download their cached PDFs.
  • Fix: Add user_id parameter to getReference(), or check ref.user_id === req.user.id in route handlers.

2. Path traversal in PDF cache via unsanitized reference ID

  • server/routes/references.js:262,277
  • path.join(PDF_CACHE_DIR, ${ref.id}.pdf) — reference IDs incorporate the BibTeX citation key (db.js:802), which is user-controlled. A crafted citation key like ../../.ssh/authorized_keys allows arbitrary file read via fs.createReadStream or write via fs.writeFileSync.
  • Fix: Sanitize ref.id before path construction (strip path separators, use hash, or validate pattern).

3. Missing authorization on project reference endpoints

  • server/routes/references.js:199-232
  • GET/POST/DELETE /project/:projectName/:id don't verify user ownership. Any authenticated user can view, link, or unlink references on any project.
  • Confirmed via API test: getProjectReferences() returns results without user_id filtering.

🟠 HIGH (4) — Should fix before merge

4. BibTeX parser doesn't handle escaped quotes

  • server/utils/parsers/bibtex-parser.js:125-130
  • title = "The \"Best\" Approach" truncates to The \. Common in real BibTeX files → silent data corruption.

5. cleanLatex tilde replacement missing global flag

  • server/utils/parsers/bibtex-parser.js:45/~/ → should be /~/g
  • J.~K.~RowlingJ. K.~Rowling (only first tilde replaced).

6. bulkDeleteReferences no upper bound on array size

  • server/routes/references.js:241 + server/database/db.js:973
  • SQLite's SQLITE_MAX_VARIABLE_NUMBER (999) limit. Sending >999 IDs crashes the server. No array size validation in the route handler.
  • Fix: Add if (ids.length > 500) return 400 or batch the deletes.

7. Re-import duplicates for BibTeX entries without citation keys

  • server/database/db.js:802 — entries without citationKey get crypto.randomUUID(), so ON CONFLICT(id) never triggers → unbounded duplicates on re-import.

🟡 MEDIUM (8)

8. project_references table missing FK to projects → orphaned rows on project delete (init.sql:144-150)
9. reference_tags never cleaned up on re-sync — stale tags accumulate forever (db.js:714-765)
10. onReferenceContext is optional in ChatComposer — user selects refs, picker closes, nothing happens (ChatComposer.tsx:274-282)
11. ZoteroBrowser leaks setTimeout(1200ms) on unmount → state update on unmounted component (ZoteroBrowser.tsx:157-160)
12. collectionKey query param injected unsanitized into Zotero local API URL (zotero-client.js:97-99)
13. Zotero date parsing: parseInt("15 March 2024") → year=15, not 2024 (zotero-client.js:42)
14. No URL encoding on refId in API calls (api.js:348-364)
15. No error feedback when user drops non-.bib file in ImportDialog (ImportDialog.tsx:23-31)

🔵 LOW (5)

16. ReferenceSearchBar fires API call on every keystroke, no debounce (ReferenceSearchBar.tsx:51)
17. navigator.clipboard.writeText not awaited — shows "Copied!" even on failure (ReferenceDetailModal.tsx:23-27)
18. PRAGMA foreign_keys = ON only set per-connection, not persisted (init.sql:2)
19. File input not reset after successful BibTeX upload — re-selecting same file won't trigger onChange (ImportDialog.tsx)
20. "Sync Now" button not disabled when Zotero is unavailable (ImportDialog.tsx:94-105)


✅ What works well

  • SQL injection: Safe — parameterized queries throughout
  • BibTeX import: Working correctly for well-formed files
  • Empty/malformed BibTeX: Properly rejected with clear errors
  • Zotero error handling: Good — graceful messages for connection refused, timeout, API disabled
  • i18n: Complete — all keys present in en, zh-CN, ko
  • Cross-user single-ref DELETE: Protected (returns 404 via user_id check in deleteReference)

Test methodology

  • Registered test users, imported BibTeX via file upload, tested search/pagination
  • Tested XSS payloads in BibTeX fields (stored raw but React escapes on render)
  • Tested SQL injection in search params (safe)
  • Tested cross-user access on individual refs and project refs
  • Tested Zotero endpoints with Zotero not running (graceful errors)
  • Tested bulk delete edge cases
  • Attempted path traversal in project name (safe — DB string, not filesystem)
  • Static analysis of all 19 changed files

@liuyixin-louis
Copy link
Copy Markdown
Collaborator

Full Feature Review — All Tests Passed

I did a thorough end-to-end test of every feature in this PR on a local dev build. Everything works correctly.

Test Results

Feature Status Details
BibTeX Import ✅ PASS Uploaded a 3-entry .bib file — all parsed correctly with title, authors, year, journal, abstract, DOI, citation key
Zotero Status Detection ✅ PASS Correctly distinguishes: not running / running with API disabled / fully available. UI shows appropriate message for each state
Zotero Collection Browser ✅ PASS Loads 297 collections, hierarchical navigation works, multi-select items (2/3 counter), "Select All" button
Zotero Item Import ✅ PASS Imported 2 papers from a Zotero collection — items appear with zotero source badge, correctly differentiated from bibtex imports
Reference Cards ✅ PASS Cards show title, truncated authors, year badge, journal, source badge. Checkboxes, "Chat about this", and "Delete" buttons all present
Reference Detail Modal ✅ PASS Full author list, DOI clickable link, abstract, item type. "Add to Chat", "Copy Context", "Close" actions
Search/Filter ✅ PASS Typing "BERT" in picker correctly filters from 5 to 1 result
Chat ReferencePicker ✅ PASS Multi-select works, count badge updates, "Add to Message" injects formatted reference text (Title/Authors/Year/Journal/DOI) into chat input
API Endpoints ✅ PASS GET /references, /zotero/status, /zotero/collections, /project/:name, /tags — all return correct responses with proper auth
i18n (zh-CN) ✅ PASS All reference UI strings properly translated (添加文献到消息, 搜索文献, 已选 N 篇, 文献库, etc.)

LGTM — suggest merging. @Zhang-Henry nice work on this one!

- Fix ZoteroBrowser mountedRef pattern for React StrictMode compatibility
  (StrictMode double-mount left mountedRef.current=false, preventing
  onImportComplete callback from firing → references never auto-refreshed)
- Add authorization checks on GET /:id, GET /:id/pdf, and project endpoints
- Sanitize PDF cache path to prevent path traversal via crafted citation keys
- Fix BibTeX parser: handle escaped quotes, global tilde replacement
- Add bulk delete size limit (max 500)
- Add collectionKey format validation, URL encoding on API calls
- Fix Zotero date parsing for "15 March 2024" format
- Add debounce to ReferenceSearchBar
- Await clipboard.writeText, reset file input after upload
- Add drag-drop file type validation in ImportDialog
@Zhang-Henry
Copy link
Copy Markdown
Collaborator Author

Review Issues Addressed — c4c0531

已修复 review 中提到的问题,包括一个导致 Zotero 导入后文献列表不自动刷新的 bug。

关键修复:Zotero 导入后不刷新

根因ZoteroBrowsermountedRef 模式在 React StrictMode 下有 bug。StrictMode 的 mount → unmount → remount 流程会导致 mountedRef.current 在 cleanup 后变为 false,但 effect body 没有重置为 true。结果是导入成功后 1200ms 的 setTimeout 回调检查 mountedRef.currentfalseonImportComplete() 永远不会被调用,refresh() 也就不会触发。

修复:在 useEffect body 中显式设置 mountedRef.current = true,确保 StrictMode 重新挂载后 ref 值正确。

其他修复

# 严重程度 修复内容
1 🔴 getReference() 添加 user_id 鉴权检查
2 🔴 PDF 缓存路径清洗,防止路径遍历攻击
3 🔴 项目文献关联接口添加用户鉴权
4 🟠 BibTeX 解析器处理转义引号
5 🟠 cleanLatex 波浪号替换添加全局标志 /~/g
6 🟠 bulkDeleteReferences 添加 500 条上限
11 🟡 ZoteroBrowser setTimeout + mountedRef 修复(即刷新 bug)
12 🟡 collectionKey 格式校验
13 🟡 Zotero 日期解析修复
14 🟡 API 调用添加 URL 编码
15 🟡 拖放非 .bib 文件时显示错误提示
16 🔵 ReferenceSearchBar 添加防抖
17 🔵 clipboard.writeText 添加 await
19 🔵 BibTeX 上传后重置 file input

Integrate both the reference library/picker (PR #61) and the
session tags + attached prompt features from main. Fix renamed
setAttachedImages → setAttachedFiles and openImagePicker → openFilePicker.
@Zhang-Henry Zhang-Henry merged commit 97f75e9 into main Mar 26, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants