Local documentation sync and hybrid search (BM25 + vector) with URL citations.
- Node.js 20+ (ESM)
- npm
npm install
npm run build

The CLI is the primary entrypoint right now.
# Register a public documentation source
npm run cli -- add https://kubernetes.io/docs/ --name k8s --max-pages 500 --max-depth 5
# Sync the source (add --verbose for debug logs)
npm run cli -- sync <site_id> --verbose
# Search with citations
npm run cli -- search <site_id> "ServiceAccount what is it" --top-k 5
# Retrieve a full chunk by ID
npm run cli -- get-chunk <site_id> <chunk_id>

Notes:
- Only public (no-auth) sources are supported.
- Embeddings use fastembed's multilingual model (fast-multilingual-e5-large).
- Data is stored under ./data/ (sources metadata, hashes, LanceDB files).
- No-match searches return an empty results list.
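
To illustrate how BM25 and vector rankings can be combined into one cited result list, here is a minimal sketch using reciprocal rank fusion. This README does not say which fusion method the project uses, so treat the technique as a swapped-in example. The `Hit` shape, the `fuseByReciprocalRank` name, and the constant `k = 60` are assumptions for illustration, not the project's API.

```typescript
// Hypothetical shape of a search hit; the project's real types may differ.
interface Hit {
  chunkId: string;
  url: string; // citation URL for the chunk
}

interface FusedHit extends Hit {
  score: number;
}

// Reciprocal rank fusion: every ranked list contributes 1 / (k + rank) for each
// hit it contains, with rank starting at 1. Assumes chunk IDs are unique within a list.
function fuseByReciprocalRank(
  bm25: Hit[],
  vector: Hit[],
  topK: number,
  k = 60,
): FusedHit[] {
  const fused = new Map<string, FusedHit>();
  for (const list of [bm25, vector]) {
    list.forEach((hit, index) => {
      const entry = fused.get(hit.chunkId) ?? { ...hit, score: 0 };
      entry.score += 1 / (k + index + 1);
      fused.set(hit.chunkId, entry);
    });
  }
  // Highest fused score first; two empty inputs yield an empty list.
  return [...fused.values()]
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```

A chunk that appears in both rankings accumulates two contributions, so it tends to outrank chunks found by only one retriever. When neither retriever returns anything, the function returns an empty array, which matches the empty results list described above.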
src/mcp/server.ts exposes an in-process tool registry. A stdio MCP server is also available for the Inspector.
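
As a rough picture of what an in-process tool registry can look like, the sketch below maps tool names to async handlers. It is not taken from src/mcp/server.ts: the `ToolRegistry` class, the `ToolHandler` type, and the `search` handler are all hypothetical names used only for illustration.

```typescript
// Hypothetical types; the real registry in src/mcp/server.ts may be shaped differently.
type ToolHandler = (args: Record<string, unknown>) => Promise<unknown>;

class ToolRegistry {
  private readonly tools = new Map<string, ToolHandler>();

  // Register a handler under a unique tool name.
  register(name: string, handler: ToolHandler): void {
    if (this.tools.has(name)) {
      throw new Error(`Tool already registered: ${name}`);
    }
    this.tools.set(name, handler);
  }

  // Names of all registered tools, in registration order.
  list(): string[] {
    return [...this.tools.keys()];
  }

  // Look up a tool by name and invoke it in the current process.
  async call(name: string, args: Record<string, unknown> = {}): Promise<unknown> {
    const handler = this.tools.get(name);
    if (!handler) {
      throw new Error(`Unknown tool: ${name}`);
    }
    return handler(args);
  }
}

// Illustrative registration: a stub "search" tool that echoes the query
// and returns an empty results list.
const registry = new ToolRegistry();
registry.register("search", async (args) => ({
  query: String(args.query ?? ""),
  results: [],
}));
```

Keeping the registry in-process means callers can invoke tools with a plain function call, while a stdio transport can sit in front of the same handlers when an external client needs them.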
# Build the MCP server
npm run build
# Run MCP server over stdio (for Inspector)
npm run mcp

Inspector usage (opens a local UI and launches the server process):
npx @modelcontextprotocol/inspector node dist/mcp/stdio.js

- Configuration defaults live in src/config.ts.
- Build output is in dist/.