Your site's shipboard computer.
Hybrid semantic + keyword search for static sites, with optional experimental Q&A — fully client-side, no server required. Runs entirely in your visitor's browser via WebAssembly.
> "I'm just so happy to be doing this for you." — Eddie, the Heart of Gold's shipboard computer
Eddie does three things:

- **Build time:** A CLI reads your markdown/HTML content, chunks it, and generates embeddings using a sentence-transformer model. The result is a compact binary index shipped as a static asset. Simple, elegant, like a fjord.
- **Runtime:** A WASM module in the browser downloads the same embedding model (from the HuggingFace CDN, cached after first use), embeds the visitor's query, and performs hybrid semantic + keyword search against the pre-built index.
- **Optional Q&A (experimental):** On browsers with WebGPU support, a small language model can synthesize a short answer from retrieved content; it falls back gracefully to search-only on browsers without WebGPU.
Index your content at build time:

```bash
eddie index --content-dir content/ --output static/eddie/index.ed
```

Add the widget to your pages:

```html
<script src="/eddie-widget.js"></script>
```

Visitors see a floating search button. The first search triggers a one-time model download (~87 MB cache footprint with the default model); after that, searches are instant. The answer to how long subsequent queries take is not 42 — it's closer to 42 milliseconds.
Screenshots of a search in progress, with Eddie installed on each supported CMS integration:
| Hugo | Astro | Docusaurus |
|---|---|---|
| ![]() | ![]() | ![]() |

| MkDocs | Eleventy | Jekyll |
|---|---|---|
| ![]() | ![]() | ![]() |
Refresh these screenshots with:
```bash
bash scripts/capture-cms-gallery.sh
```

See docs/guides/cms-gallery.md for the full workflow and options.
The release pipeline now emits sidecar assets for runtime files:
- `eddie.wasm.br` / `eddie.wasm.gz`
- `eddie-wasm.js.br` / `eddie-wasm.js.gz`
- `eddie-worker.js.br` / `eddie-worker.js.gz`
- `eddie-widget.js.br` / `eddie-widget.js.gz`
Use normal URLs in HTML (`eddie-widget.js`, `eddie.wasm`, etc.). Browsers should receive compressed bytes through standard HTTP content negotiation (`Accept-Encoding`).

Important: browser JS should not manually switch to `.br` filenames unless your host also sets the correct response headers (`Content-Encoding: br` and the correct content type). Without those headers, `.br` files are just opaque bytes.
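To make the negotiation concrete, here is a toy server-side selection sketch. The file names are Eddie's release sidecars, but the selection logic is a deliberate simplification of what real hosts and CDNs do (no quality values, no `Vary` handling):

```python
# Illustrative sketch only: how a static host maps Accept-Encoding to a
# precompressed sidecar. The HTML keeps requesting the plain filename;
# the server transparently serves .br or .gz bytes with the right header.

def pick_variant(path: str, accept_encoding: str):
    """Return (file to serve, Content-Encoding header value or None)."""
    encodings = {e.split(";")[0].strip() for e in accept_encoding.split(",")}
    if "br" in encodings:
        return path + ".br", "br"      # e.g. eddie-widget.js.br
    if "gzip" in encodings:
        return path + ".gz", "gzip"    # e.g. eddie-widget.js.gz
    return path, None                  # uncompressed fallback

print(pick_variant("eddie-widget.js", "gzip, deflate, br"))
```

The key point the sketch makes: the URL in your HTML never changes; only the response bytes and the `Content-Encoding` header do.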
If you use `eddie-hugo`, you can scaffold local, gitignored settings plus a one-command index wrapper:
```bash
bash scripts/eddie-init-site.sh /path/to/your-hugo-site
```

This creates:
- `.eddie/local.env` (machine-local settings, gitignored)
- `.eddie/claims.edits.toml` (optional manual claim edits, gitignored)
- `scripts/eddie-index.sh` (convenience wrapper around `eddie index`)
Then you can index with:
```bash
bash scripts/eddie-index.sh
```

You can tune index quality by trading off build time:
```bash
# Fast profile
EDDIE_MODEL=sentence-transformers/all-MiniLM-L6-v2
EDDIE_CHUNK_SIZE=256
EDDIE_OVERLAP=32
EDDIE_QA_OLLAMA_MODEL=
```

```bash
# Quality-first local workstation profile
EDDIE_MODEL=sentence-transformers/all-MiniLM-L12-v2
EDDIE_CHUNK_SIZE=320
EDDIE_OVERLAP=48
EDDIE_QA_OLLAMA_MODEL=qwen2.5:32b
EDDIE_QA_OLLAMA_MAX_CHUNKS=96
EDDIE_QA_OLLAMA_MAX_PAIRS_PER_CHUNK=4
EDDIE_QA_OLLAMA_TEMPERATURE=0.1
```

Guideline:
- Larger embedder + larger chunks typically improve recall/answer grounding.
- Stronger Ollama model at index time improves QA/claim synthesis quality.
- Build time and memory usage increase substantially.
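To make the chunk-size/overlap tradeoff concrete, here is a minimal sliding-window chunker. It is illustrative only: Eddie's real chunker is heading- and semantics-aware and token-based, while here a "token" is just a whitespace word. But it shows how `EDDIE_CHUNK_SIZE` and `EDDIE_OVERLAP` shape the result:

```python
# Illustrative sketch, not Eddie's actual chunker: a sliding window over
# "tokens" with a configurable size and overlap between adjacent chunks.

def chunk(words, size, overlap):
    step = size - overlap
    assert step > 0, "overlap must be smaller than chunk size"
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + size])
        if start + size >= len(words):
            break
    return chunks

doc = [f"w{i}" for i in range(1000)]          # a 1000-word document
fast = chunk(doc, size=256, overlap=32)       # fast profile values
quality = chunk(doc, size=320, overlap=48)    # quality profile values
print(len(fast), len(quality))
```

Larger chunks give each embedding more context (and fewer chunks overall), while larger overlap re-embeds more repeated text at the seams; both choices shift work to build time.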
Keep acceptance tests in your site repo, not inside Eddie.
Example suite:
```bash
cp examples/acceptance-suite.json /path/to/your-site/eddie.acceptance.json
```

Run an automated parameter sweep:
```bash
eddie tune \
  --content-dir content/ \
  --eval eddie.acceptance.json \
  --chunk-sizes 192,256,320 \
  --overlaps 16,32,48 \
  --mode hybrid \
  --report tune-report.json
```

Run the guided interactive loop (collects a query, expected phrases, and a user rating, then re-tunes):
```bash
eddie tune \
  --content-dir content/ \
  --interactive \
  --save-eval eddie.acceptance.json
```

Eddie now includes a configurable benchmark harness for model/dataset matrix runs.
What it does:
- Caches benchmark corpora locally (git sparse checkouts, excluded from git).
- Runs clean index/search timing loops across any dataset/model combination.
- Optionally uses OpenRouter:
  - a larger model to generate stable query sets per dataset
  - a smaller model to judge retrieval quality for sampled queries
- Writes machine-friendly outputs (CSV, optional Parquet) plus markdown summary tables.
- Computes deterministic retrieval metrics (Hit@k, MRR, nDCG@k) from human-maintained labels in `benchmarks/relevance_labels.toml`.
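For reference, the deterministic metrics can be written in a few lines. These are illustrative implementations with binary relevance labels; the benchmark harness may differ in detail:

```python
# Illustrative retrieval metrics: Hit@k, MRR, and nDCG@k with binary
# relevance. `ranked` is the retrieved doc ids in rank order; `relevant`
# is the set of labeled-relevant ids for the query.
import math

def hit_at_k(ranked, relevant, k):
    return float(any(d in relevant for d in ranked[:k]))

def mrr(ranked, relevant):
    for i, d in enumerate(ranked, start=1):
        if d in relevant:
            return 1.0 / i
    return 0.0

def ndcg_at_k(ranked, relevant, k):
    # binary gains; the ideal ranking puts all relevant docs first
    dcg = sum(1.0 / math.log2(i + 1)
              for i, d in enumerate(ranked[:k], start=1) if d in relevant)
    ideal = sum(1.0 / math.log2(i + 1)
                for i in range(1, min(len(relevant), k) + 1))
    return dcg / ideal if ideal else 0.0

ranked = ["a", "b", "c", "d"]
relevant = {"b", "d"}
print(hit_at_k(ranked, relevant, 3))  # first hit is at rank 2
print(mrr(ranked, relevant))
print(round(ndcg_at_k(ranked, relevant, 4), 3))
```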
Default benchmark config: `benchmarks/benchmark.toml`
Deterministic relevance labels: `benchmarks/relevance_labels.toml`
Prepare datasets:
```bash
python3 scripts/benchmark_suite.py prepare
```

Generate stored query files (stable benchmark inputs):
```bash
export OPENROUTER_API_KEY=...
python3 scripts/benchmark_suite.py generate-queries
```

Run the full matrix:
```bash
python3 scripts/benchmark_suite.py run --generate-queries
```

Filter example (single dataset/model quick run):
```bash
python3 scripts/benchmark_suite.py run \
  --dataset fastapi_docs \
  --model sentence-transformers/multi-qa-MiniLM-L6-cos-v1 \
  --runs-per-combo 1 \
  --query-limit 10
```

Render the benchmark table:
```bash
python3 scripts/benchmark_suite.py render-report .bench/results/<run_id>
# or:
python3 scripts/render_benchmark_report.py .bench/results/<run_id>
```

Eddie now stores everything in one `index.ed` file:
- `chunks` section: primary hybrid retrieval corpus (semantic + BM25).
- `qa` section (optional): question/answer pairs with source attribution.
- `claims` section (optional): atomic factual claims with evidence and source attribution.
- `summary` lane (optional): lightweight doc summaries stored as retrieval chunks.
Note: this uses a new v4 index layout; rebuild indexes with the current Eddie binary.
Build with embedded QA + claims:
```bash
eddie index \
  --content-dir content/ \
  --output static/eddie/index.ed \
  --chunk-strategy semantic \
  --coarse-chunk-size 640 \
  --coarse-overlap 96 \
  --summary-lane \
  --qa \
  --claims
```

Build with local Ollama QA synthesis (a good fit for a 4090-class workstation):
```bash
eddie index \
  --content-dir content/ \
  --output static/eddie/index.ed \
  --chunk-strategy semantic \
  --coarse-chunk-size 640 \
  --summary-lane \
  --qa \
  --claims \
  --qa-ollama-model qwen2.5:7b-instruct \
  --qa-ollama-url http://127.0.0.1:11434/api/generate
```

Build with OpenRouter QA synthesis (optional; local-first is still supported):
```bash
export OPENROUTER_API_KEY=...
eddie index \
  --content-dir content/ \
  --output static/eddie/index.ed \
  --chunk-strategy semantic \
  --coarse-chunk-size 640 \
  --summary-lane \
  --qa \
  --claims \
  --qa-openrouter-model openai/gpt-4.1-mini \
  --qa-openrouter-url https://openrouter.ai/api/v1/chat/completions \
  --qa-openrouter-api-key-env OPENROUTER_API_KEY
```

Search specific lanes:
```bash
eddie search --index static/eddie/index.ed --query "who has the subject worked for" --mode hybrid --scope all
eddie search --index static/eddie/index.ed --query "how many years has the subject been programming" --mode semantic --scope qa
eddie search --index static/eddie/index.ed --query "worked for nike" --mode semantic --scope claims
```

Recommended architecture:
- Build time: use a larger model (Ollama/OpenRouter) to synthesize high-signal QA and claims.
- Runtime: retrieve first from `chunks` and optionally blend the `qa`/`claims` lanes.
- Answer generation: keep small runtime models grounded on cited evidence.
- Runtime ranking now includes a recency decay boost and a multi-granularity fusion bonus.
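As a sketch of what a recency decay boost can look like: an exponential decay on document age, blended into the retrieval score. The exponential form, half-life, and blend weight here are illustrative assumptions, not Eddie's actual parameters:

```python
# Illustrative only: boost fresher documents by blending an exponential
# recency decay into the retrieval score. half_life_days and weight are
# made-up parameters for the sketch.
import math

def recency_boost(score, age_days, half_life_days=180.0, weight=0.2):
    decay = 0.5 ** (age_days / half_life_days)  # 1.0 today, 0.5 at half-life
    return score * (1.0 - weight + weight * decay)

fresh = recency_boost(0.80, age_days=0)    # full boost retained
old = recency_boost(0.80, age_days=180)    # half the boost has decayed
print(fresh, old)
```

Bounding the boost by a blend weight (rather than multiplying the score by the raw decay) keeps old-but-relevant pages competitive instead of burying them.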
You can manually add or redact claims without graph tooling.
Create `claims.edits.toml`:
```toml
[[redact]]
predicate = "worked_for"
object = "Old Company"

[[add]]
subject = "Site Subject"
predicate = "worked_for"
object = "Nike"
evidence = "Manual correction"
source_url = "/about/"
confidence = 1.0
tags = ["manual"]
```

Apply during index build:
```bash
eddie index \
  --content-dir content/ \
  --output static/eddie/index.ed \
  --claims \
  --claims-edits claims.edits.toml
```

Export standalone corpora (optional, for debugging/inspection):
```bash
eddie qa-corpus --index static/eddie/index.ed --output static/eddie/qa-corpus.json
eddie claims-corpus --index static/eddie/index.ed --output static/eddie/claims-corpus.json --claims-edits claims.edits.toml
```

Eddie is built around fast hybrid retrieval (semantic + BM25) with snippets and ranking. Q&A is included as an optional experimental layer on top.
| Tool | Deployment | Search | Q&A | Server | Cost |
|---|---|---|---|---|---|
| Eddie | Client (WASM) | Hybrid semantic + BM25 | Experimental (WebGPU) | No | Free |
| Pagefind | Client (WASM) | Keyword | No | No | Free |
| Algolia DocSearch | Cloud | Keyword + neural | No | Yes | Free for OSS |
| kapa.ai | Cloud | Semantic (RAG) | Yes | Yes | Enterprise |
| DocsBot | Cloud | Semantic (RAG) | Yes | Yes | $16–$416/mo |
Create `eddie.toml` in your site root (optional — defaults are carefully chosen, unlike Marvin's personality):
```toml
[embedding]
model = "sentence-transformers/multi-qa-MiniLM-L6-cos-v1"

[qa]
enabled = true
runtime = "webllm"
model = "HuggingFaceTB/SmolLM2-1.7B-Instruct"

[widget]
theme = "auto"
position = "bottom-right"
offsetX = 0
offsetY = 0
qaMode = "auto"
qaSubject = "site subject"
topK = 8
answerTopK = 5
```

`position` accepts "top-left", "top-right", "bottom-left", or "bottom-right".
Use `offsetX`/`offsetY` (pixels, can be negative) to nudge the launcher away from theme toggles or other fixed UI.
`qaMode` accepts "off", "auto", or "always" for experimental factual answer blending.
`qaSubject` lets site owners tune subject-specific factual prompts like "does <subject> know X".
The default model (`multi-qa-MiniLM-L6-cos-v1`) is Apache 2.0 and tuned for retrieval tasks. Models are fetched from the HuggingFace CDN at runtime — Eddie doesn't redistribute model weights. If training data provenance matters to you:
| Model | License | Params |
|---|---|---|
| `BAAI/bge-small-en-v1.5` | MIT | 33M |
| `Snowflake/snowflake-arctic-embed-s` | Apache 2.0 | 33M |
| `nomic-ai/modernbert-embed-base` | Apache 2.0 | 110M |
Eddie is a single Rust codebase that compiles to two targets — one might say it's improbably versatile:
- Native CLI (`eddie`) — runs at build time to index your content
- WASM module — runs in the browser for search and embedding
Build-time indexing:

```mermaid
flowchart LR
    A[Markdown / HTML Content] --> B[Parse + Clean]
    B --> C[Chunking<br/>heading or semantic]
    C --> D[Embeddings]
    C --> E[BM25]
    C --> F[QA / Claims Extraction<br/>optional]
    D --> G[index.ed Sections]
    E --> G
    F --> G
    G --> H[Compressed Static Asset]
```
Runtime search:

```mermaid
flowchart LR
    A[Visitor Opens Widget] --> B[Load Worker + WASM]
    B --> C[Fetch index.ed]
    C --> D[Load Model Files<br/>cached in browser]
    D --> E[Query]
    E --> F[Hybrid Retrieval<br/>semantic + BM25]
    F --> G[Optional QA / Claims Blend]
    G --> H[Ranked Results + Summary UI]
```
The indexing pipeline:
```
Markdown/HTML → parse → chunk → embed (MiniLM, 384-dim) → BM25 index
            ↘ optional QA/claims synthesis + embeddings
            → serialize sections into one index.ed → Brotli pack (.ed)
```
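As a sketch of the semantic half of retrieval: once embeddings are L2-normalized (as sentence-transformer models typically produce), cosine similarity reduces to a plain dot product. Toy 3-dimensional vectors stand in for the real 384-dimensional MiniLM embeddings:

```python
# Illustrative: cosine similarity over L2-normalized embeddings is a dot
# product. Vectors here are toy 3-dim stand-ins for 384-dim MiniLM output.
import math

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def cosine(a, b):  # assumes both vectors are already normalized
    return sum(x * y for x, y in zip(a, b))

q = normalize([0.2, 0.9, 0.1])                    # query embedding
chunks = {"intro": normalize([0.1, 1.0, 0.0]),    # chunk embeddings
          "faq":   normalize([0.9, 0.1, 0.3])}
scores = {cid: cosine(q, v) for cid, v in chunks.items()}
print(max(scores, key=scores.get))
```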
The search pipeline (browser):
```
Query → download model (first use) → embed query → cosine similarity + BM25 → RRF fusion → ranked results
```
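The fusion step can be sketched with textbook Reciprocal Rank Fusion: each document's fused score is the sum of reciprocal ranks across the two result lists. `k = 60` is the constant from the original RRF paper; Eddie's exact fusion parameters may differ:

```python
# Illustrative Reciprocal Rank Fusion: merge the semantic and BM25
# rankings by summing 1 / (k + rank) for each document across lists.

def rrf(rankings, k=60):
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

semantic = ["intro", "faq", "pricing"]   # ranked by cosine similarity
bm25 = ["faq", "pricing", "intro"]       # ranked by keyword score
print(rrf([semantic, bm25]))
```

Because RRF works on ranks rather than raw scores, it needs no score normalization between the semantic and BM25 lists, which is why it is a popular choice for hybrid search.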
ML inference uses Candle (HuggingFace's Rust ML framework), which compiles to WASM without complaint. This is neural network inference running in a browser to search a blog — far more intelligent than the task demands, which Eddie would tell you is exactly how he likes it.
The core product value today is hybrid retrieval and result summaries/snippets. Q&A is a best-effort experimental mode and quality is highly dependent on your corpus and client hardware.
What this means in practice:
- Smaller browser models can produce useful summaries, but factual precision can drift.
- Broader or inconsistent corpora increase hallucination and contradiction risk.
- On weaker devices, latency can be too high for a good UX.
Suggested model range to try (if supported by your chosen runtime/toolchain):
- `HuggingFaceTB/SmolLM2-1.7B-Instruct` (current default)
- `Qwen/Qwen2.5-1.5B-Instruct`
- `microsoft/Phi-3.5-mini-instruct`
Corpus strategies that generally improve answer quality:
- Keep content focused by domain/use-case rather than mixing unrelated material.
- Prefer explicit, factual writing with clear headings and stable terminology.
- Add canonical FAQ and glossary pages for key entities, policies, and definitions.
- Keep time-sensitive pages date-stamped and archive/redirect stale copies.
- Avoid indexing low-signal pages (thin marketing copy, duplicate boilerplate).
Likely future direction:
- Use larger LLMs at index/build time (offline) to generate structured QA artifacts.
- Produce question-answer pairs grounded in source chunks.
- Build entity/fact cards with citations and claim-to-source maps.
- Keep browser runtime lightweight for retrieval and synthesis over precomputed evidence.
This should improve factual reliability without requiring large on-device models for every query, while browser hardware catches up for stronger local Q&A.
Foundational reading and implementation references behind the current approach:
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
- MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers
- Reciprocal Rank Fusion Outperforms Condorcet and Individual Rank Learning Methods
- The Probabilistic Relevance Framework: BM25 and Beyond
- WebAssembly
- WebGPU
- Hugging Face Candle
Recent papers and resources we track for factual grounding with smaller runtime models:
- SR-RAG: Learning to Retrieve and Reason in LLMs via Self-Reflection
- Adaptive Retrieval without Self-Knowledge?
- CER: Contrastive Evidence Re-ranking for Attributed Generation
- BRIGHT: A Realistic and Challenging Benchmark for Retrieval
- GraphRAG (paper)
- GraphRAG project updates
```yaml
- name: Index content
  run: |
    curl -L https://github.com/jt55401/eddie/releases/latest/download/eddie-linux-amd64 -o eddie
    chmod +x eddie
    ./eddie index --content-dir content/ --output public/eddie/index.ed
```

Repository layout:

```
src/           Rust source (CLI + WASM shared core)
requirements/  Requirements-as-code
docs/plans/    Design documents
```
This project uses requirements-as-code. See requirements.md for the full requirements tree.
See CONTRIBUTING.md. Pull requests welcome — just don't ask Eddie to be less cheerful about it.
GPL-3.0-only. See LICENSE.
Copyright (c) 2026 Jason Grey. Project name and branding are not licensed under GPL — see TRADEMARKS.md.
If you find Eddie useful, use the GitHub Sponsor button on the repository.
For commercial integration or support, Improbability Engineers offers consulting — they built the ship, after all.
Eddie is the Heart of Gold shipboard computer from The Hitchhiker's Guide to the Galaxy. The Heart of Gold is powered by the Infinite Improbability Drive. Improbability Engineers builds the ship's computer.
So long, and thanks for all the search results.






