Self-hosted AI support chatbot with RAG, human-in-the-loop, and full admin panel.
Runs on a Raspberry Pi. $0/month.
Quick Start · Features · Architecture · Admin Panel · Eval System · Integrations · Deploy · API · Configuration
Dify needs Docker, Redis, and Postgres. Botpress is cloud-only. Intercom costs $74/seat/month.
Qragy needs one command: `npm start`. It ships with 7 LLM providers, hybrid RAG, a full admin panel, a human-in-the-loop agent inbox, an eval testing framework, and multi-channel integrations — all in a single Node.js process with zero external infrastructure.
It uses LanceDB (embedded vector DB) and SQLite for storage, so you get a production-ready AI support chatbot with zero infrastructure cost — even on a $35 Raspberry Pi.
One process. One CSV file. 7 npm dependencies. Zero cloud bills.
| Feature | Qragy | Dify | Botpress | Intercom |
|---|---|---|---|---|
| Fully self-hosted | Yes | Partial | No | No |
| Runs on Raspberry Pi | Yes | No | No | No |
| Min RAM | 150 MB | 4 GB+ | 2 GB+ | N/A |
| Setup time | 30 sec | 30+ min | 15+ min | N/A |
| LLM providers | 7 | 10+ | 3 | 1 |
| Hybrid RAG + CRAG | Yes | Partial | No | No |
| Human-in-the-loop | Yes | No | Yes | Yes |
| Eval framework | Built-in | No | No | No |
| Vector DB | Embedded | External | External | Managed |
| Admin panel | Built-in (25 panels) | Yes | Yes | Yes |
| Monthly cost | $0 | Free tier limited | Free tier limited | $74+/seat |
| Open source | AGPL-3.0 | Apache 2.0 | AGPL | No |
| Dependencies | 7 npm | Docker + Redis + Postgres | Cloud | Cloud |
- 7 LLM providers — Gemini, OpenAI, Anthropic (Claude), Groq, Mistral, DeepSeek, Ollama — all via raw `fetch()`, zero extra dependencies
- Model fallback chain — automatic retry on 429/500/503/504 with configurable fallback models and circuit breaker health tracking
- Hybrid search — vector search (LanceDB) + full-text search with Reciprocal Rank Fusion (RRF) scoring
- 3-tier reranker — Cohere API > LLM-based > text similarity scoring (automatic fallback)
- CRAG (Corrective RAG) — evaluates search result relevance (RELEVANT/PARTIAL/IRRELEVANT), rewrites queries up to 2 times when results are insufficient
- Adaptive pipeline — three modes: FAST (direct LLM, no retrieval), STANDARD (hybrid search + rerank), DEEP (search + rerank + CRAG + sub-queries)
- Quality scoring — automatic response quality evaluation with confidence scores
- Reflexion — self-reflection on low-quality answers, triggers re-generation with improved context
- Smart chunking — markdown, recursive, sentence, and contextual document splitting strategies
- Topic routing — keyword matching + AI classification routes conversations into structured support flows
- Deterministic collection — bot gathers required fields step-by-step before escalating to a human
- Conversation state machine — `welcome_or_greet` → `topic_detection` → `topic_guided_support` → `escalation_handoff` (+ farewell, fallback states)
- Core memory (MemGPT-style) — automatically extracts user profile facts (name, company, preferences) from conversations, injects into system prompt with 500-token budget
- Recall memory — full conversation history with full-text search, enables cross-session context
- Memory templates — configurable memory extraction patterns per use case
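The state machine above can be pictured as a transition table. This is a hypothetical sketch — the state names come from this README, but the specific edges are illustrative assumptions, not Qragy's actual implementation:

```javascript
// Illustrative transition table for the conversation states named above.
// The edges shown here are assumptions for demonstration purposes.
const TRANSITIONS = {
  welcome_or_greet: ["topic_detection", "farewell"],
  topic_detection: ["topic_guided_support", "fallback"],
  topic_guided_support: ["escalation_handoff", "farewell", "fallback"],
  escalation_handoff: ["farewell"],
  fallback: ["topic_detection", "escalation_handoff"],
  farewell: [], // terminal state
};

function canTransition(from, to) {
  return (TRANSITIONS[from] || []).includes(to);
}
```

A deterministic table like this keeps conversation flow auditable: the LLM generates the wording, but the bot can only move between explicitly allowed states.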
- CSV-based Q&A — simple question/answer format, easy to edit and version
- File upload — PDF, DOCX, XLSX, TXT parsing with automatic chunking and embedding
- One-click re-embed — rebuild the entire vector index from the admin panel
- Knowledge graph — extracts entities (product, issue_type, resolution, customer_segment) and relationships from resolved tickets, stored in SQLite
- Content gap detection — identifies questions the bot cannot answer, suggests new KB entries
- Auto-FAQ generation — generates FAQ entries from recurring ticket patterns
- Prompt injection guard — two-layer detection: 12 regex patterns (ignore instructions, jailbreak, DAN, roleplay) + LLM-based relevance guard for off-topic detection
- Output validation — checks for AI confession phrases, prompt leaks, and internal data exposure
- PII masking — automatic detection and masking of personally identifiable information in logs
- Credential masking — prevents API keys and tokens from appearing in error messages or logs
- Rate limiting — per-IP request throttling with configurable window and max requests
- Security headers — Helmet.js, CORS configuration, CSP headers
- HMAC-SHA256 webhooks — all outgoing webhooks are signed for verification
- Single process — one `node server.js`, no orchestration needed
- Embedded databases — LanceDB (vectors) + SQLite (tickets, analytics, graph) — no external DB servers
- File-based config — CSV knowledge base, JSON config files, markdown agent prompts
- No build step — vanilla JS frontend, zero bundling or compilation
- PWA support — service worker with offline caching (cache-first for static, network-first for API), web push notifications
- Auto-deploy — GitHub webhook receiver with HMAC-SHA256 verification, triggers `deploy.sh` on push to main
- Hot-reload — update bot config, topics, persona without restarting the server
| Provider | Model | Dimensions | Cost |
|---|---|---|---|
| Google Gemini (default) | `gemini-embedding-001` | 3072 | Free tier |
| OpenAI | `text-embedding-3-small` | 1536 | $0.02/1M tokens |
| Ollama | `nomic-embed-text` | 768 | Free (local) |
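The hybrid search described above fuses vector and full-text rankings with Reciprocal Rank Fusion. A minimal sketch of RRF — `k = 60` is the common default from the literature, not necessarily the constant Qragy uses:

```javascript
// Reciprocal Rank Fusion: each document's fused score is the sum of
// 1 / (k + rank) over every ranked list it appears in.
function rrf(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((docId, idx) => {
      const rank = idx + 1; // ranks are 1-based
      scores.set(docId, (scores.get(docId) || 0) + 1 / (k + rank));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([docId]) => docId);
}
```

RRF needs only rank positions, not comparable scores, which is why it works well for merging a vector ranking with a full-text (BM25-style) ranking.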
Full browser-based management — no code, no CLI needed. 25 panels organized by function:
| Panel | Description |
|---|---|
| Live Chats | Active conversations with real-time updates, handoff status indicators |
| Closed Chats | Full chat histories with search, assignment, priority, internal notes |
| Search | Full-text search across all conversations and tickets |
| Agent Inbox (HITL) | Human-in-the-loop: claim conversations, live SSE updates, type replies in real-time |
| Panel | Description |
|---|---|
| Knowledge Base | CRUD for Q&A entries, file upload (PDF/DOCX/XLSX/TXT), one-click vector re-embed |
| Auto-FAQ | Auto-generated FAQ entries from ticket patterns, one-click add to KB |
| Content Gaps | Unanswered questions the bot couldn't handle, prioritized by frequency |
| Panel | Description |
|---|---|
| Agent Files | Edit persona, soul, domain rules, hard-bans, response policy, escalation matrix — all from the browser |
| Topics | Create/edit/delete structured support flows with step-by-step guides and escalation rules |
| Memory Templates | Configure what user facts the bot should remember across conversations |
| Environment | View and edit .env variables at runtime |
| Chat Flow | Configure conversation behavior: greeting, fallback, collection mode, state transitions |
| Site Config | Widget appearance, company branding, business hours |
| Prompt Versions | Version history of all prompt changes with one-click rollback |
| Panel | Description |
|---|---|
| Webhooks | Configure webhook endpoints, select event types, test delivery, view delivery logs |
| Zendesk Sunshine | Two-way Zendesk Sunshine Conversations integration config |
| WhatsApp | Direct WhatsApp Cloud API integration setup |
| Panel | Description |
|---|---|
| Dashboard | Daily metrics, top topics, resolution rates, SVG charts |
| Analytics | Detailed conversation analytics, response time tracking |
| Feedback Report | User satisfaction (CSAT 1-5), thumbs up/down tracking |
| SLA Monitoring | Response time SLA compliance tracking |
| Panel | Description |
|---|---|
| Bot Test | Multi-chat grid — open parallel sessions to test the bot side by side |
| Admin Assistant | AI-powered assistant that can read/update config, manage KB, execute admin actions |
| Eval Management | Run eval scenarios, view results, manage test suites (see Eval System) |
| System Health | Uptime, memory usage, LLM health, audit log, database stats, backup/export |
| Setup Wizard | First-run onboarding: API key, bot name, persona, knowledge base setup — no .env editing needed |
Built-in evaluation framework for testing chatbot quality. Run scenarios from the admin panel or CLI.
- Scenarios — multi-turn conversation scripts with expected outcomes defined in `tests/eval/scenarios.json`
- Judge — rule-based assertion engine (`judge.js`) that checks bot responses against 10 assertion types
- Runner — sends messages to the live chat API, collects responses, runs assertions
- Consensus — runs each scenario N times (default 3), distinguishes real failures from flaky results
| Assertion | Description |
|---|---|
| `shouldContainAny` | Response must contain at least one of the specified strings |
| `shouldNotContain` | Response must not contain the specified string |
| `shouldNotContainAny` | Response must not contain any of the specified strings |
| `stateShouldBe` | Conversation state must match expected value |
| `topicShouldBe` | Detected topic must match expected value |
| `handoffReady` | Escalation handoff must be triggered (or not) |
| `earlyEscalation` | Check for premature escalation |
| `branchCodeShouldBe` | Extracted branch code must match |
| `isFarewell` | Response should (not) be a farewell message |
| `shouldNotRepeatPrevious` | Response must differ from previous reply (Jaccard similarity < 0.6) |
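The `shouldNotRepeatPrevious` check can be pictured as token-set Jaccard similarity. This is an illustrative sketch, not `judge.js` itself — whitespace tokenization and lowercasing are assumptions:

```javascript
// Jaccard similarity over word sets: |A ∩ B| / |A ∪ B|.
// The 0.6 threshold matches the assertion table; tokenization is assumed.
function jaccard(a, b) {
  const A = new Set(a.toLowerCase().split(/\s+/).filter(Boolean));
  const B = new Set(b.toLowerCase().split(/\s+/).filter(Boolean));
  if (A.size === 0 && B.size === 0) return 1; // two empty strings are identical
  let inter = 0;
  for (const w of A) if (B.has(w)) inter++;
  return inter / (A.size + B.size - inter);
}

function shouldNotRepeatPrevious(reply, previous) {
  return jaccard(reply, previous) < 0.6;
}
```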
From admin panel:
- Open Eval Management panel
- Click Run on a single scenario or Run All for the full suite
- SSE streaming shows real-time progress with pass/fail results
- Results are saved to history with pass rate and duration
From CLI:
```bash
npx vitest run tests/eval/chatbot-eval.test.js
```

When running all scenarios, each is executed 3 times (configurable). If at least one run passes, the scenario is marked as flaky (acceptable). If all 3 runs fail, it's a real failure. This eliminates false negatives from LLM non-determinism.
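The pass/flaky/fail consensus rule can be sketched as follows (illustrative, not the actual runner code):

```javascript
// Classify a scenario from its N pass/fail runs:
// all runs pass → "pass", some pass → "flaky" (acceptable), none pass → "fail".
function consensus(runResults) {
  const passes = runResults.filter(Boolean).length;
  if (passes === runResults.length) return "pass";
  if (passes > 0) return "flaky";
  return "fail";
}
```

Treating "at least one pass" as acceptable trades strictness for stability: a single flaky LLM response no longer turns a green suite red.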
Add Qragy to any website with one script tag:
```html
<script>
  window.__QRAGY_API = "https://your-qragy-server.com";
</script>
<script src="https://your-qragy-server.com/embed.js"></script>
```

Cross-origin communication via `postMessage`. Customizable appearance from the admin panel.
Two-way integration: bot handles initial conversation, escalates to Zendesk agent when needed. Zendesk agent replies flow back through the bot.
Direct WhatsApp Business Cloud API integration. Configure from the admin panel — no middleware required.
Long-polling Telegram bot. Enable with `TELEGRAM_ENABLED=true` and your bot token.
Keep Qragy public and customer deployments private by treating Qragy as the core app and each tenant as an overlay.
- Keep public code in the Qragy repo.
- Keep private KB, prompts, topics, branding, `.env`, uploads, and deploy scripts in a separate private instance directory.
- Point Qragy to that directory with `QRAGY_INSTANCE_DIR=/srv/qragy-instance` or with explicit `QRAGY_*` path overrides.
Example private instance layout:
```
/srv/qragy-instance/
├── .env
├── agent/
├── memory/
├── public/
├── data/
└── knowledge_base.example.csv
```
At runtime, Qragy reads `agent/`, `memory/`, `data/`, `public/`, and the knowledge-base CSV from the private instance directory, while the application code stays in the public repo.
Send signed events to any endpoint (Slack, n8n, Zapier, custom):
- HMAC-SHA256 signed with `X-Qragy-Signature` header
- Event filtering — subscribe to specific events or `*` wildcard
- Retry — 3 attempts with exponential backoff (1s, 2s, 4s)
- Delivery log — last 200 deliveries with status and response
- Max 10 webhooks per event
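The retry schedule (3 attempts, 1s/2s/4s backoff) can be sketched like this — `deliver` is a hypothetical function that resolves on a 2xx response and rejects otherwise, and the parameterized base delay is for illustration:

```javascript
// Retry a webhook delivery with exponential backoff matching the schedule
// described above: waits of baseDelayMs * 2^i between attempts (1s, 2s, 4s).
async function deliverWithRetry(deliver, payload, attempts = 3, baseDelayMs = 1000) {
  for (let i = 0; i < attempts; i++) {
    try {
      return await deliver(payload);
    } catch (err) {
      if (i === attempts - 1) throw err; // out of attempts: surface the error
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** i));
    }
  }
}
```

Exponential backoff gives a briefly unavailable endpoint time to recover without hammering it, while still failing fast enough to land in the delivery log.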
GitHub webhook receiver at `/deploy`. Verifies `X-Hub-Signature-256`, triggers `deploy.sh` on push to main.
```mermaid
graph TB
    subgraph Clients
        CW[Chat Widget]
        TG[Telegram Bot]
        EMB[Embed Script]
        WA[WhatsApp Cloud API]
        SC[Zendesk Sunshine]
    end

    subgraph "Qragy Server (single process)"
        EXP[Express.js API]
        TC[Topic Classifier]
        RAG[Adaptive RAG Pipeline]
        CRAG[CRAG Evaluator]
        QS[Quality Scorer + Reflexion]
        MEM[Core Memory]
        HITL[Agent Inbox / HITL]
        AST[Admin Assistant]
        TKT[Ticket System]
        ADM[Admin Panel - 25 panels]
        WH[Webhook Dispatcher]
        EVAL[Eval Runner]
        IG[Injection Guard]
    end

    subgraph Storage["Local Storage (no external DB)"]
        LDB[(LanceDB<br/>Vector Index)]
        SQL[(SQLite<br/>Tickets + Analytics + Graph)]
        CSV[(CSV + JSON<br/>Knowledge Base + Config)]
    end

    subgraph External
        LLM[LLM Provider<br/>Gemini / OpenAI / Claude<br/>Groq / Mistral / DeepSeek / Ollama]
        ZD[Zendesk Handoff]
        COH[Cohere Reranker]
    end

    CW & TG & EMB & WA & SC -->|HTTP / Long Poll / Webhook| EXP
    EXP --> IG -->|safe| TC
    TC -->|topic match| RAG
    RAG -->|vector + full-text| LDB
    RAG -->|CRAG check| CRAG
    RAG -->|read KB| CSV
    RAG -->|rerank| COH
    RAG -->|generate| LLM
    LLM --> QS -->|low quality| LLM
    EXP --> MEM --> SQL
    EXP --> TKT --> SQL
    EXP --> HITL -->|live SSE| ADM
    TKT -->|escalate| ZD
    TKT -->|notify| WH
    EXP --> AST -->|actions| ADM
    EVAL -->|test scenarios| EXP
```
- Input — user sends a message via widget, Telegram, WhatsApp, or Zendesk
- Injection guard — regex patterns + LLM relevance check filter malicious input
- Topic detection — keyword matching + AI classification routes to structured flow
- Adaptive RAG — FAST/STANDARD/DEEP pipeline based on query complexity
- CRAG evaluation — checks search result relevance, rewrites query if needed
- LLM generation — contextual reply with persona, topic guide, and memory
- Quality scoring — evaluates response quality, triggers reflexion if low
- Collection — bot gathers required fields step-by-step (deterministic mode)
- Escalation — hands off to Zendesk, agent inbox, or webhook when criteria are met
- Factory + DI — services use `createXxxService(deps)`, all dependencies injected
- Route mounting — routes use `mount(app, deps)` pattern
- CommonJS — all modules use `require()` / `module.exports`
- Getter closures — mutable runtime config accessed via `() => VALUE` getters
- Fire-and-forget — memory updates, analytics, graph building run async without blocking the response
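A minimal illustration of the factory + DI and getter-closure patterns together. The service here is hypothetical, not actual Qragy code:

```javascript
// Factory + dependency injection: the service receives everything it needs
// via a deps object, and reads mutable runtime config through a getter
// closure, so hot-reloaded values are seen without restarting the process.
function createGreetingService({ getBotName, logger }) {
  return {
    greet(user) {
      logger(`greeting ${user}`);
      return `Hi ${user}, I'm ${getBotName()}.`; // getter read at call time
    },
  };
}

// Usage: the config value can change between calls and the service
// automatically picks up the new value through the getter.
let botName = "QRAGY Bot";
const svc = createGreetingService({
  getBotName: () => botName,
  logger: () => {},
});
```

Passing `() => botName` instead of the string itself is what makes hot-reload work: the service never caches a stale snapshot of the config.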
```bash
docker run -d -p 3001:3000 \
  -e GOOGLE_API_KEY=your_key \
  -v qragy-data:/app/data \
  ghcr.io/mahsumaktas/qragy
```

Open http://localhost:3001 — done.
```bash
# Clone & install
git clone https://github.com/mahsumaktas/qragy.git
cd qragy && npm install

# Configure (only GOOGLE_API_KEY is required)
cp .env.example .env
# Get a free key at https://aistudio.google.com

# Ingest your knowledge base
node scripts/ingest.js

# Run
npm start
```

Open localhost:3000 for the chatbot, localhost:3000/admin for the admin panel.

> First run? The setup wizard at `/admin` guides you through API key, bot name, persona, and knowledge base — no `.env` editing needed.
Qragy supports 7 LLM providers out of the box via raw `fetch()` — zero extra dependencies.
Gemini (default):

```env
GOOGLE_API_KEY=your_key_here
```

OpenAI:

```env
LLM_PROVIDER=openai
LLM_API_KEY=sk-...
LLM_MODEL=gpt-4o-mini
EMBEDDING_PROVIDER=openai
EMBEDDING_MODEL=text-embedding-3-small
```

Anthropic (Claude):

```env
LLM_PROVIDER=anthropic
LLM_API_KEY=sk-ant-...
LLM_MODEL=claude-sonnet-4-6
```

Groq:

```env
LLM_PROVIDER=groq
LLM_API_KEY=gsk_...
LLM_MODEL=llama-3.3-70b-versatile
```

Mistral:

```env
LLM_PROVIDER=mistral
LLM_API_KEY=...
LLM_MODEL=mistral-large-latest
```

DeepSeek:

```env
LLM_PROVIDER=deepseek
LLM_API_KEY=...
LLM_MODEL=deepseek-chat
LLM_BASE_URL=https://api.deepseek.com/v1
```

Ollama (local):

```env
LLM_PROVIDER=ollama
LLM_MODEL=llama3.2
LLM_BASE_URL=http://localhost:11434/v1
EMBEDDING_PROVIDER=ollama
EMBEDDING_MODEL=nomic-embed-text
EMBEDDING_BASE_URL=http://localhost:11434
```

All `GOOGLE_*` environment variables continue to work for backward compatibility. See `.env.example` for all options.
Docker Compose:

```bash
git clone https://github.com/mahsumaktas/qragy.git
cd qragy
cp .env.example .env   # add your API key
docker compose up -d
```

PM2:

```bash
git clone https://github.com/mahsumaktas/qragy.git
cd qragy && npm install
cp .env.example .env   # add your GOOGLE_API_KEY
node scripts/ingest.js
npm install -g pm2
pm2 start server.js --name qragy
pm2 save && pm2 startup
```

Works on any machine with Node.js 18+. No Docker required, but runs fine in a container too.
| Variable | Description | Default |
|---|---|---|
| `GOOGLE_API_KEY` | Gemini API key (required for Gemini) | — |
| `LLM_PROVIDER` | `gemini`, `openai`, `anthropic`, `groq`, `mistral`, `deepseek`, `ollama` | `gemini` |
| `LLM_API_KEY` | API key (falls back to `GOOGLE_API_KEY`) | — |
| `LLM_MODEL` | Chat model | — |
| `LLM_BASE_URL` | Custom base URL (Ollama, DeepSeek, etc.) | — |
| `GOOGLE_FALLBACK_MODEL` | Fallback model on error | — |
| `EMBEDDING_PROVIDER` | Embedding provider | `gemini` |
| `EMBEDDING_MODEL` | Embedding model | `gemini-embedding-001` |
| `BOT_NAME` | Bot display name | QRAGY Bot |
| `COMPANY_NAME` | Your company name | — |
| `ADMIN_TOKEN` | Admin panel password | — |
| `PORT` | Server port | 3000 |
| `RATE_LIMIT_ENABLED` | Per-IP rate limiting | `true` |
| `RATE_LIMIT_MAX` | Max requests per window | 20 |
| `DETERMINISTIC_COLLECTION_MODE` | Structured info gathering | `true` |
| `SUPPORT_HOURS_ENABLED` | Enforce business hours | `false` |
| `ZENDESK_ENABLED` | Enable Zendesk handoff | `false` |
| `TELEGRAM_ENABLED` | Enable Telegram bot | `false` |
| `TELEGRAM_BOT_TOKEN` | Telegram Bot API token | — |
| `WHATSAPP_ENABLED` | Enable WhatsApp integration | `false` |
| `WEBHOOK_ENABLED` | Enable webhook notifications | `false` |
| `WEBHOOK_URL` | Webhook endpoint URL | — |
| `WEBHOOK_SECRET` | HMAC-SHA256 signing secret | — |
Full list in .env.example.
Add Qragy to any website:
```html
<script>
  window.__QRAGY_API = "https://your-qragy-server.com";
</script>
<script src="https://your-qragy-server.com/embed.js"></script>
```

```
qragy/
├── server.js                          # Express app, middleware, route mounting
├── src/
│   ├── config/index.js                # Centralized env config loader + validation
│   ├── routes/
│   │   ├── admin/                     # Admin panel API (25 panels)
│   │   │   ├── index.js               # Route aggregator
│   │   │   ├── agent.js               # Agent config CRUD
│   │   │   ├── analytics.js           # Analytics + feedback
│   │   │   ├── config.js              # Runtime config management
│   │   │   ├── eval.js                # Eval CRUD + SSE test runner
│   │   │   ├── insights.js            # SLA, auto-FAQ, content gaps
│   │   │   ├── knowledge.js           # Knowledge base CRUD + upload
│   │   │   ├── system.js              # Health, audit log, backup
│   │   │   ├── tickets.js             # Ticket CRUD + bulk ops + prompt versions
│   │   │   └── webhooks.js            # Webhook config + test
│   │   ├── chat.js                    # POST /api/chat — main endpoint
│   │   ├── conversation.js            # Handoff, CSAT, upload, session
│   │   ├── deploy.js                  # GitHub webhook auto-deploy
│   │   ├── health.js                  # GET /api/health
│   │   └── widget.js                  # Widget configuration
│   ├── services/
│   │   ├── pipeline/chatPipeline.js   # Adaptive RAG pipeline (FAST/STANDARD/DEEP)
│   │   ├── rag/
│   │   │   ├── queryAnalyzer.js       # Query intent analysis
│   │   │   ├── reranker.js            # 3-tier reranker (Cohere > LLM > text)
│   │   │   ├── cragEvaluator.js       # Corrective RAG evaluation
│   │   │   └── contextualChunker.js   # Context-aware document chunking
│   │   ├── intelligence/
│   │   │   ├── qualityScorer.js       # Response quality scoring
│   │   │   ├── reflexion.js           # Self-reflection on low-quality answers
│   │   │   └── graphBuilder.js        # Knowledge graph builder
│   │   ├── memory/coreMemory.js       # MemGPT-style persistent user memory
│   │   ├── webChatPipeline.js         # Web chat orchestration
│   │   ├── chatProcessor.js           # Chat message processing
│   │   ├── topic.js                   # Topic classification (keyword + AI)
│   │   ├── escalation.js              # Escalation rule evaluation
│   │   ├── conversationManager.js     # Conversation CRUD + state tracking
│   │   ├── ticketStore.js             # Ticket CRUD + duplicate detection
│   │   ├── analytics.js               # Event buffer + daily aggregation
│   │   ├── knowledge.js               # KB search + content gaps
│   │   ├── webhooks.js                # HMAC-signed delivery with retry
│   │   ├── llmHealth.js               # LLM circuit breaker
│   │   ├── supportHours.js            # Business hours calculation
│   │   └── responseValidator.js       # Bot response safety validation
│   ├── prompt/builder.js              # System prompt assembly + token budgeting
│   ├── middleware/
│   │   ├── auth.js                    # Admin token authentication
│   │   ├── rateLimiter.js             # Per-IP rate limiting
│   │   ├── security.js                # CORS, Helmet, security headers
│   │   └── injectionGuard.js          # Prompt injection detection (regex + LLM)
│   └── utils/
│       ├── sanitizer.js               # PII masking, text normalization
│       ├── logger.js                  # Structured logger [ISO] [LEVEL] [context]
│       ├── validators.js              # Input validators (email, phone, etc.)
│       └── ...                        # Session, error helpers, CSV tools
├── lib/
│   ├── providers.js                   # Multi-model LLM + embedding abstraction
│   ├── chunker.js                     # Document chunking engine
│   └── db.js                          # SQLite database layer
├── agent/                             # Bot personality & configuration
│   ├── soul.md, persona.md            # Identity & tone
│   ├── domain.md                      # Domain knowledge rules
│   ├── topics/                        # Structured support flow definitions
│   ├── escalation-matrix.md           # Escalation rules
│   ├── hard-bans.md                   # Banned topics/responses
│   ├── output-filter.md               # Output filtering rules
│   └── templates/                     # Industry templates (e-commerce, restaurant, tech support)
├── tests/
│   ├── unit/                          # 108 test files, 560+ tests
│   ├── eval/
│   │   ├── scenarios.json             # 85 eval scenarios
│   │   ├── judge.js                   # Rule-based assertion engine
│   │   └── chatbot-eval.test.js       # Vitest eval runner
│   └── integration/                   # Integration tests
├── public/                            # Frontend (vanilla JS, no build step)
│   ├── admin.html, admin.js, admin.css  # Admin panel
│   ├── embed.js                       # Embeddable widget script
│   └── sw.js                          # Service worker (PWA)
├── scripts/ingest.js                  # CSV → LanceDB embedding ingestion
├── Dockerfile                         # Docker image
├── docker-compose.yml                 # Container setup
├── .github/workflows/ci.yml           # CI: lint + test + coverage
└── data/                              # Runtime data (auto-created, gitignored)
```
- Node.js 18+ (20+ recommended)
- npm
```bash
git clone https://github.com/mahsumaktas/qragy.git
cd qragy && npm install
cp .env.example .env   # add your API key
```

```bash
npm test                # Run all unit + integration tests (560+)
npm run test:coverage   # Run with V8 coverage report
```

```bash
# Start the server first, then:
npx vitest run tests/eval/chatbot-eval.test.js
```

Or use the Eval Management panel in the admin UI for a visual interface with SSE streaming.

```bash
npx eslint .        # Check for lint errors
npx eslint . --fix  # Auto-fix fixable issues
```

- ESLint with flat config — enforces `eqeqeq`, `no-var`, `prefer-const`
- Vitest for unit + integration testing with V8 coverage
- CI pipeline — lint + test on every push/PR to main
- Eval framework — 85 scenarios with consensus runs for regression testing
Interactive API documentation is available at `/api-docs` when the server is running.
All admin endpoints require the `x-admin-token` header when `ADMIN_TOKEN` is set.
View all endpoints
**Chat**

- `POST /api/chat` — Send message, get AI response

**Tickets & Dashboard**

- `GET /api/admin/summary` — Dashboard stats
- `GET /api/admin/tickets` — List tickets (with pagination, filters)
- `GET /api/admin/tickets/:id` — Ticket detail with full conversation
- `PUT /api/admin/tickets/:id/assign` — Assign to team member
- `PUT /api/admin/tickets/:id/priority` — Set priority level
- `POST /api/admin/tickets/:id/notes` — Add internal note

**Knowledge Base**

- `GET /api/admin/knowledge` — List entries
- `POST /api/admin/knowledge` — Add entry
- `PUT /api/admin/knowledge/:id` — Update entry
- `DELETE /api/admin/knowledge/:id` — Delete entry
- `POST /api/admin/knowledge/reingest` — Rebuild vector index
- `POST /api/admin/knowledge/upload` — Upload PDF/DOCX/XLSX/TXT

**Agent Configuration**

- `GET/PUT /api/admin/agent/files/:name` — Read/write agent files
- `GET/POST/PUT/DELETE /api/admin/agent/topics/:id` — Topic CRUD
- `GET/PUT /api/admin/agent/memory/:name` — Memory templates
- `GET/PUT /api/admin/env` — Environment variables
- `POST /api/admin/agent/reload` — Hot-reload config

**Agent Inbox (HITL)**

- `GET /api/admin/inbox/stream` — SSE live updates
- `GET /api/admin/inbox/conversations` — List claimable conversations
- `POST /api/admin/inbox/:id/claim` — Claim conversation
- `POST /api/admin/inbox/:id/message` — Send agent reply
- `POST /api/admin/inbox/:id/release` — Release conversation

**Admin Assistant**

- `POST /api/admin/assistant` — Send message (supports file context)

**Evals**

- `GET /api/admin/eval/scenarios` — List all scenarios
- `GET /api/admin/eval/scenarios/:id` — Get single scenario
- `POST /api/admin/eval/scenarios` — Create scenario
- `PUT /api/admin/eval/scenarios/:id` — Update scenario
- `DELETE /api/admin/eval/scenarios/:id` — Delete scenario
- `POST /api/admin/eval/run/:id` — Run single scenario
- `GET /api/admin/eval/run-all` — SSE stream: run all scenarios with consensus
- `GET /api/admin/eval/history` — Run history
- `DELETE /api/admin/eval/history` — Clear history

**Analytics & Insights**

- `GET /api/admin/analytics` — Metrics and charts
- `GET /api/admin/insights/sla` — SLA monitoring
- `GET /api/admin/insights/auto-faq` — Auto-generated FAQ
- `GET /api/admin/insights/content-gaps` — Content gap detection
- `GET /api/admin/insights/feedback` — Feedback reports

**Webhooks**

- `GET /api/admin/webhooks/config` — Get config
- `PUT /api/admin/webhooks/config` — Update config
- `POST /api/admin/webhooks/test` — Send test webhook

**Prompt Versions**

- `GET /api/admin/agent/versions` — List versions
- `POST /api/admin/agent/versions/rollback` — Rollback to previous version

**System**

- `GET /api/admin/system` — Health info, uptime, memory, DB stats
- `GET /api/health` — Basic health check

**Conversation**

- `POST /api/conversation/:id/csat` — Submit CSAT rating (1-5)
- `POST /api/conversation/:id/feedback` — Thumbs up/down + reflexion trigger
- `POST /api/conversation/upload` — File upload (image, PDF, 5MB limit)
| Layer | Technology |
|---|---|
| Runtime | Node.js 18+ |
| Framework | Express.js |
| AI | Gemini, OpenAI, Anthropic, Groq, Mistral, DeepSeek, Ollama |
| Vector DB | LanceDB (embedded, serverless) |
| Database | SQLite (better-sqlite3) |
| Embeddings | Gemini / OpenAI / Ollama (configurable) |
| Reranking | Cohere API / LLM-based / text similarity (3-tier fallback) |
| Frontend | Vanilla JS — zero build step |
| Storage | JSON config, CSV knowledge base, LanceDB + SQLite files |
| Container | Docker (optional) |
| CI/CD | GitHub Actions |
| PWA | Service worker + web push |
We welcome contributions! See CONTRIBUTING.md for guidelines.
AGPL-3.0 — free for open-source use. Commercial/SaaS use requires sharing your modifications under the same license.
Built by Mahsum Aktas
