Self-hosted AI support chatbot with RAG, human-in-the-loop, and full admin panel.
Runs on a Raspberry Pi. $0/month.
Quick Start · Features · Architecture · Admin Panel · Eval System · Integrations · Deploy · API · Configuration
Dify needs Docker, Redis, and Postgres. Botpress is cloud-only. Intercom costs $74/seat/month.
Qragy needs one command: `npm start`. It ships with 7 LLM providers, hybrid RAG, a full admin panel, a human-in-the-loop agent inbox, an eval testing framework, and multi-channel integrations — all in a single Node.js process with zero external infrastructure.
It uses LanceDB (embedded vector DB) and SQLite for storage, so you get a production-ready AI support chatbot with zero infrastructure cost — even on a $35 Raspberry Pi.
One process. One CSV file. 7 npm dependencies. Zero cloud bills.
| Feature | Qragy | Dify | Botpress | Intercom |
|---|---|---|---|---|
| Fully self-hosted | Yes | Partial | No | No |
| Runs on Raspberry Pi | Yes | No | No | No |
| Min RAM | 150 MB | 4 GB+ | 2 GB+ | N/A |
| Setup time | 30 sec | 30+ min | 15+ min | N/A |
| LLM providers | 7 | 10+ | 3 | 1 |
| Hybrid RAG + CRAG | Yes | Partial | No | No |
| Human-in-the-loop | Yes | No | Yes | Yes |
| Eval framework | Built-in | No | No | No |
| Vector DB | Embedded | External | External | Managed |
| Admin panel | Built-in (25 panels) | Yes | Yes | Yes |
| Monthly cost | $0 | Free tier limited | Free tier limited | $74+/seat |
| Open source | AGPL-3.0 | Apache 2.0 | AGPL | No |
| Dependencies | 7 npm | Docker + Redis + Postgres | Cloud | Cloud |
- 7 LLM providers — Gemini, OpenAI, Anthropic (Claude), Groq, Mistral, DeepSeek, Ollama — all via raw `fetch()`, zero extra dependencies
- Model fallback chain — automatic retry on 429/500/503/504 with configurable fallback models and circuit breaker health tracking
- Hybrid search — vector search (LanceDB) + full-text search with Reciprocal Rank Fusion (RRF) scoring
- 3-tier reranker — Cohere API > LLM-based > text similarity scoring (automatic fallback)
- CRAG (Corrective RAG) — evaluates search result relevance (RELEVANT/PARTIAL/IRRELEVANT), rewrites queries up to 2 times when results are insufficient
- Adaptive pipeline — three modes: FAST (direct LLM, no retrieval), STANDARD (hybrid search + rerank), DEEP (search + rerank + CRAG + sub-queries)
- Quality scoring — automatic response quality evaluation with confidence scores
- Reflexion — self-reflection on low-quality answers, triggers re-generation with improved context
- Smart chunking — markdown, recursive, sentence, and contextual document splitting strategies
- Topic routing — keyword matching + AI classification routes conversations into structured support flows
- Deterministic collection — bot gathers required fields step-by-step before escalating to a human
- Conversation state machine — `welcome_or_greet` → `topic_detection` → `topic_guided_support` → `escalation_handoff` (+ farewell, fallback states)
- Core memory (MemGPT-style) — automatically extracts user profile facts (name, company, preferences) from conversations, injects into system prompt with 500-token budget
- Recall memory — full conversation history with full-text search, enables cross-session context
- Memory templates — configurable memory extraction patterns per use case
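The state machine above can be pictured as a transition table. This is a hypothetical sketch — the state names come from this README, but the specific edges are illustrative assumptions, not Qragy's actual implementation:

```javascript
// Illustrative transition table for the conversation states named above.
// The edges shown here are assumptions for demonstration purposes.
const TRANSITIONS = {
  welcome_or_greet: ["topic_detection", "farewell"],
  topic_detection: ["topic_guided_support", "fallback"],
  topic_guided_support: ["escalation_handoff", "farewell", "fallback"],
  escalation_handoff: ["farewell"],
  fallback: ["topic_detection", "escalation_handoff"],
  farewell: [], // terminal state
};

function canTransition(from, to) {
  return (TRANSITIONS[from] || []).includes(to);
}
```

A deterministic table like this keeps conversation flow auditable: the LLM generates the wording, but the bot can only move between explicitly allowed states.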
- CSV-based Q&A — simple question/answer format, easy to edit and version
- File upload — PDF, DOCX, XLSX, TXT parsing with automatic chunking and embedding
- One-click re-embed — rebuild the entire vector index from the admin panel
- Knowledge graph — extracts entities (product, issue_type, resolution, customer_segment) and relationships from resolved tickets, stored in SQLite
- Content gap detection — identifies questions the bot cannot answer, suggests new KB entries
- Auto-FAQ generation — generates FAQ entries from recurring ticket patterns
- Prompt injection guard — two-layer detection: 12 regex patterns (ignore instructions, jailbreak, DAN, roleplay) + LLM-based relevance guard for off-topic detection
- Output validation — checks for AI confession phrases, prompt leaks, and internal data exposure
- PII masking — automatic detection and masking of personally identifiable information in logs
- Credential masking — prevents API keys and tokens from appearing in error messages or logs
- Rate limiting — per-IP request throttling with configurable window and max requests
- Security headers — Helmet.js, CORS configuration, CSP headers
- HMAC-SHA256 webhooks — all outgoing webhooks are signed for verification
- Single process — one `node server.js`, no orchestration needed
- Embedded databases — LanceDB (vectors) + SQLite (tickets, analytics, graph) — no external DB servers
- File-based config — CSV knowledge base, JSON config files, markdown agent prompts
- No build step — vanilla JS frontend, zero bundling or compilation
- PWA support — service worker with offline caching (cache-first for static, network-first for API), web push notifications
- Auto-deploy — GitHub webhook receiver with HMAC-SHA256 verification, triggers `deploy.sh` on push to main
- Hot-reload — update bot config, topics, persona without restarting the server
| Provider | Model | Dimensions | Cost |
|---|---|---|---|
| Google Gemini (default) | `gemini-embedding-001` | 3072 | Free tier |
| OpenAI | `text-embedding-3-small` | 1536 | $0.02/1M tokens |
| Ollama | `nomic-embed-text` | 768 | Free (local) |
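The hybrid search described above fuses vector and full-text rankings with Reciprocal Rank Fusion. A minimal sketch of RRF — `k = 60` is the common default from the literature, not necessarily the constant Qragy uses:

```javascript
// Reciprocal Rank Fusion: each document's fused score is the sum of
// 1 / (k + rank) over every ranked list it appears in.
function rrf(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((docId, idx) => {
      const rank = idx + 1; // ranks are 1-based
      scores.set(docId, (scores.get(docId) || 0) + 1 / (k + rank));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([docId]) => docId);
}
```

RRF needs only rank positions, not comparable scores, which is why it works well for merging a vector ranking with a full-text (BM25-style) ranking.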
Full browser-based management — no code, no CLI needed. 25 panels organized by function:
| Panel | Description |
|---|---|
| Live Chats | Active conversations with real-time updates, handoff status indicators |
| Closed Chats | Full chat histories with search, assignment, priority, internal notes |
| Search | Full-text search across all conversations and tickets |
| Agent Inbox (HITL) | Human-in-the-loop: claim conversations, live SSE updates, type replies in real-time |
| Panel | Description |
|---|---|
| Knowledge Base | CRUD for Q&A entries, file upload (PDF/DOCX/XLSX/TXT), one-click vector re-embed |
| Auto-FAQ | Auto-generated FAQ entries from ticket patterns, one-click add to KB |
| Content Gaps | Unanswered questions the bot couldn't handle, prioritized by frequency |
| Panel | Description |
|---|---|
| Agent Files | Edit persona, soul, domain rules, hard-bans, response policy, escalation matrix — all from the browser |
| Topics | Create/edit/delete structured support flows with step-by-step guides and escalation rules |
| Memory Templates | Configure what user facts the bot should remember across conversations |
| Environment | View and edit .env variables at runtime |
| Chat Flow | Configure conversation behavior: greeting, fallback, collection mode, state transitions |
| Site Config | Widget appearance, company branding, business hours |
| Prompt Versions | Version history of all prompt changes with one-click rollback |
| Panel | Description |
|---|---|
| Webhooks | Configure webhook endpoints, select event types, test delivery, view delivery logs |
| Zendesk Sunshine | Two-way Zendesk Sunshine Conversations integration config |
| WhatsApp | Direct WhatsApp Cloud API integration setup |
| Panel | Description |
|---|---|
| Dashboard | Daily metrics, top topics, resolution rates, SVG charts |
| Analytics | Detailed conversation analytics, response time tracking |
| Feedback Report | User satisfaction (CSAT 1-5), thumbs up/down tracking |
| SLA Monitoring | Response time SLA compliance tracking |
| Panel | Description |
|---|---|
| Bot Test | Multi-chat grid — open parallel sessions to test the bot side by side |
| Admin Assistant | AI-powered assistant that can read/update config, manage KB, execute admin actions |
| Eval Management | Run eval scenarios, view results, manage test suites (see Eval System) |
| System Health | Uptime, memory usage, LLM health, audit log, database stats, backup/export |
| Setup Wizard | First-run onboarding: API key, bot name, persona, knowledge base setup — no .env editing needed |
Built-in evaluation framework for testing chatbot quality. Run scenarios from the admin panel or CLI.
- Scenarios — multi-turn conversation scripts with expected outcomes defined in `tests/eval/scenarios.json`
- Judge — rule-based assertion engine (`judge.js`) that checks bot responses against 10 assertion types
- Runner — sends messages to the live chat API, collects responses, runs assertions
- Consensus — runs each scenario N times (default 3), distinguishes real failures from flaky results
| Assertion | Description |
|---|---|
| `shouldContainAny` | Response must contain at least one of the specified strings |
| `shouldNotContain` | Response must not contain the specified string |
| `shouldNotContainAny` | Response must not contain any of the specified strings |
| `stateShouldBe` | Conversation state must match expected value |
| `topicShouldBe` | Detected topic must match expected value |
| `handoffReady` | Escalation handoff must be triggered (or not) |
| `earlyEscalation` | Check for premature escalation |
| `branchCodeShouldBe` | Extracted branch code must match |
| `isFarewell` | Response should (not) be a farewell message |
| `shouldNotRepeatPrevious` | Response must differ from previous reply (Jaccard similarity < 0.6) |
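The `shouldNotRepeatPrevious` check can be pictured as token-set Jaccard similarity. This is an illustrative sketch, not `judge.js` itself — whitespace tokenization and lowercasing are assumptions:

```javascript
// Jaccard similarity over word sets: |A ∩ B| / |A ∪ B|.
// The 0.6 threshold matches the assertion table; tokenization is assumed.
function jaccard(a, b) {
  const A = new Set(a.toLowerCase().split(/\s+/).filter(Boolean));
  const B = new Set(b.toLowerCase().split(/\s+/).filter(Boolean));
  if (A.size === 0 && B.size === 0) return 1; // two empty strings are identical
  let inter = 0;
  for (const w of A) if (B.has(w)) inter++;
  return inter / (A.size + B.size - inter);
}

function shouldNotRepeatPrevious(reply, previous) {
  return jaccard(reply, previous) < 0.6;
}
```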
From admin panel:
- Open Eval Management panel
- Click Run on a single scenario or Run All for the full suite
- SSE streaming shows real-time progress with pass/fail results
- Results are saved to history with pass rate and duration
From CLI:
```bash
npx vitest run tests/eval/chatbot-eval.test.js
```

When running all scenarios, each is executed 3 times (configurable). If at least one run passes, the scenario is marked as flaky (acceptable). If all 3 runs fail, it's a real failure. This eliminates false negatives from LLM non-determinism.
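The pass/flaky/fail consensus rule can be sketched as follows (illustrative, not the actual runner code):

```javascript
// Classify a scenario from its N pass/fail runs:
// all runs pass → "pass", some pass → "flaky" (acceptable), none pass → "fail".
function consensus(runResults) {
  const passes = runResults.filter(Boolean).length;
  if (passes === runResults.length) return "pass";
  if (passes > 0) return "flaky";
  return "fail";
}
```

Treating "at least one pass" as acceptable trades strictness for stability: a single flaky LLM response no longer turns a green suite red.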
Add Qragy to any website with one script tag:
```html
<script>
  window.__QRAGY_API = "https://your-qragy-server.com";
</script>
<script src="https://your-qragy-server.com/embed.js"></script>
```

Cross-origin communication via `postMessage`. Customizable appearance from the admin panel.
Two-way integration: bot handles initial conversation, escalates to Zendesk agent when needed. Zendesk agent replies flow back through the bot.
Direct WhatsApp Business Cloud API integration. Configure from the admin panel — no middleware required.
Long-polling Telegram bot. Enable with `TELEGRAM_ENABLED=true` and your bot token.
Keep Qragy public and customer deployments private by treating Qragy as the core app and each tenant as an overlay.
- Keep public code in the Qragy repo.
- Keep private KB, prompts, topics, branding, `.env`, uploads, and deploy scripts in a separate private instance directory.
- Point Qragy to that directory with `QRAGY_INSTANCE_DIR=/srv/qragy-instance` or with explicit `QRAGY_*` path overrides.
Example private instance layout:
```
/srv/qragy-instance/
├── .env
├── agent/
├── memory/
├── public/
├── data/
└── knowledge_base.example.csv
```
At runtime, Qragy reads `agent/`, `memory/`, `data/`, `public/`, and the knowledge-base CSV from the private instance directory, while the application code stays in the public repo.
Send signed events to any endpoint (Slack, n8n, Zapier, custom):
- HMAC-SHA256 signed with `X-Qragy-Signature` header
- Event filtering — subscribe to specific events or `*` wildcard
- Retry — 3 attempts with exponential backoff (1s, 2s, 4s)
- Delivery log — last 200 deliveries with status and response
- Max 10 webhooks per event
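The retry schedule (3 attempts, 1s/2s/4s backoff) can be sketched like this — `deliver` is a hypothetical function that resolves on a 2xx response and rejects otherwise, and the parameterized base delay is for illustration:

```javascript
// Retry a webhook delivery with exponential backoff matching the schedule
// described above: waits of baseDelayMs * 2^i between attempts (1s, 2s, 4s).
async function deliverWithRetry(deliver, payload, attempts = 3, baseDelayMs = 1000) {
  for (let i = 0; i < attempts; i++) {
    try {
      return await deliver(payload);
    } catch (err) {
      if (i === attempts - 1) throw err; // out of attempts: surface the error
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** i));
    }
  }
}
```

Exponential backoff gives a briefly unavailable endpoint time to recover without hammering it, while still failing fast enough to land in the delivery log.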
GitHub webhook receiver at `/deploy`. Verifies `X-Hub-Signature-256`, triggers `deploy.sh` on push to main.
```mermaid
graph TB
    subgraph Clients
        CW[Chat Widget]
        TG[Telegram Bot]
        EMB[Embed Script]
        WA[WhatsApp Cloud API]
        SC[Zendesk Sunshine]
    end

    subgraph "Qragy Server (single process)"
        EXP[Express.js API]
        TC[Topic Classifier]
        RAG[Adaptive RAG Pipeline]
        CRAG[CRAG Evaluator]
        QS[Quality Scorer + Reflexion]
        MEM[Core Memory]
        HITL[Agent Inbox / HITL]
        AST[Admin Assistant]
        TKT[Ticket System]
        ADM[Admin Panel - 25 panels]
        WH[Webhook Dispatcher]
        EVAL[Eval Runner]
        IG[Injection Guard]
    end

    subgraph Storage["Local Storage (no external DB)"]
        LDB[(LanceDB<br/>Vector Index)]
        SQL[(SQLite<br/>Tickets + Analytics + Graph)]
        CSV[(CSV + JSON<br/>Knowledge Base + Config)]
    end

    subgraph External
        LLM[LLM Provider<br/>Gemini / OpenAI / Claude<br/>Groq / Mistral / DeepSeek / Ollama]
        ZD[Zendesk Handoff]
        COH[Cohere Reranker]
    end

    CW & TG & EMB & WA & SC -->|HTTP / Long Poll / Webhook| EXP
    EXP --> IG -->|safe| TC
    TC -->|topic match| RAG
    RAG -->|vector + full-text| LDB
    RAG -->|CRAG check| CRAG
    RAG -->|read KB| CSV
    RAG -->|rerank| COH
    RAG -->|generate| LLM
    LLM --> QS -->|low quality| LLM
    EXP --> MEM --> SQL
    EXP --> TKT --> SQL
    EXP --> HITL -->|live SSE| ADM
    TKT -->|escalate| ZD
    TKT -->|notify| WH
    EXP --> AST -->|actions| ADM
    EVAL -->|test scenarios| EXP
```
- Input — user sends a message via widget, Telegram, WhatsApp, or Zendesk
- Injection guard — regex patterns + LLM relevance check filter malicious input
- Topic detection — keyword matching + AI classification routes to structured flow
- Adaptive RAG — FAST/STANDARD/DEEP pipeline based on query complexity
- CRAG evaluation — checks search result relevance, rewrites query if needed
- LLM generation — contextual reply with persona, topic guide, and memory
- Quality scoring — evaluates response quality, triggers reflexion if low
- Collection — bot gathers required fields step-by-step (deterministic mode)
- Escalation — hands off to Zendesk, agent inbox, or webhook when criteria are met
- Factory + DI — services use `createXxxService(deps)`, all dependencies injected
- Route mounting — routes use `mount(app, deps)` pattern
- CommonJS — all modules use `require()` / `module.exports`
- Getter closures — mutable runtime config accessed via `() => VALUE` getters
- Fire-and-forget — memory updates, analytics, graph building run async without blocking the response
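A minimal illustration of the factory + DI and getter-closure patterns together. The service here is hypothetical, not actual Qragy code:

```javascript
// Factory + dependency injection: the service receives everything it needs
// via a deps object, and reads mutable runtime config through a getter
// closure, so hot-reloaded values are seen without restarting the process.
function createGreetingService({ getBotName, logger }) {
  return {
    greet(user) {
      logger(`greeting ${user}`);
      return `Hi ${user}, I'm ${getBotName()}.`; // getter read at call time
    },
  };
}

// Usage: the config value can change between calls and the service
// automatically picks up the new value through the getter.
let botName = "QRAGY Bot";
const svc = createGreetingService({
  getBotName: () => botName,
  logger: () => {},
});
```

Passing `() => botName` instead of the string itself is what makes hot-reload work: the service never caches a stale snapshot of the config.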
```bash
docker run -d -p 3001:3000 \
  -e GOOGLE_API_KEY=your_key \
  -v qragy-data:/app/data \
  ghcr.io/mahsumaktas/qragy
```

Open http://localhost:3001 — done.
```bash
# Clone & install
git clone https://github.com/mahsumaktas/qragy.git
cd qragy && npm install

# Configure (only GOOGLE_API_KEY is required)
cp .env.example .env
# Get a free key at https://aistudio.google.com

# Ingest your knowledge base
node scripts/ingest.js

# Run
npm start
```

Open localhost:3000 for the chatbot, localhost:3000/admin for the admin panel.

> First run? The setup wizard at `/admin` guides you through API key, bot name, persona, and knowledge base — no `.env` editing needed.
Qragy supports 7 LLM providers out of the box via raw `fetch()` — zero extra dependencies.
Gemini (default):

```env
GOOGLE_API_KEY=your_key_here
```

OpenAI:

```env
LLM_PROVIDER=openai
LLM_API_KEY=sk-...
LLM_MODEL=gpt-4o-mini
EMBEDDING_PROVIDER=openai
EMBEDDING_MODEL=text-embedding-3-small
```

Anthropic (Claude):

```env
LLM_PROVIDER=anthropic
LLM_API_KEY=sk-ant-...
LLM_MODEL=claude-sonnet-4-6
```

Groq:

```env
LLM_PROVIDER=groq
LLM_API_KEY=gsk_...
LLM_MODEL=llama-3.3-70b-versatile
```

Mistral:

```env
LLM_PROVIDER=mistral
LLM_API_KEY=...
LLM_MODEL=mistral-large-latest
```

DeepSeek:

```env
LLM_PROVIDER=deepseek
LLM_API_KEY=...
LLM_MODEL=deepseek-chat
LLM_BASE_URL=https://api.deepseek.com/v1
```

Ollama (local):

```env
LLM_PROVIDER=ollama
LLM_MODEL=llama3.2
LLM_BASE_URL=http://localhost:11434/v1
EMBEDDING_PROVIDER=ollama
EMBEDDING_MODEL=nomic-embed-text
EMBEDDING_BASE_URL=http://localhost:11434
```

All `GOOGLE_*` environment variables continue to work for backward compatibility. See `.env.example` for all options.
Docker Compose:

```bash
git clone https://github.com/mahsumaktas/qragy.git
cd qragy
cp .env.example .env   # add your API key
docker compose up -d
```

PM2:

```bash
git clone https://github.com/mahsumaktas/qragy.git
cd qragy && npm install
cp .env.example .env   # add your GOOGLE_API_KEY
node scripts/ingest.js
npm install -g pm2
pm2 start server.js --name qragy
pm2 save && pm2 startup
```

Works on any machine with Node.js 18+. No Docker required, but runs fine in a container too.
| Variable | Description | Default |
|---|---|---|
| `GOOGLE_API_KEY` | Gemini API key (required for Gemini) | — |
| `LLM_PROVIDER` | `gemini`, `openai`, `anthropic`, `groq`, `mistral`, `deepseek`, `ollama` | `gemini` |
| `LLM_API_KEY` | API key (falls back to `GOOGLE_API_KEY`) | — |
| `LLM_MODEL` | Chat model | — |
| `LLM_BASE_URL` | Custom base URL (Ollama, DeepSeek, etc.) | — |
| `GOOGLE_FALLBACK_MODEL` | Fallback model on error | — |
| `EMBEDDING_PROVIDER` | Embedding provider | `gemini` |
| `EMBEDDING_MODEL` | Embedding model | `gemini-embedding-001` |
| `BOT_NAME` | Bot display name | QRAGY Bot |
| `COMPANY_NAME` | Your company name | — |
| `ADMIN_TOKEN` | Admin panel password | — |
| `PORT` | Server port | 3000 |
| `RATE_LIMIT_ENABLED` | Per-IP rate limiting | `true` |
| `RATE_LIMIT_MAX` | Max requests per window | 20 |
| `DETERMINISTIC_COLLECTION_MODE` | Structured info gathering | `true` |
| `SUPPORT_HOURS_ENABLED` | Enforce business hours | `false` |
| `ZENDESK_ENABLED` | Enable Zendesk handoff | `false` |
| `TELEGRAM_ENABLED` | Enable Telegram bot | `false` |
| `TELEGRAM_BOT_TOKEN` | Telegram Bot API token | — |
| `WHATSAPP_ENABLED` | Enable WhatsApp integration | `false` |
| `WEBHOOK_ENABLED` | Enable webhook notifications | `false` |
| `WEBHOOK_URL` | Webhook endpoint URL | — |
| `WEBHOOK_SECRET` | HMAC-SHA256 signing secret | — |
Full list in .env.example.
Add Qragy to any website:
```html
<script>
  window.__QRAGY_API = "https://your-qragy-server.com";
</script>
<script src="https://your-qragy-server.com/embed.js"></script>
```

```
qragy/
├── server.js                          # Express app, middleware, route mounting
├── src/
│   ├── config/index.js                # Centralized env config loader + validation
│   ├── routes/
│   │   ├── admin/                     # Admin panel API (25 panels)
│   │   │   ├── index.js               # Route aggregator
│   │   │   ├── agent.js               # Agent config CRUD
│   │   │   ├── analytics.js           # Analytics + feedback
│   │   │   ├── config.js              # Runtime config management
│   │   │   ├── eval.js                # Eval CRUD + SSE test runner
│   │   │   ├── insights.js            # SLA, auto-FAQ, content gaps
│   │   │   ├── knowledge.js           # Knowledge base CRUD + upload
│   │   │   ├── system.js              # Health, audit log, backup
│   │   │   ├── tickets.js             # Ticket CRUD + bulk ops + prompt versions
│   │   │   └── webhooks.js            # Webhook config + test
│   │   ├── chat.js                    # POST /api/chat — main endpoint
│   │   ├── conversation.js            # Handoff, CSAT, upload, session
│   │   ├── deploy.js                  # GitHub webhook auto-deploy
│   │   ├── health.js                  # GET /api/health
│   │   └── widget.js                  # Widget configuration
│   ├── services/
│   │   ├── pipeline/chatPipeline.js   # Adaptive RAG pipeline (FAST/STANDARD/DEEP)
│   │   ├── rag/
│   │   │   ├── queryAnalyzer.js       # Query intent analysis
│   │   │   ├── reranker.js            # 3-tier reranker (Cohere > LLM > text)
│   │   │   ├── cragEvaluator.js       # Corrective RAG evaluation
│   │   │   └── contextualChunker.js   # Context-aware document chunking
│   │   ├── intelligence/
│   │   │   ├── qualityScorer.js       # Response quality scoring
│   │   │   ├── reflexion.js           # Self-reflection on low-quality answers
│   │   │   └── graphBuilder.js        # Knowledge graph builder
│   │   ├── memory/coreMemory.js       # MemGPT-style persistent user memory
│   │   ├── webChatPipeline.js         # Web chat orchestration
│   │   ├── chatProcessor.js           # Chat message processing
│   │   ├── topic.js                   # Topic classification (keyword + AI)
│   │   ├── escalation.js              # Escalation rule evaluation
│   │   ├── conversationManager.js     # Conversation CRUD + state tracking
│   │   ├── ticketStore.js             # Ticket CRUD + duplicate detection
│   │   ├── analytics.js               # Event buffer + daily aggregation
│   │   ├── knowledge.js               # KB search + content gaps
│   │   ├── webhooks.js                # HMAC-signed delivery with retry
│   │   ├── llmHealth.js               # LLM circuit breaker
│   │   ├── supportHours.js            # Business hours calculation
│   │   └── responseValidator.js       # Bot response safety validation
│   ├── prompt/builder.js              # System prompt assembly + token budgeting
│   ├── middleware/
│   │   ├── auth.js                    # Admin token authentication
│   │   ├── rateLimiter.js             # Per-IP rate limiting
│   │   ├── security.js                # CORS, Helmet, security headers
│   │   └── injectionGuard.js          # Prompt injection detection (regex + LLM)
│   └── utils/
│       ├── sanitizer.js               # PII masking, text normalization
│       ├── logger.js                  # Structured logger [ISO] [LEVEL] [context]
│       ├── validators.js              # Input validators (email, phone, etc.)
│       └── ...                        # Session, error helpers, CSV tools
├── lib/
│   ├── providers.js                   # Multi-model LLM + embedding abstraction
│   ├── chunker.js                     # Document chunking engine
│   └── db.js                          # SQLite database layer
├── agent/                             # Bot personality & configuration
│   ├── soul.md, persona.md            # Identity & tone
│   ├── domain.md                      # Domain knowledge rules
│   ├── topics/                        # Structured support flow definitions
│   ├── escalation-matrix.md           # Escalation rules
│   ├── hard-bans.md                   # Banned topics/responses
│   ├── output-filter.md               # Output filtering rules
│   └── templates/                     # Industry templates (e-commerce, restaurant, tech support)
├── tests/
│   ├── unit/                          # 108 test files, 560+ tests
│   ├── eval/
│   │   ├── scenarios.json             # 85 eval scenarios
│   │   ├── judge.js                   # Rule-based assertion engine
│   │   └── chatbot-eval.test.js       # Vitest eval runner
│   └── integration/                   # Integration tests
├── public/                            # Frontend (vanilla JS, no build step)
│   ├── admin.html, admin.js, admin.css  # Admin panel
│   ├── embed.js                       # Embeddable widget script
│   └── sw.js                          # Service worker (PWA)
├── scripts/ingest.js                  # CSV → LanceDB embedding ingestion
├── Dockerfile                         # Docker image
├── docker-compose.yml                 # Container setup
├── .github/workflows/ci.yml           # CI: lint + test + coverage
└── data/                              # Runtime data (auto-created, gitignored)
```
- Node.js 18+ (20+ recommended)
- npm
```bash
git clone https://github.com/mahsumaktas/qragy.git
cd qragy && npm install
cp .env.example .env   # add your API key
```

```bash
npm test                # Run all unit + integration tests (560+)
npm run test:coverage   # Run with V8 coverage report
```

```bash
# Start the server first, then:
npx vitest run tests/eval/chatbot-eval.test.js
```

Or use the Eval Management panel in the admin UI for a visual interface with SSE streaming.

```bash
npx eslint .        # Check for lint errors
npx eslint . --fix  # Auto-fix fixable issues
```

- ESLint with flat config — enforces `eqeqeq`, `no-var`, `prefer-const`
- Vitest for unit + integration testing with V8 coverage
- CI pipeline — lint + test on every push/PR to main
- Eval framework — 85 scenarios with consensus runs for regression testing
Interactive API documentation is available at `/api-docs` when the server is running.
All admin endpoints require the `x-admin-token` header when `ADMIN_TOKEN` is set.
View all endpoints
**Chat**

- `POST /api/chat` — Send message, get AI response

**Tickets & Dashboard**

- `GET /api/admin/summary` — Dashboard stats
- `GET /api/admin/tickets` — List tickets (with pagination, filters)
- `GET /api/admin/tickets/:id` — Ticket detail with full conversation
- `PUT /api/admin/tickets/:id/assign` — Assign to team member
- `PUT /api/admin/tickets/:id/priority` — Set priority level
- `POST /api/admin/tickets/:id/notes` — Add internal note

**Knowledge Base**

- `GET /api/admin/knowledge` — List entries
- `POST /api/admin/knowledge` — Add entry
- `PUT /api/admin/knowledge/:id` — Update entry
- `DELETE /api/admin/knowledge/:id` — Delete entry
- `POST /api/admin/knowledge/reingest` — Rebuild vector index
- `POST /api/admin/knowledge/upload` — Upload PDF/DOCX/XLSX/TXT

**Agent Configuration**

- `GET/PUT /api/admin/agent/files/:name` — Read/write agent files
- `GET/POST/PUT/DELETE /api/admin/agent/topics/:id` — Topic CRUD
- `GET/PUT /api/admin/agent/memory/:name` — Memory templates
- `GET/PUT /api/admin/env` — Environment variables
- `POST /api/admin/agent/reload` — Hot-reload config

**Agent Inbox (HITL)**

- `GET /api/admin/inbox/stream` — SSE live updates
- `GET /api/admin/inbox/conversations` — List claimable conversations
- `POST /api/admin/inbox/:id/claim` — Claim conversation
- `POST /api/admin/inbox/:id/message` — Send agent reply
- `POST /api/admin/inbox/:id/release` — Release conversation

**Admin Assistant**

- `POST /api/admin/assistant` — Send message (supports file context)

**Evals**

- `GET /api/admin/eval/scenarios` — List all scenarios
- `GET /api/admin/eval/scenarios/:id` — Get single scenario
- `POST /api/admin/eval/scenarios` — Create scenario
- `PUT /api/admin/eval/scenarios/:id` — Update scenario
- `DELETE /api/admin/eval/scenarios/:id` — Delete scenario
- `POST /api/admin/eval/run/:id` — Run single scenario
- `GET /api/admin/eval/run-all` — SSE stream: run all scenarios with consensus
- `GET /api/admin/eval/history` — Run history
- `DELETE /api/admin/eval/history` — Clear history

**Analytics & Insights**

- `GET /api/admin/analytics` — Metrics and charts
- `GET /api/admin/insights/sla` — SLA monitoring
- `GET /api/admin/insights/auto-faq` — Auto-generated FAQ
- `GET /api/admin/insights/content-gaps` — Content gap detection
- `GET /api/admin/insights/feedback` — Feedback reports

**Webhooks**

- `GET /api/admin/webhooks/config` — Get config
- `PUT /api/admin/webhooks/config` — Update config
- `POST /api/admin/webhooks/test` — Send test webhook

**Prompt Versions**

- `GET /api/admin/agent/versions` — List versions
- `POST /api/admin/agent/versions/rollback` — Rollback to previous version

**System**

- `GET /api/admin/system` — Health info, uptime, memory, DB stats
- `GET /api/health` — Basic health check

**Conversation**

- `POST /api/conversation/:id/csat` — Submit CSAT rating (1-5)
- `POST /api/conversation/:id/feedback` — Thumbs up/down + reflexion trigger
- `POST /api/conversation/upload` — File upload (image, PDF, 5MB limit)
| Layer | Technology |
|---|---|
| Runtime | Node.js 18+ |
| Framework | Express.js |
| AI | Gemini, OpenAI, Anthropic, Groq, Mistral, DeepSeek, Ollama |
| Vector DB | LanceDB (embedded, serverless) |
| Database | SQLite (better-sqlite3) |
| Embeddings | Gemini / OpenAI / Ollama (configurable) |
| Reranking | Cohere API / LLM-based / text similarity (3-tier fallback) |
| Frontend | Vanilla JS — zero build step |
| Storage | JSON config, CSV knowledge base, LanceDB + SQLite files |
| Container | Docker (optional) |
| CI/CD | GitHub Actions |
| PWA | Service worker + web push |
We welcome contributions! See CONTRIBUTING.md for guidelines.
AGPL-3.0 — free for open-source use. Commercial/SaaS use requires sharing your modifications under the same license.
Built by Mahsum Aktas
