MinuteHanD/onebox

Onebox Email Aggregator

This application implements a onebox email aggregator with real-time IMAP synchronization, Elasticsearch search, AI-based categorization, Slack/Webhook notifications, and AI-powered suggested replies via a vector database (Qdrant).

Architecture

Components:

  • Backend API (Node.js, TypeScript, Express)
  • IMAP persistent connections (multi-account, IDLE) — real-time updates, no cron
  • Elasticsearch for indexing and search (Docker)
  • Qdrant vector database for RAG knowledge (Docker)
  • OpenAI for categorization and RAG (with a rule-based fallback when OpenAI rate limits are hit)
  • Frontend UI (React + Vite)
  • Notifications via Slack and generic Webhooks

Data flow:

  1. IMAP connects to each account, syncs the last 30 days of each folder, then enables IDLE for live updates.
  2. Emails are parsed and bulk indexed into Elasticsearch.
  3. New emails (from IDLE) are categorized by AI; an email categorized as Interested triggers a Slack notification and a webhook call.
  4. REST API exposes search, retrieval, actions, health and RAG endpoints.
  5. Frontend consumes the API to display, search, filter, and request suggested replies.
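
As an illustration of step 3, the sketch below decides which notifications a newly categorized live email should produce. It is not the project's actual code: the helper name planNotifications, the EmailSummary shape, and the generic webhook payload are assumptions. The only external fact it relies on is that Slack incoming webhooks accept a JSON body with a "text" field.

```typescript
// Categories as listed in this README.
type Category =
  | "Interested"
  | "Meeting Booked"
  | "Not Interested"
  | "Spam"
  | "Out Of Office";

// Hypothetical minimal view of an indexed email.
interface EmailSummary {
  id: string;
  from: string;
  subject: string;
}

interface OutgoingRequest {
  url: string;
  body: string; // JSON-encoded payload for an HTTP POST
}

// Hypothetical helper: only "Interested" emails produce notifications,
// and only for the targets that are actually configured.
function planNotifications(
  email: EmailSummary,
  category: Category,
  targets: { slackWebhookUrl?: string; webhookUrl?: string }
): OutgoingRequest[] {
  if (category !== "Interested") return [];

  const requests: OutgoingRequest[] = [];
  if (targets.slackWebhookUrl) {
    // Slack incoming webhooks accept a JSON body with a "text" field.
    requests.push({
      url: targets.slackWebhookUrl,
      body: JSON.stringify({
        text: `Interested lead: "${email.subject}" from ${email.from}`,
      }),
    });
  }
  if (targets.webhookUrl) {
    // The payload shape for the generic webhook is an assumption.
    requests.push({
      url: targets.webhookUrl,
      body: JSON.stringify({ event: "email.interested", email }),
    });
  }
  return requests;
}
```

Each returned request would then be sent with an HTTP POST and a Content-Type of application/json. Keeping the decision separate from the network call makes it easy to test without contacting Slack.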

[Figure: Architecture Diagram]

Prerequisites

  • Node.js 18+ and npm
  • Docker and Docker Compose
  • Two IMAP accounts with app passwords (Gmail recommended)
  • OpenAI API key (for categorization and RAG)
  • Optional: Slack webhook URL, external webhook URL (e.g., webhook.site)

Quick Start

  1. Start required services (Elasticsearch, Kibana, Qdrant)
docker-compose up -d
  2. Configure backend environment
# Create and edit backend/.env
# Fill in IMAP accounts, OpenAI, Slack/Webhook as needed

Key variables (examples):

  • Server: PORT=3000, NODE_ENV=development
  • Elasticsearch: ELASTICSEARCH_URL=http://localhost:9200, ELASTICSEARCH_INDEX=reachinbox_emails
  • IMAP (two accounts min): IMAP1_*, IMAP2_* (host, port, user, password, tls)
  • OpenAI: OPENAI_API_KEY=..., OPENAI_ENABLED=true, OPENAI_MAX_CONCURRENT=2
  • Slack: SLACK_WEBHOOK_URL=...
  • Webhook: WEBHOOK_URL=...
  • Qdrant (Vector DB): QDRANT_URL=http://localhost:6333, KNOWLEDGE_INDEX=reachinbox_knowledge, EMBEDDING_MODEL=text-embedding-3-small, EMBEDDING_DIMS=1536
  3. Run backend
cd backend
npm install
npm run dev

Backend: http://localhost:3000

  4. Run frontend
cd frontend
npm install
npm run dev

Frontend: http://localhost:3001
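
To make the IMAP1_*, IMAP2_* convention concrete, here is a minimal sketch of how numbered account variables could be collected from the environment. The exact suffixes (_HOST, _PORT, _USER, _PASSWORD, _TLS), the generated account ids, and the defaults are assumptions; check the backend's configuration code for the real names.

```typescript
interface ImapAccount {
  id: string;
  host: string;
  port: number;
  user: string;
  password: string;
  tls: boolean;
}

type Env = { [key: string]: string | undefined };

// Collect IMAP1_*, IMAP2_*, ... until a numbered host variable is missing.
// Variable suffixes are assumed, not taken from the project.
function loadImapAccounts(env: Env): ImapAccount[] {
  const accounts: ImapAccount[] = [];
  for (let n = 1; env[`IMAP${n}_HOST`] !== undefined; n++) {
    const prefix = `IMAP${n}_`;
    const host = env[prefix + "HOST"];
    const user = env[prefix + "USER"];
    const password = env[prefix + "PASSWORD"];
    if (!host || !user || !password) {
      throw new Error(`${prefix}HOST, ${prefix}USER and ${prefix}PASSWORD are required`);
    }
    accounts.push({
      id: `account${n}`, // hypothetical id scheme
      host,
      port: Number(env[prefix + "PORT"] ?? "993"), // 993 is the standard IMAP-over-TLS port
      user,
      password,
      tls: (env[prefix + "TLS"] ?? "true").toLowerCase() !== "false",
    });
  }
  // Mirrors the "two accounts min" note above.
  if (accounts.length < 2) {
    throw new Error("At least two IMAP accounts must be configured");
  }
  return accounts;
}
```

In a Node.js process this would be called as loadImapAccounts(process.env) after the .env file has been loaded.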

API Endpoints

Base path: /api/emails

Email search and retrieval:

  • GET /search — supports: query, accountId, folder, category, from, to, subject, dateFrom, dateTo, isRead, isStarred, hasAttachments, limit, offset
  • GET /:emailId — fetch a single email by ID
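
A client-side sketch of building a search request from the parameters above. The helper name buildSearchUrl is hypothetical, only a subset of the listed parameters is typed here, and the backend URL is passed in rather than assumed.

```typescript
// Subset of the query parameters listed above; all are optional.
type SearchParams = {
  query?: string;
  accountId?: string;
  folder?: string;
  category?: string;
  from?: string;
  to?: string;
  isRead?: boolean;
  limit?: number;
  offset?: number;
};

// Build a GET /api/emails/search URL, skipping parameters that are unset.
function buildSearchUrl(baseUrl: string, params: SearchParams): string {
  const record: { [key: string]: string | number | boolean | undefined } = params;
  const pairs: string[] = [];
  for (const key of Object.keys(record)) {
    const value = record[key];
    if (value === undefined) continue;
    pairs.push(`${encodeURIComponent(key)}=${encodeURIComponent(String(value))}`);
  }
  const queryString = pairs.length > 0 ? `?${pairs.join("&")}` : "";
  return `${baseUrl}/api/emails/search${queryString}`;
}
```

For example, buildSearchUrl("http://localhost:3000", { query: "pricing call", folder: "inb", limit: 20 }) yields a URL that can be passed directly to fetch.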

Email actions:

  • PATCH /:emailId/category — body: { "category": "Interested|Meeting Booked|Not Interested|Spam|Out Of Office" }
  • PATCH /:emailId/read — body: { "isRead": true|false }
  • PATCH /:emailId/starred — body: { "isStarred": true|false }
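
The category action only accepts the five values shown above, so a client may want to validate before sending. The following is a sketch under that assumption; buildCategoryPatch and PatchRequest are hypothetical names, not part of the project.

```typescript
const CATEGORIES = [
  "Interested",
  "Meeting Booked",
  "Not Interested",
  "Spam",
  "Out Of Office",
] as const;
type EmailCategory = (typeof CATEGORIES)[number];

// Type guard: narrows an arbitrary string to one of the known categories.
function isEmailCategory(value: string): value is EmailCategory {
  return (CATEGORIES as readonly string[]).indexOf(value) !== -1;
}

interface PatchRequest {
  url: string;
  method: "PATCH";
  headers: { "Content-Type": string };
  body: string;
}

// Build the request for PATCH /api/emails/:emailId/category,
// rejecting categories the API does not list.
function buildCategoryPatch(baseUrl: string, emailId: string, category: string): PatchRequest {
  if (!isEmailCategory(category)) {
    throw new Error(`Unknown category: ${category}`);
  }
  return {
    url: `${baseUrl}/api/emails/${encodeURIComponent(emailId)}/category`,
    method: "PATCH",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ category }),
  };
}
```

The result can be sent with fetch(req.url, { method: req.method, headers: req.headers, body: req.body }). The read and starred actions follow the same pattern with their own body fields.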

Accounts and folders:

  • GET /accounts/list
  • GET /accounts/:accountId/folders
  • POST /accounts/:accountId/sync

Utilities:

  • POST /recategorize — optional ?accountId=...
  • GET /categories/list
  • GET /status/health

Health:

  • GET /health

RAG (AI-Powered Suggested Replies):

  • POST /rag/suggest-reply — body: { "emailId": "..." } — generates a contextual reply
  • GET /rag/status — RAG service status and knowledge base info
  • GET /rag/knowledge — current (seeded) knowledge items
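
The retrieval step behind /rag/suggest-reply can be pictured as follows. This is a simplified, in-memory stand-in: in the application the vectors live in Qdrant and the embeddings come from OpenAI, whereas here the vectors are supplied by the caller and ranked locally with cosine similarity. All names and the prompt wording are assumptions.

```typescript
interface KnowledgeItem {
  text: string;
  vector: number[]; // embedding of `text`
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Return the k knowledge items most similar to the query vector.
function topK(query: number[], items: KnowledgeItem[], k: number): KnowledgeItem[] {
  return items
    .map((item) => ({ item, score: cosineSimilarity(query, item.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.item);
}

// Assemble the prompt sent to the language model (wording is illustrative).
function buildReplyPrompt(emailBody: string, context: KnowledgeItem[]): string {
  const notes = context.map((c, i) => `${i + 1}. ${c.text}`).join("\n");
  return [
    "Use the context below to draft a reply to the email.",
    "Context:",
    notes,
    "Email:",
    emailBody,
    "Reply:",
  ].join("\n");
}
```

With real embeddings the vectors would have EMBEDDING_DIMS entries, and the similarity search would be delegated to the vector database rather than done in application code.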

Frontend Features

  • Display list of emails with subjects, previews, folders, and categories
  • Select an email to see details and change category
  • Filters: account, category, folder, from, to; full-text search
  • Partial, case-insensitive matches for folder/from/to (e.g., "Slack", "inb")
  • One-click “Suggest Reply” using RAG

Requirements Coverage

  1. Real-Time Email Synchronization — Implemented

    • Multiple IMAP accounts (>=2)
    • Sync last 30 days across all folders
    • Persistent connections with IDLE for real-time updates (no cron)
  2. Searchable Storage using Elasticsearch — Implemented

    • Local Elasticsearch via Docker (with Kibana)
    • Indexing with mappings; search API
    • Filtering by account and folder; extended partial matching for folder/from/to
  3. AI-Based Email Categorization — Implemented

    • Uses OpenAI with a robust fallback classifier
    • Categories: Interested, Meeting Booked, Not Interested, Spam, Out Of Office
    • Batch recategorization supported
  4. Slack and Webhook Integration — Implemented

    • Slack notification on Interested emails
    • Webhook trigger on Interested emails for external automation
  5. Frontend Interface — Implemented

    • UI for listing, viewing, filtering, searching and categorization
    • Shows AI category and supports updates
  6. AI-Powered Suggested Replies (RAG) — Implemented

    • Knowledge stored in Qdrant vector DB
    • Retrieval-Augmented Generation using OpenAI
    • Seeded with example “job application / booking link” guidance
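
The fallback classifier mentioned in item 3 is not described in detail here. One plausible shape is an ordered list of keyword rules, sketched below. The keywords, the rule order, and the name classifyByRules are illustrative assumptions, not the project's actual rules.

```typescript
type Label =
  | "Interested"
  | "Meeting Booked"
  | "Not Interested"
  | "Spam"
  | "Out Of Office";

// Ordered rules: the first match wins. "Not Interested" is checked before
// "Interested" so that "not interested" is not misread as interest.
const RULES: { label: Label; keywords: string[] }[] = [
  { label: "Out Of Office", keywords: ["out of office", "on vacation", "auto-reply"] },
  { label: "Meeting Booked", keywords: ["meeting confirmed", "calendar invite", "scheduled for"] },
  { label: "Not Interested", keywords: ["not interested", "unsubscribe", "no thanks"] },
  { label: "Spam", keywords: ["lottery", "act now", "you have won"] },
  { label: "Interested", keywords: ["interested", "tell me more", "sounds good"] },
];

// Returns the first matching label, or null when no rule applies.
function classifyByRules(subject: string, body: string): Label | null {
  const text = `${subject}\n${body}`.toLowerCase();
  for (const rule of RULES) {
    if (rule.keywords.some((kw) => text.indexOf(kw) !== -1)) {
      return rule.label;
    }
  }
  return null; // no rule matched; the caller decides on a default
}
```

A classifier like this is cheap enough to run on every email, which is why it suits the case where the OpenAI call is unavailable or disabled.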

Troubleshooting

  • Elasticsearch not ready: ensure the Docker services are up and port 9200 responds.
  • Qdrant connection refused: run docker-compose up -d qdrant and check that QDRANT_URL is set.
  • IMAP (Gmail) auth: use app passwords; ensure IMAP is enabled.
  • OpenAI rate limits: temporarily set OPENAI_ENABLED=false, then use /recategorize later.
  • Slack/Webhook: verify URLs; check /api/emails/status/health.

Notes

  • Historical sync indexes emails but doesn’t trigger AI categorization to avoid heavy costs; live emails via IDLE are categorized immediately.
  • Search behavior: from/to accept names, domains or emails (partial match supported); folder supports partial matches (e.g., "inb" → INBOX).
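
One way the partial, case-insensitive matching described above could be expressed in the Elasticsearch query DSL is a wildcard query with case_insensitive enabled. This is a sketch, not the project's query: the field names (subject, body, accountId, folder, from) and the assumption that folder and from are keyword fields are hypothetical.

```typescript
type SearchFilters = {
  query?: string;
  accountId?: string;
  folder?: string;
  from?: string;
};

type EsClause = { [key: string]: unknown };

interface EsSearchBody {
  query: { bool: { must: EsClause[]; filter: EsClause[] } };
}

// Wrap a user-supplied fragment in wildcards for a "contains" match.
function containsClause(field: string, fragment: string): EsClause {
  return {
    wildcard: { [field]: { value: `*${fragment}*`, case_insensitive: true } },
  };
}

// Build a search body: full-text match on the query, exact filter on the
// account, and partial matches on folder and sender.
function buildEsQuery(filters: SearchFilters): EsSearchBody {
  const must: EsClause[] = [];
  const filter: EsClause[] = [];

  if (filters.query) {
    must.push({ multi_match: { query: filters.query, fields: ["subject", "body"] } });
  } else {
    must.push({ match_all: {} });
  }
  if (filters.accountId) filter.push({ term: { accountId: filters.accountId } });
  if (filters.folder) filter.push(containsClause("folder", filters.folder));
  if (filters.from) filter.push(containsClause("from", filters.from));

  return { query: { bool: { must, filter } } };
}
```

Leading wildcards can be slow on large indices, so an n-gram analyzer may be a better fit if search volume grows. The sketch also does not escape * or ? in user input.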
