A self-learning Text-to-SQL agent with natural language interface, built for production.
- Natural Language to SQL: Ask questions in plain English, get SQL queries and results
- Self-Learning: Improves with every interaction through feedback loops
- Multi-Database Support: PostgreSQL, MySQL, SQLite, and more
- Human-in-the-Loop: Validation for complex queries before execution
- Custom Instructions: Define project-specific rules and terminology
- Multiple Chats: Maintain context across conversations
- Plan & Confirm: See execution plans before running complex queries
┌─────────────────────────────────────────────────────────────────┐
│ Frontend (Next.js + shadcn) │
├─────────────────────────────────────────────────────────────────┤
│ Chat UI │ Projects │ Data Sources │ Custom Instructions │ KB │
└────────────────────────────────┬────────────────────────────────┘
│
┌────────────────────────────────▼────────────────────────────────┐
│ Backend API (FastAPI) │
├─────────────────────────────────────────────────────────────────┤
│ Orchestrator → Planner → SQL Agent → Validator → Learning │
└────────────────────────────────┬────────────────────────────────┘
│
┌────────────────────────────────▼────────────────────────────────┐
│ Services Layer │
├────────────┬────────────┬────────────┬────────────┬─────────────┤
│ LLM (Groq) │ DataSource │ Schema │ Knowledge │ Embeddings │
│ │ Connectors │ Indexer │ Base │ (Nomic) │
└────────────┴────────────┴────────────┴────────────┴─────────────┘
│
┌────────────────────────────────▼────────────────────────────────┐
│ Data Layer │
├─────────────────┬─────────────────┬─────────────────────────────┤
│ SQLite/PG │ ChromaDB │ User Data Sources │
│ (Metadata) │ (Vectors) │ (via Connectors) │
└─────────────────┴─────────────────┴─────────────────────────────┘
| Layer | Technology |
|---|---|
| Frontend | Next.js 14, shadcn/ui, Tailwind, Framer Motion |
| Backend | Python 3.12, FastAPI, Pydantic |
| LLM | Groq API (default), OpenAI, Ollama |
| Vector DB | ChromaDB |
| Embeddings | nomic-embed-text (via Ollama) |
| Database | SQLite (dev), PostgreSQL (prod) |
- Python 3.12+
- Node.js 20+
- Ollama (for local embeddings)
- Groq API Key (free at console.groq.com)
cd Text2Sql
# Backend
cd backend
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
# Frontend
cd ../frontend
npm installCreate .env in the backend directory:
# LLM
LLM_PROVIDER=groq
GROQ_API_KEY=your-groq-api-key
# Embeddings (Ollama)
OLLAMA_BASE_URL=http://localhost:11434
# Database
DATABASE_URL=sqlite:///./datachat.db
# Debug
DEBUG=true# Install Ollama (macOS)
brew install ollama
# Start server
ollama serve
# Pull embedding model
ollama pull nomic-embed-text# Terminal 1: Backend
cd backend
source venv/bin/activate
uvicorn backend.main:app --reload --port 8000
# Terminal 2: Frontend
cd frontend
npm run devVisit http://localhost:3000
text2sql/
├── backend/
│ ├── api/ # FastAPI routes
│ ├── agents/ # Agent implementations (SOLID)
│ │ ├── orchestrator.py
│ │ ├── planner.py
│ │ ├── sql_agent.py
│ │ ├── validator.py
│ │ └── learning.py
│ ├── services/ # Business logic
│ │ ├── llm/ # LLM providers
│ │ ├── datasources/ # DB connectors
│ │ ├── embeddings/ # Embedding service
│ │ ├── knowledge/ # Schema indexer
│ │ └── vector_store/ # ChromaDB
│ ├── models/ # Pydantic models
│ ├── db/ # SQLAlchemy
│ ├── core/ # Config & utils
│ └── main.py
│
├── frontend/
│ ├── app/ # Next.js pages
│ ├── components/ # React components
│ │ ├── ui/ # shadcn components
│ │ └── chat/ # Chat components
│ ├── lib/ # Utilities
│ └── hooks/ # React hooks
│
├── docker-compose.yaml
└── README.md
Projects organize your data sources and custom instructions.
Connect to your database:
- PostgreSQL, MySQL, SQLite
- CSV/Parquet files
- REST APIs
Ask questions naturally:
- "Show me top 10 products by revenue"
- "What's the average order value this month?"
- "List customers who churned last quarter"
Help the agent learn:
- 👍 Correct - adds to knowledge base
- 👎 Incorrect - learn from corrections
Add project-specific rules:
- "status field uses values: active, pending, cancelled"
- "Always filter by tenant_id"
- "Revenue = quantity * unit_price"
| Endpoint | Method | Description |
|---|---|---|
/api/projects |
GET | List projects |
/api/projects |
POST | Create project |
/api/chat/message |
POST | Send chat message |
/api/chat/sessions/{id} |
GET | List chat sessions |
/api/datasources |
POST | Add data source |
/api/datasources/{id}/sync |
POST | Sync schema |
/api/feedback |
POST | Submit feedback |
# Groq (default - fast & free)
LLM_PROVIDER=groq
GROQ_API_KEY=your-key
# OpenAI
LLM_PROVIDER=openai
OPENAI_API_KEY=your-key
# Ollama (local)
LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434llama-3.3-70b-versatile(default)llama-3.1-8b-instantmixtral-8x7b-32768
cd backend
pytest# Backend
black backend/
ruff check backend/
# Frontend
npm run lint# Build and run
docker-compose up -d
# View logs
docker-compose logs -f- Set
DEBUG=false - Use PostgreSQL for metadata
- Configure proper CORS origins
- Set up rate limiting
- Enable SSL/TLS
- Configure logging
- Set up monitoring
MIT