🚀 BenchPress

A prototype system for transforming real-world enterprise SQL logs into high-quality Text-to-SQL benchmarks using human-in-the-loop annotation workflows.

🔍 Overview

Enterprise Text-to-SQL addresses the challenge of building realistic, domain-specific Text-to-SQL datasets by combining:

SQL log mining
Human-in-the-loop annotation
LLM-assisted generation and validation

This system was developed as part of the BENCHPRESS project and supports benchmark creation from internal SQL query logs.

📺 Demo & Deployment

Live Demo: Coming soon
Video Walkthrough: ▶ Watch on YouTube (coming soon)
Poster Presentation: View the NEDB 2025 Poster (PDF)

📦 Features

✅ Upload and parse enterprise SQL logs
✅ Auto-cluster similar queries using LLM embeddings
✅ Generate natural language annotations with prompt-based LLMs
✅ Verify and edit annotations via an easy-to-use UI
✅ Export clean Text-to-SQL benchmark datasets
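
To make the clustering feature more concrete, here is a minimal sketch of grouping queries by embedding similarity. It is an illustration only, not the BenchPress implementation: the function names (`cosine`, `cluster_queries`), the greedy strategy, and the `0.9` threshold are all assumptions, and the toy 2-d vectors stand in for embeddings that would come from an LLM embedding model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster_queries(embeddings, threshold=0.9):
    """Greedy clustering (hypothetical): assign each query to the first
    cluster whose representative (its first member) is at least
    `threshold` similar, otherwise start a new cluster."""
    clusters = []  # each cluster is a list of query indices
    for i, vec in enumerate(embeddings):
        for members in clusters:
            if cosine(vec, embeddings[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            # No existing cluster was similar enough
            clusters.append([i])
    return clusters


# Toy vectors standing in for real query embeddings
toy_embeddings = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0]]
clusters = cluster_queries(toy_embeddings)
```

With these toy vectors, the first two queries land in one cluster and the third forms its own. A real deployment might instead delegate the similarity search to pgvector, which the requirements list as optional.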

🔧 Installation

git clone https://github.com/fabian-wenz/enterprise-txt2sql.git
cd enterprise-txt2sql
pip install -r requirements.txt

Requirements:

Python 3.9+
OpenAI API Key (or compatible LLM provider)
Optional: pgvector for clustering via vector search

🚀 Quickstart

python website/app.py

Then open your browser and go to:
http://localhost:8000

🧠 Annotation Workflow

Project Setup: Create a new annotation project for a specific enterprise workload.
Data Ingestion: Upload SQL logs and schema files, or select a public benchmark (Bird, FIBEN, Spider, Beaver).
Task Configuration: Select annotation direction (SQL→NL) and a language model (e.g., GPT-4o, GPT-3.5, DeepSeek).
(Optional) Decomposition: Split nested SQL into simpler subqueries using CTEs.
Context Retrieval: Retrieve similar annotated examples and relevant tables using dense embeddings.
Candidate Generation: LLM generates 4 NL candidates using retrieval-augmented few-shot prompting.
(Optional) Recomposition: Merge subquery descriptions into a single coherent explanation.
Human Feedback: Annotators rank, edit, or discard LLM outputs.
Review & Export: Export final annotations for training or evaluation; optionally auto-evaluate if ground truth exists.
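
The Context Retrieval and Candidate Generation steps above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the project's actual code: `retrieve_examples`, `build_prompt`, the record layout (`question`, `query`, `embedding`), and the prompt wording are hypothetical. The sketch stops at building the prompt string and makes no LLM call.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_examples(query_vec, annotated, k=2):
    """Return the k annotated examples whose embeddings are most
    similar to query_vec (hypothetical record layout)."""
    ranked = sorted(
        annotated,
        key=lambda ex: cosine(query_vec, ex["embedding"]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(sql, examples, n_candidates=4):
    """Assemble a few-shot SQL-to-NL prompt from retrieved examples."""
    lines = [
        f"Write {n_candidates} natural-language questions "
        "answered by the final SQL query."
    ]
    for ex in examples:
        lines.append(f"SQL: {ex['query']}")
        lines.append(f"Question: {ex['question']}")
    lines.append(f"SQL: {sql}")
    lines.append("Questions:")
    return "\n".join(lines)


# Toy annotated pool; embeddings are made-up 2-d vectors
pool = [
    {"question": "List all customers.",
     "query": "SELECT * FROM customers", "embedding": [1.0, 0.0]},
    {"question": "Count the orders.",
     "query": "SELECT COUNT(*) FROM orders", "embedding": [0.0, 1.0]},
]
target_sql = "SELECT customer_name FROM sales ORDER BY revenue DESC LIMIT 10"
shots = retrieve_examples([0.9, 0.1], pool, k=1)
prompt = build_prompt(target_sql, shots)
```

The resulting `prompt` would then be sent to the selected language model, and the returned candidates passed to annotators for ranking, editing, or discarding.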

📁 Project Structure

.
├── demo/                # Screenshots and videos for README
├── website/             # Preprocessing, clustering, and evaluation scripts
│   ├── data/            # Sample SQL logs and generated benchmark data
│   ├── templates/       # HTML templates for rendering the website
│   ├── app.py           # Main entry point for the UI
│   └── config.py        # Prompts and LLM interaction
├── requirements.txt     # Python dependencies
└── README.md            # This file

📊 Example Output

{
  "question": "Show the top 10 customers by revenue.",
  "query": "SELECT customer_name FROM sales ORDER BY revenue DESC LIMIT 10"
}
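
As a rough illustration of consuming records in this shape, the sketch below checks that each record has the two keys shown above and serializes the valid ones as JSON Lines. The helper names (`validate_record`, `to_jsonl`) and the choice of JSON Lines are assumptions made for this example; the actual export format may differ.

```python
import json

# Keys taken from the example record above
REQUIRED_KEYS = ("question", "query")


def validate_record(record):
    """Return True if both keys hold non-empty strings."""
    return all(
        isinstance(record.get(key), str) and record[key].strip()
        for key in REQUIRED_KEYS
    )


def to_jsonl(records):
    """Serialize valid records, one JSON object per line (hypothetical
    export layout); invalid records are skipped."""
    return "\n".join(
        json.dumps(rec, ensure_ascii=False)
        for rec in records
        if validate_record(rec)
    )


records = [
    {
        "question": "Show the top 10 customers by revenue.",
        "query": "SELECT customer_name FROM sales ORDER BY revenue DESC LIMIT 10",
    },
    {"question": "", "query": "SELECT 1"},  # skipped: empty question
]
exported = to_jsonl(records)
```

Each line of `exported` can be parsed back with `json.loads`, which keeps the file easy to stream into training or evaluation scripts.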

📜 Paper

BENCHPRESS: An Annotation System for Rapid Text-to-SQL Benchmark Curation
Fabian Wenz*, Peter Baile Chen, Moe Kayali, Michael Stonebraker, Cagatay Demiralp
Submitted to CIDR 2026
📄 coming soon on

🙌 Acknowledgements

This project was developed during Fabian Wenz’s time at MIT CSAIL with the support of:

Prof. Michael Stonebraker
Dr. Cagatay Demiralp
Peter Baile Chen
Dr. Nesime Tatbul

🛠️ Contributing

We welcome contributions from the community!

If you encounter bugs, want to request features, or contribute code, please:

Submit an issue
Fork the repo and open a pull request

📄 License

This project is licensed under the MIT License. See LICENSE for more details.
