Prompt2Data is a conversational AI tool that turns natural language prompts into structured data specifications. It helps automate dataset generation by capturing goals, schema, and metadata — starting with an intelligent prompt interpreter powered by Claude 4.
Devpost Link - https://devpost.com/software/prompt2data-autonomous-data-pipeline?ref_content=my-projects-tab&ref_feature=my_projects
Think of it as ChatGPT for dataset design and pipeline automation.
- 🧠 Prompt-to-JSON task parsing using Claude 4 (Anthropic)
- 💬 Continuous chat context using Streamlit
- 📦 Outputs a fully structured spec: goal, data fields, location, time, format
- 🔁 Remembers past instructions, updates specs incrementally
- ✅ Plug-in ready for multi-agent expansion (dataset search, cleaning, etc.)
- Python 3.9+
- Streamlit (for UI)
- Anthropic Claude API (for intent parsing)
- Dotenv (for API key handling)
- Python 3.9+
- A GitHub account (for cloning)
- A free Anthropic Claude API Key
git clone https://github.com/your-username/Prompt2Data.git
cd Prompt2Data
python -m venv venv
venv\Scripts\activate
pip install streamlit anthropic python-dotenv
In your root folder, add a new file called .env:
CLAUDE_API_KEY=sk-ant-xxxxxxxxxxxxxxxxxxxx
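As a rough sketch of how the key in `.env` could be read at startup (the helper name `get_claude_api_key` is hypothetical and not taken from `intent_agent.py`), python-dotenv's `load_dotenv()` copies the entries of `.env` into the process environment, after which the key can be fetched by name:

```python
import os

try:
    # Provided by the python-dotenv package installed above.
    from dotenv import load_dotenv
except ImportError:
    load_dotenv = None  # fall back to variables already set in the shell


def get_claude_api_key() -> str:
    """Return CLAUDE_API_KEY, loading .env first when python-dotenv is available.

    Hypothetical helper for illustration; the project's own loading code may differ.
    """
    if load_dotenv is not None:
        # By default, variables already present in the environment are not overridden.
        load_dotenv()
    key = os.environ.get("CLAUDE_API_KEY", "").strip()
    if not key:
        raise RuntimeError("CLAUDE_API_KEY is not set; add it to your .env file")
    return key
```

Failing early with a clear message tends to be friendlier than letting the first API call fail with an authentication error.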
Prompt2Data/
├── venv/
├── .env
├── intent_ui.py
├── intent_agent.py
└── requirements.txt
streamlit run intent_ui.py
It will open a browser window at:
http://localhost:8501
I want data on housing prices and crime rates in California
Make it from 2010 to 2020
Output should be a CSV
Add air quality if available
Each message updates your task spec!
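To make the incremental update concrete, here is a minimal sketch of how the four messages above might fold into one spec. It assumes the parser returns a partial dictionary per message; the key names (`goal`, `data_fields`, `location`, `time`, `format`), the `merge_spec` helper, and the hand-written partial updates are all illustrative assumptions, not the actual output of the Claude-based parser.

```python
from copy import deepcopy

# Assumed spec shape, based on the fields listed in the feature overview.
EMPTY_SPEC = {"goal": None, "data_fields": [], "location": None, "time": None, "format": None}


def merge_spec(current: dict, update: dict) -> dict:
    """Return a new spec with `update` applied on top of `current`.

    Scalar fields are overwritten; data fields are appended without duplicates.
    """
    merged = deepcopy(current)
    for key, value in update.items():
        if value is None:
            continue  # the message said nothing about this field
        if key == "data_fields":
            fields = merged.setdefault("data_fields", [])
            for field in value:
                if field not in fields:
                    fields.append(field)
        else:
            merged[key] = value
    return merged


# Hand-written stand-ins for what the parser might extract from each message.
updates = [
    {"goal": "housing prices and crime rates",
     "data_fields": ["housing_prices", "crime_rates"],
     "location": "California"},
    {"time": "2010-2020"},
    {"format": "CSV"},
    {"data_fields": ["air_quality"]},
]

spec = deepcopy(EMPTY_SPEC)
for update in updates:
    spec = merge_spec(spec, update)

print(spec)
```

Returning a fresh dictionary on each merge keeps earlier versions of the spec intact, which can be handy if the chat UI needs to show how the spec evolved.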
PRs and ideas welcome — DM us or submit issues if you’d like to extend agents or plug in other APIs!