Description
Project Details
Text2SQL is an application that lets users query their data in natural language. It currently supports only SQL-based querying, but the implementation is not limited to SQL. Text2SQL provides APIs that generate the appropriate query (SQL or otherwise) and return the requested data.
Features to be implemented
Token Optimization
Reduce token usage in calls to OpenAI models.
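One possible way to shrink prompts is to send only the parts of the schema that look relevant to the question. The sketch below is illustrative only: the `SCHEMA` contents, the function names, and the "about 4 characters per token" estimate are all assumptions, not part of Text2SQL, and a real tokenizer should replace the heuristic when measuring actual usage.

```python
import re

# Hypothetical schema (table name -> DDL); not taken from the project.
SCHEMA = {
    "students": "CREATE TABLE students (id INTEGER, name TEXT, grade INTEGER, school_id INTEGER)",
    "schools": "CREATE TABLE schools (id INTEGER, name TEXT, district TEXT)",
    "payroll": "CREATE TABLE payroll (employee_id INTEGER, month TEXT, amount REAL)",
}

# SQL keywords and type names that should not count as a match.
STOPWORDS = {"create", "table", "integer", "text", "real"}


def estimate_tokens(text: str) -> int:
    """Rough heuristic (about 4 characters per token); not a real tokenizer."""
    return max(1, len(text) // 4)


def words(text: str) -> set:
    """Lowercase alphabetic words in the text, minus SQL keywords."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def prune_schema(question: str, schema: dict) -> dict:
    """Keep tables whose name or column names share a word with the question."""
    question_words = words(question)
    kept = {name: ddl for name, ddl in schema.items() if question_words & words(ddl)}
    # Fall back to the full schema rather than sending an empty one.
    return kept or schema


def build_prompt(question: str, schema: dict) -> str:
    """Assemble a minimal prompt from the schema DDL and the question."""
    return "\n".join(schema.values()) + "\nQuestion: " + question


question = "How many students are in each grade?"
full = estimate_tokens(build_prompt(question, SCHEMA))
pruned = estimate_tokens(build_prompt(question, prune_schema(question, SCHEMA)))
print(f"estimated tokens: full={full}, pruned={pruned}")
```

Keyword overlap is a deliberately simple relevance test; it misses synonyms and unintuitive names, so it would likely need to be combined with the domain mapping work described below.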
Alternate Models Evaluation
Models to be evaluated
- WikiSQL has some models that can be used
- Spider and SParC have some more of these
- RAT is a brilliant implementation of this
- Implement DSP #36
- Evaluate SQL-PALM #37
Domain Mapping to Schema
- Handle cases where database or table names are not intuitive
- Handle cases where the data in a dataset must be inspected to work out viable filters
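Both cases above could be approached by enriching the schema context given to the model: a glossary translates cryptic identifiers into domain terms, and a few distinct values per column hint at valid filters. The following is a minimal sketch using SQLite; the table `tbl_stu_01`, the `GLOSSARY` entries, and all function names are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical glossary mapping cryptic identifiers to domain descriptions.
GLOSSARY = {
    "tbl_stu_01": "students enrolled in a school",
    "tbl_stu_01.grd": "grade level of the student",
    "tbl_stu_01.gndr": "gender of the student",
}


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with an unintuitively named table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl_stu_01 (id INTEGER, gndr TEXT, grd INTEGER)")
    conn.executemany(
        "INSERT INTO tbl_stu_01 VALUES (?, ?, ?)",
        [(1, "F", 9), (2, "M", 10), (3, "F", 9)],
    )
    return conn


def distinct_values(conn, table, column, limit=5):
    """Sample distinct values so the model can see plausible filter values.

    Identifiers cannot be bound as parameters, so only pass table and
    column names that come from the schema itself, never from user input.
    """
    sql = f'SELECT DISTINCT "{column}" FROM "{table}" ORDER BY 1 LIMIT ?'
    return [row[0] for row in conn.execute(sql, (limit,))]


def describe_table(conn, table):
    """Build a text description of a table for inclusion in a prompt."""
    cursor = conn.execute(f'SELECT * FROM "{table}" LIMIT 0')
    columns = [col[0] for col in cursor.description]
    lines = [f"{table}: {GLOSSARY.get(table, 'no description')}"]
    for column in columns:
        meaning = GLOSSARY.get(f"{table}.{column}", "no description")
        samples = distinct_values(conn, table, column)
        lines.append(f"  {column}: {meaning}; sample values: {samples}")
    return "\n".join(lines)


print(describe_table(build_demo_db(), "tbl_stu_01"))
```

Sampling values exposes real data to the model, so columns holding personal or sensitive information would need to be excluded in practice.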
Test Cases/Benchmarking
Add public test cases to test out the current model.
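One common way to score such test cases is execution accuracy: run the generated query and a reference query against the same database and compare the results. The sketch below assumes SQLite and a hypothetical list of `(predicted_sql, gold_sql)` pairs; it is not the project's existing test harness.

```python
import sqlite3
from collections import Counter


def run(conn, sql):
    """Return the result rows as a multiset, or None if the query fails."""
    try:
        return Counter(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None


def execution_accuracy(conn, cases):
    """Fraction of cases whose predicted query returns the same rows as gold.

    Rows are compared without regard to order, which is a simplification:
    queries with ORDER BY would need an ordered comparison.
    """
    if not cases:
        return 0.0
    hits = 0
    for predicted_sql, gold_sql in cases:
        expected = run(conn, gold_sql)
        actual = run(conn, predicted_sql)
        if expected is not None and actual == expected:
            hits += 1
    return hits / len(cases)


# Hypothetical data and test cases, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER, name TEXT, grade INTEGER)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [(1, "Asha", 9), (2, "Ravi", 10), (3, "Meena", 9)],
)
cases = [
    ("SELECT COUNT(id) FROM students", "SELECT COUNT(*) FROM students"),
    ("SELECT name FROM studnts", "SELECT name FROM students"),  # invalid table
]
print(execution_accuracy(conn, cases))
```

Comparing results rather than SQL text gives credit to differently written but equivalent queries, at the cost of occasionally accepting a wrong query that happens to return the same rows on the test data.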
Learning Path
Complexity
Complex
Skills Required
Python, Knowledge of HuggingFace Transformers, NLP, SQL, Databases.
Name of Mentors:
Project size
8 Weeks
Product Set Up
See the setup here
Acceptance Criteria
- Evaluation Matrix of Model vs Use Case
- Solve for a single Education domain and test it on a new schema
- Run test cases and update benchmarks
- Share a token usage chart showing benchmark improvements achieved with smaller prompts
C4GT
This issue is nominated for Code for GovTech (C4GT) 2023 edition.
C4GT is India's first annual coding program to create a community that can build and contribute to global Digital Public Goods. If you want to use Open Source GovTech to create impact, then this is the opportunity for you! More about C4GT here: https://codeforgovtech.in/