[C4GT] Performance, Cost Optimization, Benchmarking #28

@ChakshuGautam

Project Details

Text2SQL is an application that lets users query their data in natural language. It currently supports only SQL-based querying, but the implementation is not limited to SQL. Text2SQL provides APIs that generate the appropriate query (SQL or otherwise) and return the requested data.

Features to be implemented

Token Optimization

Reduce the number of tokens used per OpenAI request.
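
One possible way to shrink prompts is to send only the parts of the schema that a question appears to need. The sketch below is an illustration under stated assumptions, not the project's implementation: `prune_schema`, `render_schema`, and the example schema are hypothetical, and `approx_tokens` is only a rough word-and-punctuation count, not the tokenizer OpenAI uses.

```python
import re


def approx_tokens(text):
    """Rough proxy for token count (words plus punctuation marks).

    Assumption: this is NOT OpenAI's tokenizer, only a relative measure.
    """
    return len(re.findall(r"\w+|[^\w\s]", text))


def prune_schema(schema, question):
    """Keep only tables whose name or a column name appears in the question."""
    words = set(re.findall(r"\w+", question.lower()))
    kept = {}
    for table, columns in schema.items():
        names = {table.lower(), *(c.lower() for c in columns)}
        if names & words:
            kept[table] = columns
    # Fall back to the full schema if nothing matched.
    return kept or schema


def render_schema(schema):
    """Render the schema compactly, one table per line."""
    return "\n".join(f"{t}({', '.join(cols)})" for t, cols in schema.items())


# Hypothetical education-domain schema, for illustration only.
schema = {
    "students": ["id", "name", "grade", "school_id"],
    "schools": ["id", "name", "district"],
    "attendance": ["student_id", "date", "present"],
}
question = "How many students are in grade 5?"

full_prompt = render_schema(schema)
pruned_prompt = render_schema(prune_schema(schema, question))
print(approx_tokens(full_prompt), approx_tokens(pruned_prompt))
```

Exact keyword matching is deliberately naive here; it misses synonyms and plural forms, so a real pruning step would likely need something more robust.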

Alternate Models Evaluation

Models to be evaluated

Domain Mapping to Schema

  • Handle cases where database or table names are not intuitive
  • Handle cases where the data in a dataset must be inspected to determine viable filters
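
Both cases above could be approached by enriching the prompt with extra context. The following is a minimal sketch, assuming a hand-written glossary that maps cryptic names to domain terms and a sampling step that shows the model which literal values exist in a column. The `GLOSSARY` contents, the table `tbl_stu_01`, and the helper names are all hypothetical.

```python
import sqlite3

# Hypothetical glossary: cryptic table/column names mapped to domain terms.
GLOSSARY = {
    "tbl_stu_01": "students",
    "tbl_stu_01.grd": "grade level",
    "tbl_stu_01.dst_cd": "district",
}


def distinct_values(conn, table, column, limit=5):
    """Sample a few distinct values so viable filter literals are visible."""
    # Identifiers come from the trusted glossary, never from user input.
    rows = conn.execute(
        f'SELECT DISTINCT "{column}" FROM "{table}" ORDER BY 1 LIMIT ?',
        (limit,),
    ).fetchall()
    return [row[0] for row in rows]


def describe_schema(conn, glossary):
    """Build prompt context: what each name means, plus example values."""
    lines = []
    for key, meaning in glossary.items():
        if "." not in key:
            lines.append(f"Table {key} means: {meaning}")
            continue
        table, column = key.split(".")
        values = distinct_values(conn, table, column)
        lines.append(f"Column {key} means: {meaning}; example values: {values}")
    return "\n".join(lines)


# In-memory demo database standing in for a real dataset.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_stu_01 (grd INTEGER, dst_cd TEXT)")
conn.executemany(
    "INSERT INTO tbl_stu_01 VALUES (?, ?)",
    [(5, "PUNE"), (6, "PUNE"), (5, "AGRA")],
)
context = describe_schema(conn, GLOSSARY)
print(context)
```

Sampling values costs extra tokens per request, so the limit per column is a trade-off against the token optimization goal.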

Test Cases/Benchmarking

Add public test cases to test out the current model.
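
One common way to score such test cases is execution accuracy: run the generated query and a reference query against the same database and compare the results. The sketch below assumes that approach; the test-case format, the `generate_sql` callback, and the stub standing in for the model are hypothetical and are not the Text2SQL API.

```python
import sqlite3
from collections import Counter


def run(conn, sql):
    """Return rows as a multiset so row order is ignored; None on SQL error."""
    try:
        return Counter(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None


def execution_accuracy(conn, cases, generate_sql):
    """Fraction of cases where generated SQL returns the same rows as gold SQL."""
    correct = 0
    for case in cases:
        expected = run(conn, case["gold_sql"])
        actual = run(conn, generate_sql(case["question"]))
        if actual is not None and actual == expected:
            correct += 1
    return correct / len(cases)


# In-memory demo database and two hypothetical test cases.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, grade INTEGER)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?)",
    [("Asha", 5), ("Ravi", 5), ("Meena", 6)],
)
cases = [
    {"question": "How many students are in grade 5?",
     "gold_sql": "SELECT COUNT(*) FROM students WHERE grade = 5"},
    {"question": "List all student names",
     "gold_sql": "SELECT name FROM students"},
]

# Stub standing in for the model: one correct answer, one broken query.
canned = {
    "How many students are in grade 5?": "SELECT COUNT(*) FROM students WHERE grade=5",
    "List all student names": "SELECT nam FROM students",
}
accuracy = execution_accuracy(conn, cases, canned.get)
print(accuracy)
```

Comparing result sets rather than SQL strings means two differently written but equivalent queries both count as correct.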

Learning Path

Complexity

Complex

Skills Required

Python, Knowledge of HuggingFace Transformers, NLP, SQL, Databases.

Name of Mentors:

@ChakshuGautam

Project size

8 Weeks

Product Set Up

See the setup here

Acceptance Criteria

  • Evaluation Matrix of Model vs Use Case
  • Solve for a single Education domain and test it on a new schema
  • Run test cases and update benchmarks
  • Token usage chart to be shared showing improvements on benchmarks with smaller prompts
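
The evaluation matrix in the first criterion could be assembled from per-test results along these lines. This is a sketch only: the model names, use-case labels, and the `(model, use_case, passed)` record shape are placeholders, not results or models named in this issue.

```python
from collections import defaultdict


def evaluation_matrix(results):
    """Aggregate (model, use_case, passed) records into per-cell accuracy.

    Returns {model: {use_case: fraction_passed}}.
    """
    totals = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for model, use_case, passed in results:
        cell = totals[model][use_case]
        cell[0] += int(passed)  # passed count
        cell[1] += 1            # total count
    return {
        model: {use_case: p / n for use_case, (p, n) in cases.items()}
        for model, cases in totals.items()
    }


# Placeholder records for illustration; not real benchmark data.
results = [
    ("model-a", "aggregation", True),
    ("model-a", "aggregation", False),
    ("model-a", "joins", True),
    ("model-b", "aggregation", True),
]
matrix = evaluation_matrix(results)
for model, row in matrix.items():
    print(model, row)
```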

C4GT

This issue is nominated for Code for GovTech (C4GT) 2023 edition.
C4GT is India's first annual coding program to create a community that can build and contribute to global Digital Public Goods. If you want to use Open Source GovTech to create impact, then this is the opportunity for you! More about C4GT here: https://codeforgovtech.in/
