Inspiration
Numerous LLM services exist with varying levels of complexity. User prompts similarly vary in complexity. What if we could match the two? Pairing easier prompts with lightweight models saves users on API costs and helps LLM companies with energy costs and load balancing.
What it does
Our project is a unified chatbot that interfaces with various LLM providers on the backend. Users provide their API keys and type their queries into our chatbot. Based on user feedback, different models rotate in to respond to their queries. We maintain conversational history across LLMs.
How we built it
We used Python to make LLM API calls, handle the logic for LLM rotation, and maintain memory within a single conversation. We used JavaScript, React, Flask, and Firebase for our frontend and backend.
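A rough sketch of the rotation and shared-memory idea in Python (the class names, the thumbs-up/thumbs-down escalation policy, and the message format are illustrative assumptions, not our exact implementation; real responses would come from each provider's SDK):

```python
from dataclasses import dataclass, field

@dataclass
class Conversation:
    """Shared message history that persists across model switches."""
    messages: list = field(default_factory=list)

    def add(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})

class Rotator:
    """Picks the current model and rotates based on user feedback."""
    def __init__(self, models: list[str]):
        self.models = models  # ordered lightest -> heaviest (assumed ordering)
        self.index = 0

    def current(self) -> str:
        return self.models[self.index]

    def feedback(self, satisfied: bool) -> None:
        # Illustrative policy: a thumbs-down escalates to a heavier model,
        # a thumbs-up drops back toward a lighter (cheaper) one.
        if not satisfied:
            self.index = min(self.index + 1, len(self.models) - 1)
        else:
            self.index = max(self.index - 1, 0)
```

Because every model reads from and writes to the same `Conversation`, whichever model rotates in next sees the full history so far.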
Challenges we ran into
Maintaining conversational history across different LLMs proved to be a challenge. We experimented with RAG techniques and summarization tools. With RAG, we found that newly rotated-in models struggled to respond to queries outside the scope of the conversation conducted by prior LLMs.
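The summarization approach can be sketched as follows: before handing the conversation to a new model, compress older turns into a single summary message so the new model gets the gist without the full transcript. This is a minimal illustration with assumed names; `summarize` would normally be an LLM call, and here it defaults to a naive join just to keep the sketch self-contained:

```python
def compress_history(messages: list, keep_last: int = 4, summarize=None) -> list:
    """Replace turns older than the last `keep_last` with one summary message.

    `summarize` is a callable taking the older messages and returning a
    summary string (in practice, an LLM call); the default naive join is
    a placeholder for illustration only.
    """
    if len(messages) <= keep_last:
        return list(messages)
    older, recent = messages[:-keep_last], messages[-keep_last:]
    summary = summarize(older) if summarize else " ".join(
        m["content"] for m in older)
    # Prepend the summary as context, then keep the recent turns verbatim.
    return [{"role": "system",
             "content": f"Summary of earlier conversation: {summary}"}] + recent
```

The trade-off we observed is that a summary (or retrieved RAG snippets) only preserves what it captures: questions that reach outside that captured scope are exactly where the incoming model struggled.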
Accomplishments that we're proud of
Building the end-to-end product and prototyping the initial logic for LLM rotation and conversational memory.
What's next for Dynamic LLM
Improving the LLM interfacing logic, supporting more LLM providers, building a cleaner UI, and adding new chat sessions.