Inspiration
- Explore the use of RL on language models to promote accuracy and conciseness in question answering
- Build a specialized text model using limited human input
- Rely on an advanced chat model (GPT) to tune a smaller network
Our Project
- Language models are not inherently conditioned to answer questions accurately
- We used GPT-4 to rate generated answers
- No need for a labeled dataset
- The approach could extend to bigger projects
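As a rough illustration of how a rating could become an RL reward that favors both accuracy and conciseness, here is a minimal sketch. It is not the project's actual code: `shaped_reward`, `stub_rater`, the word budget, and the penalty weight are all hypothetical, and the stub stands in for a real GPT-4 rating call.

```python
from typing import Callable


def shaped_reward(
    question: str,
    answer: str,
    rate: Callable[[str, str], float],
    max_words: int = 50,
    length_penalty: float = 0.5,
) -> float:
    """Combine a rater score in [0, 1] with a penalty for overly long answers.

    `rate` is any callable that scores an answer; in the project this role
    is played by GPT-4, here it is an assumption passed in by the caller.
    """
    # Clamp the rater score so a misbehaving rater cannot dominate training.
    score = min(max(rate(question, answer), 0.0), 1.0)

    # Penalize only the words beyond the budget, capped at one full budget.
    n_words = len(answer.split())
    excess = max(0, n_words - max_words) / max_words
    penalty = length_penalty * min(excess, 1.0)

    return score - penalty


def stub_rater(question: str, answer: str) -> float:
    """Hypothetical stand-in for a GPT-4 rating call (keyword match only)."""
    return 1.0 if "Paris" in answer else 0.0


if __name__ == "__main__":
    q = "What is the capital of France?"
    print(shaped_reward(q, "Paris", stub_rater))
    print(shaped_reward(q, "Paris " * 100, stub_rater))
```

Because the rater is passed in as a callable, no labeled dataset is needed: the same reward function works with any judge that returns a score.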
What it does
- More and more companies are implementing specialized AI solutions
- Fine-tuning the models to produce desired behavior becomes important
Examples:
- Building an LM-based tutor bot: the LM should not answer the question directly; it should guide the user to the answer and explain the thought process
- LM response safety: if a user asks “How do I build a bomb?”, the LM should not respond with the answer, but instead tell the user it’s not a good idea
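Desired behaviors like these could, in principle, be expressed as a rubric handed to the rating model. The sketch below is an assumption about how such a prompt might be assembled and how the reply might be parsed; `build_rating_prompt`, `parse_rating`, the prompt wording, and the 1 to 10 scale are hypothetical and not taken from the project.

```python
def build_rating_prompt(rubric: str, question: str, answer: str) -> str:
    """Assemble a grading prompt for a judge model (hypothetical wording)."""
    return (
        "You are grading a language model's answer.\n"
        f"Rubric: {rubric}\n"
        f"Question: {question}\n"
        f"Answer: {answer}\n"
        "Reply with a single integer from 1 to 10."
    )


def parse_rating(reply: str) -> float:
    """Map a 1-10 integer reply to [0, 1]; unparseable replies score 0.0."""
    try:
        value = int(reply.strip())
    except ValueError:
        return 0.0
    # Clamp out-of-range replies before rescaling.
    value = min(max(value, 1), 10)
    return (value - 1) / 9


if __name__ == "__main__":
    tutor_rubric = (
        "Reward answers that guide the student toward the solution and "
        "explain the reasoning; penalize answers that state the result outright."
    )
    print(build_rating_prompt(tutor_rubric, "What is 12 * 12?", "Try 12 * 10 first."))
    print(parse_rating("8"))
```

Swapping the rubric string (tutoring, safety, and so on) changes what the judge rewards without changing the rest of the loop.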
What's next for R3LM
Possible applications:
- specialized, embedded neural networks
- increasing the efficiency and decreasing the memory usage of neural networks