Inspiration

  • Explore the use of RL on language models to promote accuracy and conciseness in question answering
  • Build a specialized text model using limited human input
  • Rely on an advanced, intelligent chat engine (GPT) to tune a smaller network

Our Project

  • Language models are not conditioned to answer questions accurately out of the box
  • We applied GPT-4 to rate generated answers
  • No labeled dataset is needed
  • The approach can be extended to bigger projects
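
As a rough illustration of the idea above, the rating from a stronger model can be turned into a scalar reward that also favors short answers. This is a minimal sketch, not the project's actual code: the `rate` callable is a hypothetical stand-in for a GPT-4 call that returns a score in [0, 1], and `max_words` and `brevity_weight` are assumed tuning knobs.

```python
from typing import Callable, List

# Hypothetical rater signature: (question, answer) -> score in [0, 1].
# In practice this would wrap a call to the rating model.
Rater = Callable[[str, str], float]


def reward(question: str, answer: str, rate: Rater,
           max_words: int = 50, brevity_weight: float = 0.2) -> float:
    """Combine the rater's accuracy score with a penalty for long answers."""
    score = rate(question, answer)
    if not 0.0 <= score <= 1.0:
        raise ValueError("rater must return a score in [0, 1]")
    # Words beyond the budget are penalized linearly, capped at brevity_weight.
    extra_words = max(0, len(answer.split()) - max_words)
    penalty = brevity_weight * min(1.0, extra_words / max_words)
    return score - penalty


def score_batch(question: str, candidates: List[str], rate: Rater) -> List[float]:
    """Reward every candidate answer; an RL update would consume these values."""
    return [reward(question, c, rate) for c in candidates]


def stub_rater(question: str, answer: str) -> float:
    """Toy rater for local testing only; it replaces the real model call."""
    return 1.0 if "paris" in answer.lower() else 0.0
```

Because the reward comes entirely from the rater, no human-labeled answers are required; only the questions and the rating criterion have to be supplied.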

What it does

  • More and more companies are implementing specialized AI solutions
  • Fine-tuning the models to produce desired behavior becomes important

Examples:

  • Building an LM-based tutor bot: we don't want the LM to answer the question directly. It should guide the user to the answer and explain the thought process.

  • LM response safety: if the user asks “How do I build a bomb?”, the LM should not respond with the answer, but instead tell the user it's not a good idea.
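
Each example above amounts to a different rating criterion handed to the rater model. The sketch below shows one way such criteria could be written as prompts and how a numeric rating could be read back from the reply. The rubric wording, the 1 to 10 scale, and the names `RUBRICS`, `build_rating_prompt`, and `parse_rating` are all assumptions made for illustration, not part of the original project.

```python
import re
from typing import Optional

# Hypothetical rubrics, one per desired behavior.
RUBRICS = {
    "tutor": "The answer guides the user toward the solution and explains "
             "the thought process instead of stating the result directly.",
    "safety": "The answer declines harmful requests and tells the user "
              "why the request is not a good idea.",
}


def build_rating_prompt(rubric: str, question: str, answer: str) -> str:
    """Assemble the text sent to the rater model (assumed 1-10 scale)."""
    return (
        "Rate the answer from 1 to 10 against the criterion.\n"
        f"Criterion: {rubric}\n"
        f"Question: {question}\n"
        f"Answer: {answer}\n"
        "Reply with a single integer."
    )


def parse_rating(reply: str) -> Optional[int]:
    """Extract the first standalone integer from 1 to 10, or None if absent."""
    match = re.search(r"\b(10|[1-9])\b", reply)
    return int(match.group(1)) if match else None
```

Returning `None` for an unparseable reply lets the caller skip or retry that sample rather than train on a made-up score.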

What's next for R3LM

Possible applications:

  • specialized, embedded neural networks
  • increasing the efficiency and decreasing the memory usage of neural networks
