Inspiration

All of us have family members who struggle to speak English, especially about complex subjects like healthcare. We want to make seeing a doctor as seamless as possible for non-English speakers in America.

What it does

IVY Agents is an agentic workflow that prompts users to describe themselves and their symptoms by voice. The agent asks follow-up questions to better understand the user's symptoms and concerns. Once it is satisfied with the responses, it outputs a prefilled intake document that the user can give to the doctor's office.
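The follow-up behavior can be pictured as a loop that keeps asking until nothing important is missing. The sketch below is only an illustration: in the real agent a model decides when it is satisfied, whereas here a fixed checklist stands in for that judgment. The field names, `next_missing_field`, and `run_interview` are all hypothetical.

```python
from typing import Callable, Dict, Optional

# Hypothetical intake fields; the real form's fields are not listed in this write-up.
REQUIRED_FIELDS = ("name", "symptoms", "duration")


def next_missing_field(answers: Dict[str, str]) -> Optional[str]:
    """Return the first required field with no usable answer, or None if all are filled."""
    for field in REQUIRED_FIELDS:
        if not answers.get(field, "").strip():
            return field
    return None


def run_interview(ask: Callable[[str], str], max_turns: int = 10) -> Dict[str, str]:
    """Ask follow-up questions until every field is filled or max_turns is reached.

    `ask` is any callable that poses a question and returns the user's
    (already transcribed) reply.
    """
    answers: Dict[str, str] = {}
    for _ in range(max_turns):
        field = next_missing_field(answers)
        if field is None:
            break  # the checklist is complete, so stop asking
        answers[field] = ask(f"Please tell me about your {field}.")
    return answers
```

An empty reply leaves the field unfilled, so the same question is asked again on the next turn, which roughly mimics a follow-up.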

How we built it

We developed a website that takes a user's voice input as an MP3. The audio is transcribed with the Modulate model, translated by Gemini fast into an optimal medical prompt, and finally sent to Gemini pro to generate the doctor intake form. We use Airia to evaluate each step of the agent for diagnostics.
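The stages chain together roughly as in the sketch below. It is a minimal outline, not our production code: each stage is passed in as a plain callable, so no Modulate, Gemini, or Airia API calls are shown, and the names `PipelineResult` and `run_intake_pipeline` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PipelineResult:
    transcript: str       # text produced by the transcription stage
    medical_prompt: str   # translated, medically focused prompt
    intake_form: str      # generated doctor intake form


def run_intake_pipeline(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    to_medical_prompt: Callable[[str], str],
    generate_form: Callable[[str], str],
) -> PipelineResult:
    """Run the three stages in order and keep every intermediate output.

    Keeping the intermediates makes it possible to evaluate each step
    separately instead of only scoring the final form.
    """
    transcript = transcribe(audio)
    medical_prompt = to_medical_prompt(transcript)
    intake_form = generate_form(medical_prompt)
    return PipelineResult(transcript, medical_prompt, intake_form)
```

Injecting the stages as callables also lets the chain be exercised with stand-in functions before any model is connected.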

Challenges we ran into

Our diagnostics gave low scores for many of our test prompts, so we built a self-improving system using Airia and Gemini to optimize for those scores.
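One simple way to picture a self-improving loop like this is score, revise, and keep the better version. The sketch below assumes a numeric score where higher is better; `score` stands in for the diagnostic evaluation and `revise` for the model that rewrites the prompt. Both are hypothetical callables, as are the `target` and `max_rounds` defaults, and our actual system may differ in its details.

```python
from typing import Callable, Tuple


def improve_prompt(
    prompt: str,
    score: Callable[[str], float],
    revise: Callable[[str, float], str],
    target: float = 0.8,
    max_rounds: int = 5,
) -> Tuple[str, float]:
    """Revise a prompt until its score reaches `target` or rounds run out.

    A revision is kept only when it scores strictly higher than the best
    prompt so far, so the returned score never drops below the initial one.
    """
    best_prompt, best_score = prompt, score(prompt)
    for _ in range(max_rounds):
        if best_score >= target:
            break  # good enough, stop spending evaluations
        candidate = revise(best_prompt, best_score)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best_prompt, best_score = candidate, candidate_score
    return best_prompt, best_score
```

Capping the number of rounds bounds the cost of the loop when the score stops improving.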

Accomplishments and what we learned

We learned so much about developing with the sponsor tools! Putting all of the tools we made together into one unified agent was challenging but incredibly rewarding.

What's next for IVY Agents

We plan to improve input and output in the future. We imagine the WhatsApp API will be very helpful for capturing users' voice input. We also plan to use the structured output to autofill existing intake forms rather than generating our own.

Built With

  • airia
  • claude-code
  • fastapi
  • google-gemini
  • modulate
  • python