Inspiration
All of us have family members who struggle with speaking in English, especially for complex subjects like healthcare. We want to make accessing doctors for non-English speakers in America as seamless as possible.
What it does
IVY Agents is an agentic workflow that prompts a user to describe themselves and their symptoms via voice. The agent asks follow-up questions to better understand the user's symptoms and concerns. Once the agent is satisfied with the responses, it outputs a prefilled doctor's intake document that can be given to the doctor's office.
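The follow-up loop can be pictured with a minimal sketch. This is not our actual agent: in IVY Agents a language model decides what to ask, whereas here a simple rule-based check stands in for it. The field names in `REQUIRED_FIELDS`, the helper names, and the question wording are all hypothetical.

```python
from typing import Callable, Optional

# Hypothetical fields an intake form might need; not taken from the real project.
REQUIRED_FIELDS = ("name", "age", "symptoms", "duration")


def first_missing(answers: dict) -> Optional[str]:
    """Return the first required field with no usable answer, or None if complete."""
    for field in REQUIRED_FIELDS:
        if not answers.get(field):
            return field
    return None


def run_intake(ask: Callable[[str], str], max_turns: int = 10) -> dict:
    """Keep asking follow-up questions until every field is filled or turns run out."""
    answers: dict = {}
    for _ in range(max_turns):
        field = first_missing(answers)
        if field is None:
            break  # the "agent" is satisfied with the responses
        answers[field] = ask(f"Could you tell me your {field}?").strip()
    return answers


# Scripted replies stand in for transcribed voice input; the empty reply forces a re-ask.
scripted = iter(["Ana", "", "54", "headache", "3 days"])
answers = run_intake(lambda question: next(scripted))
```

An empty reply leaves the field unfilled, so the same question is asked again on the next turn, and `max_turns` keeps the conversation from looping forever.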
How we built it
We developed a website that takes a user's voice input as an mp3. The audio is transcribed with the Modulate model, then translated by Gemini fast into an optimal medical prompt, which is finally sent to Gemini pro to generate the doctor intake form. We use Airia to evaluate each step of the agent for diagnostics.
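The shape of that pipeline can be sketched as three chained steps. This is an illustrative outline only: the three callables are placeholders for the Modulate transcription, the fast Gemini translation step, and the Gemini pro form generation, and none of the names below come from those services' real APIs.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class IntakePipeline:
    # Each callable is a hypothetical stand-in for a real service call.
    transcribe: Callable[[bytes], str]    # mp3 bytes -> transcript (Modulate step)
    build_prompt: Callable[[str], str]    # transcript -> medical prompt (fast Gemini step)
    generate_form: Callable[[str], str]   # medical prompt -> intake form (Gemini pro step)

    def run(self, mp3_bytes: bytes) -> dict:
        """Run the three stages in order and keep every intermediate result."""
        transcript = self.transcribe(mp3_bytes)
        prompt = self.build_prompt(transcript)
        form = self.generate_form(prompt)
        return {"transcript": transcript, "prompt": prompt, "form": form}


# Stub implementations so the sketch runs without any network access.
pipeline = IntakePipeline(
    transcribe=lambda audio: "me duele la cabeza",
    build_prompt=lambda text: f"Patient reports a headache. Original words: {text}",
    generate_form=lambda prompt: f"INTAKE FORM\nChief complaint: {prompt}",
)
result = pipeline.run(b"fake-mp3-bytes")
```

Returning the intermediate transcript and prompt alongside the form is a design choice that makes it easier to evaluate each step separately, which is the role Airia plays for us.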
Challenges we ran into
Our diagnostics gave low scores for many of our test prompts, so we developed a self-improving system using Airia and Gemini to optimize for our diagnostic scores.
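One simple way to picture a self-improving loop like this is greedy hill climbing: propose a rewritten prompt, score it, and keep it only if the score improves. The sketch below assumes that framing and is not our exact system. `toy_score` is a hypothetical stand-in for an Airia evaluation, and `toy_rewrite` is a hypothetical stand-in for a Gemini rewrite.

```python
from typing import Callable, Tuple

# Hypothetical things a good intake prompt should mention.
KEYWORDS = ("symptoms", "duration", "medications")


def toy_score(prompt: str) -> float:
    """Stand-in evaluator: fraction of keywords the prompt mentions."""
    return sum(k in prompt for k in KEYWORDS) / len(KEYWORDS)


def toy_rewrite(prompt: str, score: float) -> str:
    """Stand-in rewriter: add an instruction for the first missing keyword."""
    for k in KEYWORDS:
        if k not in prompt:
            return f"{prompt} Include the patient's {k}."
    return prompt


def optimize_prompt(
    prompt: str,
    score: Callable[[str], float],
    rewrite: Callable[[str, float], str],
    rounds: int = 5,
) -> Tuple[str, float]:
    """Greedy loop: accept a candidate prompt only when it scores strictly higher."""
    best_prompt, best_score = prompt, score(prompt)
    for _ in range(rounds):
        candidate = rewrite(best_prompt, best_score)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best_prompt, best_score = candidate, candidate_score
    return best_prompt, best_score


best_prompt, best_score = optimize_prompt("Summarize the visit.", toy_score, toy_rewrite)
```

Because a candidate is kept only when it beats the current best, the score never gets worse from one round to the next.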
Accomplishments and what we learned
We learned so much about developing with the sponsor tools! Putting all of the tools we made together into one unified agent was challenging but incredibly rewarding.
What's next for IVY Agents
We plan to improve input and output in the future. We imagine that the WhatsApp API will be very helpful as an input channel for users' voice messages. We also plan to use the structured output to autofill existing intake forms, rather than generating our own.
Built With
- airia
- claude-code
- fastapi
- google-gemini
- modulate
- python
