Inspiration
We have both been interested in practicing improvisation on the guitar. Typically, you would play along with a backing track on YouTube, but this can be limiting if you want to try out a specific style in a specific key. In addition, it's difficult to get feedback on your improvisation if you don't have a coach with you. We realized there is a need for a tool that lets you generate backing tracks and gives feedback on your improvisation.
What it does
ImproviGator allows the user to select from a list of templates or specify the type of backing track they want generated, including key, instruments, and tempo. ImproviGator then displays a guitar scale in the specified key as guidance on which notes to play.
The user can record themselves improvising over the backing track. An AI agent analyzes the recording for qualities such as melodic contour, how cohesive the melody is, and whether the notes were played in the correct scale, and then provides personalized feedback and suggestions for improvement. The AI chat can also show chord diagrams, which can be saved for reference while improvising, and can modify the generated backing track as needed.
How we built it
The backing track is generated by AI agents using Strudel, a programming language designed for creating live music. The AI agents are powered by Gemini and use a custom MCP tool pipeline for controlling playback, generating chord diagrams, accessing improvisation analyses, accessing Strudel documentation, and verifying that the Strudel code compiles. We used tRPC for simple client-server communication when operating the AI agents.
For note recognition, we used Basic Pitch, an open-source library developed by Spotify, which converts the recording into a MIDI file. Since LLMs are not sophisticated enough to complete a full analysis of MIDI files on their own, we first run functions that summarize features of the performance, such as the number of large vs. small jumps in the melody, its predictability, and the percentage of notes played in the correct scale. These summaries are passed to the AI agent to aid its analysis.
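The summary step can be sketched roughly as follows. This is an illustrative Python example, not the project's actual code: it assumes the MIDI file has already been reduced to a list of MIDI note numbers in playing order, and the function name `summarize_melody`, the `MelodySummary` type, the major-scale default, and the 5-semitone threshold for a "large" jump are all assumptions made for the example.

```python
from dataclasses import dataclass

# Semitone offsets of the major scale from its tonic (assumed default scale).
MAJOR_SCALE_STEPS = (0, 2, 4, 5, 7, 9, 11)


@dataclass
class MelodySummary:
    small_jumps: int         # intervals below the threshold (repeated notes included)
    large_jumps: int         # intervals at or above the threshold
    percent_in_scale: float  # share of notes whose pitch class is in the scale


def summarize_melody(pitches, tonic_pitch_class,
                     scale_steps=MAJOR_SCALE_STEPS,
                     large_jump_semitones=5):
    """Summarize a melody given as MIDI note numbers in playing order.

    tonic_pitch_class is 0 for C, 1 for C#, ... 11 for B.
    The 5-semitone threshold is an arbitrary illustrative choice.
    """
    # Pitch classes (0-11) that belong to the chosen scale.
    scale = {(tonic_pitch_class + step) % 12 for step in scale_steps}

    # Absolute distance in semitones between each pair of consecutive notes.
    intervals = [abs(b - a) for a, b in zip(pitches, pitches[1:])]
    large = sum(1 for i in intervals if i >= large_jump_semitones)
    small = len(intervals) - large

    in_scale = sum(1 for p in pitches if p % 12 in scale)
    percent = 100.0 * in_scale / len(pitches) if pitches else 0.0
    return MelodySummary(small, large, percent)
```

A compact result like this (a few counts and a percentage) is far easier to hand to a language model than raw MIDI events.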
To display the guitar fretboard diagram, we used a library called react-guitar.
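For illustration, the data behind such a diagram (which frets on each string belong to the chosen scale) can be computed as in the sketch below. It is written in Python and is independent of react-guitar; the function name `scale_frets`, the standard-tuning default, and the 12-fret range are assumptions for the example, not details of the project.

```python
# Open-string MIDI note numbers for standard tuning, low to high: E2 A2 D3 G3 B3 E4.
STANDARD_TUNING = (40, 45, 50, 55, 59, 64)
# Semitone offsets of the major scale from its tonic.
MAJOR_SCALE_STEPS = (0, 2, 4, 5, 7, 9, 11)


def scale_frets(tonic_pitch_class, scale_steps=MAJOR_SCALE_STEPS,
                tuning=STANDARD_TUNING, max_fret=12):
    """Return, for each string (low to high), the frets that fall in the scale.

    tonic_pitch_class is 0 for C, 1 for C#, ... 11 for B.
    """
    scale = {(tonic_pitch_class + step) % 12 for step in scale_steps}
    return [
        [fret for fret in range(max_fret + 1)
         if (open_pitch + fret) % 12 in scale]
        for open_pitch in tuning
    ]
```

For C major, the low E string yields frets 0, 1, 3, 5, 7, 8, 10, and 12, which are the natural notes E, F, G, A, B, C, D, and E.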
Challenges we ran into
Because there aren’t many examples of Strudel code online for LLMs such as Gemini to train on, these models perform very poorly when generating music with it. Providing the entirety of the Strudel documentation to the AI made generation very slow, so we created a condensed, AI-generated summary of the documentation to provide to the models, along with examples to follow and examples to avoid. This significantly improved the quality of the output.
Another challenge was determining the most intuitive layout for the website. The site can be in a large number of states and has many elements to show to the user, so finding an efficient layout was difficult. Ultimately, we sketched out many iterations before settling on our final design.
Accomplishments that we're proud of
Our biggest accomplishment was enabling an AI agent to perform so many tasks and teaching it to use a programming language it had little training on. We had not done LLM tool calling with MCP before this project, so developing the tools for music generation, modification, analysis, and diagram generation was a big achievement.
What's next for ImproviGator
In the future, we will provide Strudel documentation to the model using RAG, which lets the AI access the relevant documentation without the full text being included in every prompt, further improving music generation quality. In addition, we will analyze the audio for more features, such as dynamics and rhythm, to give more detailed feedback.
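The retrieval idea can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: a real RAG setup would typically rank documentation chunks by embedding similarity, whereas this sketch uses simple word overlap as a stand-in, and the function names `retrieve` and `build_prompt` are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, chunks, top_k=2):
    """Return up to top_k chunks sharing the most words with the query.

    Word overlap stands in for embedding similarity here (an assumption
    made to keep the example self-contained).
    """
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(chunk)), index, chunk)
              for index, chunk in enumerate(chunks)]
    # Highest overlap first; ties keep the original chunk order.
    ranked = sorted(scored, key=lambda item: (-item[0], item[1]))
    return [chunk for score, _, chunk in ranked[:top_k] if score > 0]


def build_prompt(query, chunks, top_k=2):
    """Build a prompt that contains only the retrieved excerpts."""
    context = "\n".join(retrieve(query, chunks, top_k))
    return f"Documentation excerpts:\n{context}\n\nRequest: {query}"
```

Only the best-matching excerpts reach the prompt, so the prompt stays short even as the documentation grows.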