Inspiration
Scrolling through your playlists… Have you felt that you can’t find just THAT right song?
We may have a solution.
Introducing…BoilerCompose
Leveraging state-of-the-art AI technology, we introduce a revolutionary new way to compose music built for the 21st century.
Goodbye, Kendrick. Hello, BoilerCompose.
Our technology builts on the comfortable chat format. Users may submit photos and text, and BoilerCompose captures the moment in a carefully designed song. BoilerCompose can even generate some sheet music!
How we built it
We developed a complex data translation and ingestion pipeline: First, our interface accepts media from the user. Using OpenAI’s multimodal GPT-4o vision models, we interpret sentiment, mood, and setting—key factors in music perception. From there, the user gets the generated prompt back from the LLM to confirm the vision! Second, this interpretation is then processed by various models. Our “sheet music” creation relies on our lean Tensorflow audio-to-MIDI converter, built on Spotify’s open source “Basic Pitch” development, and the “mingus” Python music notation library.
Our visual interface is built using React and Vite for the best in reactive applications.
All of this is packaged into a convenient web app for use, secured with Okta’s Auth0 platform.
Challenges we ran into
We had to overcome numerous missteps building complex data pipelines, converting image to text, to music, to MIDI, to PDF formats. It took tons of research and development to find the most suitable technologies for this project while also maintaining efficiency programmatically so the experience is seamless.
Accomplishments that we're proud of
We’re proud of our work wrangling diverse file formats. Complex formats such as audio, image, and text each have challenges. Combining all these only multiplied these difficulties.
Where 15th century Europe had Gutenberg, the 21st has Edgar Babajanyan. As true visionaries, our team has revolutionized the world of music production through a truly innovative application of GenAI. Phased out are the Michael Jacksons of the world, rendered useless as scribes in the wake of the printing press. The Information Age is over. Welcome to year 0 B.C. (BoilerCompose).
What’s next for BoilerCompose? This time, we faced the challenge of transcribing pure audio recordings into MIDI format files, which serve as a key step in the process of transcribing audio to human-readable sheet music. In fact, a conversion program that produces truly precise MIDI files given audio with multiple instruments has yet to be developed by anyone. We plan to tackle this challenge with full force using an approach based on Google’s MT3 transformer-based transcription model. We also see other types of file inputs such as mp3, PDFs containing inspiring sheet music, videos for longer format visual input, and even CSVs incase data scientists want their data to become music! For this to become public, we'll take the codebase we have and build a database so users can store their music, sheets, and any other files to build off of! We think this may become something very exciting.
Built With
- auth0
- cloudflare
- css
- html
- lilypond
- openai
- python
- react
- tensorflow
- vite
Log in or sign up for Devpost to join the conversation.