TL;DR (Simple Summary)
A functional keyboard GUI (four octaves, C1 to C5) complete with two ML pipelines: one that turns text into music and one that turns existing music into new music.
Inspiration
One of our group members makes music for fun, and the rest of us are music enthusiasts with diverse tastes overall. The main catalyst, though, came when we were discussing music production and how a producer might overcome a creative block when trying to finish a melody. That conversation ultimately led to our idea: a program that could create more music.
Overview
We created a front-end interface with a piano keyboard on it. The user inputs musical notes by clicking the piano keys, and can press the record button to capture what they played and replay it. After playing a few keys, the user can press the generate button, and the AI generates music inspired by the input; it even takes pitch and rhythm into account. Besides the generate button for MIDI-to-MIDI, the user can also enter text like "upbeat happy tune" and the AI will try to generate such a tune. Since piano-assisted melody generation is rarely seen, TuneTuahNote could be extremely helpful for finishing melodies that people are unable to complete on their own. This would benefit musicians and music enthusiasts everywhere.
Details
We used what's called a Markov chain to predict which note comes next in both of our ML pipelines. Essentially, the output for the current MIDI note depends only on the previous MIDI note. The model chooses each note based on probabilities, and those probabilities vary with what preceded the note and with the guidelines the model adheres to. What we believe sets our algorithm apart is its use of music theory: for both pipelines, especially MIDI-to-MIDI, the guidelines favor notes that would create a better harmony with the existing notes. We also used natural language processing to process the inputs for the text-to-MIDI pipeline, along with an LSTM neural network for MIDI generation. For example, our text-to-MIDI pipeline derives guidelines for tempo and for whether the key is major or minor. These guidelines shape the Markov model, and by extension its probabilities, which makes our program highly dynamic and adaptable to many different inputs.
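To make the Markov-chain idea concrete, here is a minimal sketch of the approach (not our actual project code; the function names and the C-major "guideline" are illustrative). It trains a first-order chain on a short user-played phrase, then samples new notes while restricting candidates to a scale, which is the simplest form of the music-theory constraint described above:

```python
import random
from collections import defaultdict

# Illustrative "guideline": only allow pitch classes from C major.
C_MAJOR = {0, 2, 4, 5, 7, 9, 11}

def train_transitions(notes):
    """Count how often each MIDI note follows another (first-order Markov)."""
    counts = defaultdict(lambda: defaultdict(int))
    for prev, nxt in zip(notes, notes[1:]):
        counts[prev][nxt] += 1
    return counts

def next_note(counts, current, scale=C_MAJOR, rng=random):
    """Sample the next note, keeping only candidates that fit the scale."""
    options = counts.get(current) or {}
    candidates = {n: c for n, c in options.items() if n % 12 in scale}
    if not candidates:
        # Dead end or no in-scale successor: fall back to any learned
        # successor, or just repeat the current note.
        candidates = dict(options) or {current: 1}
    notes, weights = zip(*candidates.items())
    return rng.choices(notes, weights=weights)[0]

def generate(counts, start, length=8, scale=C_MAJOR):
    """Grow a melody one note at a time from the transition counts."""
    melody = [start]
    for _ in range(length - 1):
        melody.append(next_note(counts, melody[-1], scale))
    return melody

# Train on a short user-played phrase (C4 E4 G4 E4 C4 E4 G4 C5).
phrase = [60, 64, 67, 64, 60, 64, 67, 72]
model = train_transitions(phrase)
print(generate(model, start=60))
```

In the real system the transition probabilities would be reweighted by the guidelines (tempo, major/minor key) rather than hard-filtered, but the structure is the same: count transitions, then sample under constraints.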
Tools and Languages used
We built this using TensorFlow, PyTorch, Python, CSS, HTML, JavaScript, and a locally run DeepSeek R1 model. We also used Git (and GitHub) heavily for project collaboration.
Challenge
We ran into a lot of problems with dependencies. Sometimes code would not run because the proper dependencies were not installed or the proper libraries were not imported. Some of these dependencies, like FluidSynth and Magenta, were outdated and we had to backtrack; this is actually what led us to the Markov model. The AI also sometimes had trouble generating a new melody, producing the same melody for different prompts like a "happy and upbeat tune" and a "dark and foreboding melody."
Accomplishments that we're proud of
We were able to create and manage our own local server, build a nice frontend website, and implement recurrent neural networks and Markov chains for audio generation.
Learning Outcomes
We learned more about neural networks, and we learned about Markov chains too. Those of us who worked on the frontend learned a lot about CSS, HTML, and JavaScript, and our web development skills certainly grew. Those of us who worked on the backend learned how to set up our own local server and how to use neural networks to analyze audio inputs and help create new audio.
What's next for TuneTuahNote
We want to let users download the new melodies that either they or the AI created with the press of a button.
Built With
- css
- html
- javascript
- python
- pytorch
- tensorflow