Inspiration

Ashar and Wenqi were sitting in a restaurant after an argument. While trying to relax, Wenqi looked at his phone and saw the Apple Music icon. He thought it would be cool if people not well-versed in music could create and customize their own music without having to learn the jargon and techniques of each instrument through countless exhausting tutorials.

What it does

Users can create and customize their music by adjusting variables such as note, temperature, and seed, and by selecting the musical instruments. They can then view their finished music album, share it with others, and browse other users' creations. As the cherry on top, users can generate an album cover with DALL·E to round out a complete, professional composition.
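To give a feel for what the temperature and seed controls do, here is a minimal sketch of seeded temperature sampling over a model's note scores. The function name and exact interface are illustrative, not our app's actual API:

```python
import numpy as np

def sample_next_note(logits, temperature=1.0, seed=None):
    """Sample one note index from model scores.

    Lower temperature makes the choice more deterministic,
    higher temperature makes it more adventurous; a fixed seed
    makes the result reproducible.
    """
    rng = np.random.default_rng(seed)
    scaled = np.asarray(logits, dtype=float) / max(temperature, 1e-6)
    probs = np.exp(scaled - scaled.max())  # numerically stable softmax
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

# With a near-zero temperature the highest-scoring note always wins:
print(sample_next_note([0.1, 2.5, 0.3], temperature=0.01, seed=42))  # → 1
```

The same seed always reproduces the same melody, which is what lets users tweak one variable at a time and compare results.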

How we built it

In the development of our music generation model, we initiated the process by sourcing data from an online MIDI dataset. This dataset underwent a thorough preprocessing phase to optimize it for our analysis. Our initial approach involved retaining all intrinsic properties of the MIDI files, including velocity, notes, and timing. We then input this data into various models for generation: OpenAI's ChatGPT, a bespoke non-ChatGPT model, and our custom autoregressive model. However, this approach did not yield satisfactory results.

To enhance the efficacy of our model, we explored the concept of data simplification. This involved a strategic reduction of the musical elements to focus primarily on the essential components. Initially, we experimented with preserving just the note and timing information. Further refinement led us to a more minimalist approach, where we retained only the note data, omitting the timing aspect altogether.

Upon reprocessing our dataset with these modifications, we observed a significant improvement in music generation when utilizing both the ChatGPT model and our proprietary autoregressive model. Our custom model is architecturally composed of an embedding layer, a transformer mechanism, and a specialized output head. The details of this architecture, along with the comparative performance metrics, are elucidated in the accompanying graphical documentation.
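The custom model's shape (embedding layer, transformer mechanism, output head) can be sketched in PyTorch roughly as below. The layer sizes are illustrative placeholders, not our tuned hyperparameters:

```python
import torch
import torch.nn as nn

class NoteTransformer(nn.Module):
    """Embedding layer -> causal transformer -> output head,
    predicting the next note in a note-only token sequence."""

    def __init__(self, vocab_size=128, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        # Causal mask: each position may only attend to earlier notes,
        # which is what makes the model autoregressive.
        mask = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
        x = self.embed(tokens)
        x = self.encoder(x, mask=mask)
        return self.head(x)  # logits over the 128 MIDI note numbers

model = NoteTransformer()
logits = model(torch.tensor([[60, 64, 67]]))  # a C-major arpeggio
print(logits.shape)  # torch.Size([1, 3, 128])
```

At generation time, the logits for the last position are sampled (with the user's temperature and seed) to pick the next note, which is appended to the sequence and fed back in.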

Challenges we ran into

Ashar had difficulty pushing his files to the remote repository after accidentally creating a duplicate file, but thankfully Wenqi was able to resolve the issue without much data loss. Ashar and Abhishek also ran into problems when labeling and writing short descriptions for the training data, as the terminology and jargon needed to describe each piece of music was difficult. Shangzhen wasn't able to create animations using WebGL, so we had to scrap that idea. Abhishek struggled to implement buttons and pictures on the webpage since he had never used Ionic before. Wenqi's initial attempts at training on the full-property data, using OpenAI's ChatGPT, a bespoke non-ChatGPT model, and our custom autoregressive model, didn't produce satisfactory results. That initial training also cost us over 200 CAD.

Accomplishments that we're proud of

We are glad our project can help people struggling with mental health: because users can customize the music to match their mood, it can have therapeutic effects on mental and emotional well-being, and it could be used in music therapy sessions to help individuals cope with stress, anxiety, or depression. For individuals with communication difficulties, such as those with non-verbal autism or severe physical disabilities, our project can provide an alternative means of expression. It can also be designed to improve motor skills, which is particularly beneficial for individuals with physical disabilities, helping them enhance coordination, dexterity, and control by interacting with digital interfaces or musical instruments. Finally, the application can be incorporated into educational games to create a more engaging learning environment, demonstrating the versatility of our app.

What we learned

Abhishek and Shangzhen expanded their technical expertise in front-end development by mastering Ionic React. They also honed their skills in high-fidelity prototyping with Figma and delved into WebGL for advanced animation, although its implementation was constrained by time limitations. Ashar, on the other hand, focused on efficiency in front-end development, gaining proficiency with advanced Ionic React elements. He concentrated in particular on state management and on using 'forEach' loops to render the Ionic music cards, a significant improvement over placing each card manually.

In addition to front-end development, our team recently embarked on learning about transformers, applying this knowledge effectively in our sequence generation project. We also adopted the Waterfall Model, an approach we studied in our COSC 310 course, applying it methodically to our project management processes.

Our exploration also extended into physics, where we studied fundamental concepts crucial for sound manipulation, including velocity, time, note, pitch, and amplitude. This understanding of sound dynamics was critical for our project, as it guided us in selecting and using VLC, the only audio player compatible with .mid files in our context. We tested audio playback at various speeds in VLC, rigorously evaluating the quality to ensure the output was enjoyable for human listeners.

What's next for Merry Multi Melody

We intend to add more instruments to the options, as well as braille features for the visually impaired. We also want to incorporate GPT-4 and train it on the lyrics of thousands of songs along with audio clips of famous singers, so that it can produce the "perfect singer singing the perfect song of any genre". Lastly, we want to add functionality for multiple users to collaborate on a single composition in real time, similar to collaborative document editing tools. This would be especially beneficial for bands or groups of musicians working remotely.
