Inspiration
I don't know about you, but I listen to music a lot. Nothing annoys me more than a Spotify shuffle that plays the wrong songs at the wrong time, forcing me to switch back to the application every few minutes to find something new to play and breaking my focus.
Tired of this recurring problem in my work-while-listening routine, I needed something that knows what to play and when to play it.
Enter DeepJ, an AI-based DJ capable of detecting your mood and ambiance, and finding just the right songs for you anytime!
What it does
DeepJ uses camera input to continuously scan your mood. Once it has a good feel for what to play, it enqueues songs that suit the situation, all while supporting the essential music-listening features such as skipping, pausing, and seeking within a song. Tired of mainstream music, or even of what you've been listening to lately? Try DeepJ's mood-based music generation. Using the mood and ambiance it detects, DeepJ generates original music to complement whatever you're doing, all just one click away!
How we built it
First, we implemented a front end capable of accessing the camera, with basic controls like pause, skip, and a turn-off-camera button, and kept polishing and extending it while developing the AI backbone of the application. We fed camera and microphone data to the Gemini Live API to perform sentiment and ambiance analysis of the listener's surroundings. We then used Gemini models again to choose appropriate songs for the detected mood, first mapping the mood to genres and then selecting songs within them. We also integrated Lyria to generate custom music for the detected genre on demand. Putting it all together, we combined the AI models' output into a queue of songs for the listener, built on data structures like a double-ended queue and a linked list. Finally, we hooked the AI-generated queue up to the front-end controls we had created at the start, forming the seamlessly integrated DeepJ application we set out to build!
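The queue plumbing described above can be sketched roughly like this (a simplified illustration, not our exact code; the class and method names are hypothetical):

```javascript
// A double-ended queue holds upcoming songs; a history list supports "previous".
class SongQueue {
  constructor() {
    this.upcoming = []; // deque: shift from the front, push/unshift at either end
    this.history = [];  // previously played songs, most recent last
    this.current = null;
  }

  // Songs chosen by the mood model are appended to the back of the deque.
  enqueue(song) {
    this.upcoming.push(song);
  }

  // A sudden mood change can jump the queue by inserting at the front.
  enqueueFront(song) {
    this.upcoming.unshift(song);
  }

  // Skip advances to the next song, archiving the current one.
  next() {
    if (this.current !== null) this.history.push(this.current);
    this.current = this.upcoming.shift() ?? null;
    return this.current;
  }

  // "Previous" pops from history and pushes the current song back up front.
  previous() {
    if (this.history.length === 0) return this.current;
    if (this.current !== null) this.upcoming.unshift(this.current);
    this.current = this.history.pop();
    return this.current;
  }
}
```

Keeping both ends of the deque accessible is what lets mood-driven insertions and the usual skip/previous controls coexist on one structure.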
Challenges we ran into
We ran into quite a few challenges with sentiment analysis and with matching the detected sentiment to an appropriate genre. Because this was a two-step process, and because we were essentially asking Gemini to do computer-vision-based sentiment analysis, the double querying often chose the wrong song even when the mood was determined correctly. We also had to adjust our Lyria settings to generate music that actually fit each genre when prompted; tweaking each instrument's settings, among other factors, let us "emulate" the genres we wanted to create for. Finally, tying everything together in a comprehensive front end brought many integration challenges, which all had to be fixed one by one.
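One kind of guardrail against the double-querying errors is to normalize the model's free-text genre answer against a whitelist before it ever reaches song selection, so a misparsed reply degrades gracefully. A minimal sketch (the genre list, function name, and fallback are illustrative assumptions, not our actual values):

```javascript
// Hypothetical whitelist of genres our song library actually covers.
const KNOWN_GENRES = ["lofi", "jazz", "classical", "rock", "electronic", "ambient"];

// Reduce a free-text model reply to a known genre, or fall back to a default.
function normalizeGenre(modelReply, fallback = "lofi") {
  // Lowercase and strip punctuation so "Chill Lo-fi beats!" matches "lofi".
  const cleaned = modelReply.trim().toLowerCase().replace(/[^a-z ]/g, "");
  const match = KNOWN_GENRES.find((g) => cleaned.includes(g));
  return match ?? fallback;
}
```

The same idea applies to the mood label from the first query: constraining each step's output shrinks the space of compounding errors between the two calls.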
Accomplishments that we're proud of
We are proud of finding a way to make Lyria generate appropriate music for each genre on demand. We are also proud of the double-querying process behind our Gemini integration; turning an image into a good song for the moment is a pretty phenomenal thing to do. Finally, we're proud of the front-end functionality we implemented: using Google Cloud Platform (GCP), we were able to stream music stored in Cloud Storage (GCS) buckets without downloading it first. Our front end looks really nice, and we're happy with how it came together.
What we learned
First of all, we learned to adjust to one another and work as a team. Even though we all had a clear idea of what to build, our visions differed slightly on small features and on how to go about implementing them. We came together, discussed things as a team, and assigned tasks based on each person's strengths. We also held regular check-ins to merge branches and resolve conflicts between different features, and worked together whenever two parts of the project met in the middle. We learned quite a lot about sentiment analysis, especially with Gemini, and for those of us who weren't as strong with React.js, this was great exposure and a good learning process. Finally, we learned how to manage everything from a JavaScript application to avoid latency and keep as much functionality as possible on the user's end.
What's next for DeepJ
DeepJ will continue to grow, with ever-evolving playlists that use probabilistic weighting to add similar songs and replace others. We also want to build a recommendation system based on what users typically listen to, which would require some form of user recognition. With stronger foundation models, we would use spectrogram analysis to create smooth AI-generated transitions between songs. Finally, with more time, we would implement a way to stream directly from Spotify using voice commands.
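The probabilistic weighting we have in mind could look something like the following sketch: each candidate song carries a similarity weight, and selection samples from the cumulative distribution. This is a hypothetical illustration of a planned feature, not shipped code; the injectable `rng` parameter just makes the behavior deterministic for testing.

```javascript
// Pick one song at random, with probability proportional to its weight.
function pickWeighted(songs, weights, rng = Math.random) {
  const total = weights.reduce((a, b) => a + b, 0);
  let threshold = rng() * total; // a point on the cumulative weight line
  for (let i = 0; i < songs.length; i++) {
    threshold -= weights[i];
    if (threshold <= 0) return songs[i];
  }
  return songs[songs.length - 1]; // guard against floating-point drift
}
```

Songs similar to the user's recent listening would receive larger weights, so the playlist drifts toward their taste without becoming fully deterministic.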

