Inspiration

Many of our peers have had rough journeys finding significant others. We noticed a common trope across most dating apps: people are matched based on superficial data, making online dating a detrimental experience for many. We believe the best way to connect is through genuine activities and shared bonding experiences, like singing.

What It Does

Our app and machine, Stereolove, matches users based on their combined singing characteristics. There is an online version that resembles a dating app and an offline version that works more like a karaoke machine, bringing people together in person. Users simply sing their favorite song. Our system compares characteristics of their singing voices with a reference voice from the actual music track to generate a score. If that score is high, the users get matched. Bonus points if the song is a duet.

How we built it

We built the main aspects of the project using Python libraries. For vocal source separation, we used the SepFormer transformer model from the SpeechBrain library. It analyzes frequency patterns in mixed audio signals, masking one voice while omitting the other. We also used OpenVINO to quantize the model and reduce its resource requirements. This process generates two audio files, one for each singer, which is very useful for duet karaoke songs, especially when recording with a single microphone.
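The masking idea behind separation models like SepFormer can be sketched in a few lines: estimate a time-frequency mask per singer, then multiply it element-wise with the mixture spectrogram. This is a toy NumPy illustration of the concept only, not the actual SepFormer pipeline; the mask values are made up:

```python
import numpy as np

# Toy mixture spectrogram: 4 frequency bins x 3 time frames.
mixture = np.array([[1.0, 0.2, 0.0],
                    [0.8, 0.9, 0.1],
                    [0.1, 0.7, 0.9],
                    [0.0, 0.1, 1.0]])

# Soft mask for singer A (values in [0, 1]); singer B gets the complement.
mask_a = np.array([[0.9, 0.1, 0.0],
                   [0.8, 0.5, 0.1],
                   [0.2, 0.5, 0.9],
                   [0.0, 0.1, 0.9]])
mask_b = 1.0 - mask_a

# Element-wise masking recovers each singer's estimated spectrogram.
est_a = mixture * mask_a
est_b = mixture * mask_b

# The two estimates always sum back to the mixture.
assert np.allclose(est_a + est_b, mixture)
```

In the real system the masks are predicted by the network and the masked spectrograms are inverted back into the two per-singer audio files.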

For scoring analysis, we used Librosa for music information retrieval. We split the target song into short segments of musically consistent material and used features like tone, pitch, and timbre to perform our scoring analysis.
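One simple way to turn per-segment feature vectors (e.g. mean pitch, spectral centroid, averaged MFCCs) into a 0-100 score is cosine similarity against the reference track's features. The metric and mapping below are an illustrative sketch, not our exact scoring formula:

```python
import numpy as np

def segment_score(user_feats: np.ndarray, ref_feats: np.ndarray) -> float:
    """Cosine similarity between a user's and the reference track's
    per-segment feature vectors, mapped onto a 0-100 score."""
    num = float(np.dot(user_feats, ref_feats))
    den = float(np.linalg.norm(user_feats) * np.linalg.norm(ref_feats))
    if den == 0.0:
        return 0.0                       # silent / empty segment
    cos = num / den                      # cosine in [-1, 1]
    return round(50.0 * (cos + 1.0), 1)  # map [-1, 1] onto [0, 100]

# Identical features score 100; orthogonal features score 50.
print(segment_score(np.array([1.0, 2.0, 3.0]), np.array([1.0, 2.0, 3.0])))  # 100.0
print(segment_score(np.array([1.0, 0.0]), np.array([0.0, 1.0])))            # 50.0
```

Per-segment scores can then be averaged over the whole song to produce the final match score.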

For lyric display, we used the Genius API to assign an artist to each individual lyric line in duet songs. For the app GUI we used PyQt5. We also attempted to use this setup to perform lyric highlighting.
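Genius lyrics typically tag sections with headers like `[Verse 1: Artist]`. Under that assumption, assigning each lyric line to a duet partner can be sketched like this (the header pattern and sample lyrics are illustrative):

```python
import re

# Matches section headers of the form "[Verse 1: Artist Name]".
HEADER = re.compile(r"^\[[^:\]]+:\s*(?P<artist>[^\]]+)\]$")

def assign_artists(lyrics: str, default: str = "Both"):
    """Pair each lyric line with the artist named in the most recent
    section header; lines before any header default to `default`."""
    current = default
    assigned = []
    for line in lyrics.splitlines():
        line = line.strip()
        if not line:
            continue
        match = HEADER.match(line)
        if match:
            current = match.group("artist").strip()
        else:
            assigned.append((current, line))
    return assigned

sample = """[Verse 1: Singer A]
First line here
[Verse 2: Singer B]
Second line here"""
print(assign_artists(sample))
# [('Singer A', 'First line here'), ('Singer B', 'Second line here')]
```

The resulting (artist, line) pairs are what drive whose turn it is on the karaoke machine.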

Finally, we used an Arduino to display the lyrical data on our karaoke machine via an LCD. There are two LEDs, each lighting up in synchronization with its singer's turn. The LCD displays the state of the current session and shows the scores at the end.
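On the Python side, driving the LEDs and LCD comes down to sending small serial commands to the Arduino. The `LED`/`LCD`/`SCORE` message format below is a hypothetical protocol sketch, not our exact firmware interface:

```python
def encode_command(kind: str, value) -> bytes:
    """Encode one newline-terminated command for the Arduino over serial.
    The message format here is a hypothetical sketch of such a protocol."""
    if kind == "led":
        index, on = value
        payload = f"LED{index}:{'ON' if on else 'OFF'}"
    elif kind == "lcd":
        payload = f"LCD:{value}"
    elif kind == "score":
        payload = f"SCORE:{value}"
    else:
        raise ValueError(f"unknown command kind: {kind!r}")
    return (payload + "\n").encode("ascii")

# Light LED 1 for the first singer's turn, then push the final score.
print(encode_command("led", (1, True)))   # b'LED1:ON\n'
print(encode_command("score", 87))        # b'SCORE:87\n'
```

The Arduino sketch then reads one line at a time, toggles the matching LED, and writes session state or scores to the LCD.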

Challenges we ran into

We ran into several challenges throughout this journey. First, in the hardware setup, the microphones we needed were out of stock. We then purchased some microphones, only to realize that there is a specific kind of microphone board for Arduino that pre-filters the signal to allow for proper audio capture, whereas the board we bought only captured the amplitude of surrounding sounds. We overcame this by giving our system dual-platform support: phone and karaoke machine.

We attempted to perform lyric highlighting to help users stay on beat and in sync with the music, but this proved quite difficult for several reasons. First, the displayed lyrics must be offset slightly relative to the audio to match how humans perceive the music. Second, songs come in so many different structures that an implementation tuned for one song would not work well on another.
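The timing side of highlighting can be sketched as a lookup over timestamped lyric lines with a small perceptual offset applied. The timestamps and the 0.3 s offset below are illustrative assumptions, not measured values:

```python
from bisect import bisect_right

# (start_time_seconds, lyric_line) pairs, sorted by start time.
TIMESTAMPED = [(0.0, "First line"), (4.2, "Second line"), (9.0, "Third line")]

def current_line(playback_t: float, offset: float = 0.3) -> str:
    """Return the lyric line to highlight at playback time `playback_t`.
    `offset` shifts highlighting slightly ahead of the audio so it matches
    perceived timing (0.3 s is an illustrative guess, not a tuned value)."""
    starts = [t for t, _ in TIMESTAMPED]
    i = bisect_right(starts, playback_t + offset) - 1
    return TIMESTAMPED[max(i, 0)][1]

print(current_line(0.5))   # First line
print(current_line(4.0))   # Second line (4.0 + 0.3 crosses the 4.2 boundary)
print(current_line(10.0))  # Third line
```

The harder, unsolved part is producing reliable timestamps in the first place, since song structure varies so much from track to track.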

Furthermore, when performing audio separation, the transformer model proved quite difficult to run. Our app would need significantly more compute, or we would need to further optimize the model going forward. We temporarily solved this issue by quantizing the model to reduce its resource requirements.
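The idea behind that quantization step can be shown in miniature: map float32 weights onto 8-bit integers with a scale factor, trading a little precision for a 4x smaller memory footprint. This is a toy NumPy sketch of symmetric post-training quantization, not the OpenVINO implementation:

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric int8 quantization: w ~ scale * q, with q in [-127, 127]."""
    scale = float(np.abs(weights).max()) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original float32 weights."""
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.27, 0.01, 1.0], dtype=np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# int8 storage is 4x smaller than float32, at a small accuracy cost:
# the rounding error per weight is bounded by half a quantization step.
assert q.dtype == np.int8
assert np.max(np.abs(w - w_hat)) <= scale / 2 + 1e-6
```

Toolkits like OpenVINO apply the same principle per layer, with calibration data to pick the scales.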

Accomplishments that we're proud of

We are proud of all the significant hurdles we have overcome, even though many remain. Through all the failures, we gained significant experience working on everything from the front-end capture of sound to the compute-heavy back-end processing.

What's next for Stereolove

Going forward, we'd like to integrate all our models into one coherent program. We'd prefer to start with a karaoke machine that runs locally, then port the program to a server that we can use with phone web apps to continue expanding our platform diversity.
