Inspiration
We wanted to make music search more intuitive. Currently it is difficult for a listener to actively explore new songs; most people rely on genre searches and algorithmic recommendations. The main bottleneck is a person's ability to express what kind of music they want, which is hard to do in terms of genres or song titles. Our project lets a person be more expressive and use emotions and other abstract ways of describing the music they are after. We tried to make the search very convenient: the user can specify one or more of the following — an abstract description (e.g. mood, or the energy of a song), similar songs, and artists — and get back a playlist of ten to fifty songs.
How it works
Sonance has a Python backend that uses a small language model (SLM) from Sentence Transformers to create mood-aware embeddings. This, together with connections to Genius for lyrics and Spotify for other song metadata, lets us generate realistic, mood-aware recommendations simply by computing similarities between the embeddings we create and user-submitted queries.
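The core retrieval step can be sketched as follows. This is a minimal illustration, not Sonance's actual code: the song names and embedding vectors are toy stand-ins (in the real system the vectors come from a Sentence Transformers model applied to lyrics and metadata), but the ranking logic — cosine similarity between a query embedding and pre-computed song embeddings — is the same idea.

```python
import numpy as np

# Hypothetical pre-computed song embeddings. In practice these would be
# produced by a Sentence Transformers model; the 3-d vectors below are
# toy stand-ins chosen purely for illustration.
song_names = ["upbeat summer anthem", "melancholy piano ballad", "driving road-trip rock"]
song_embs = np.array([
    [0.9, 0.1, 0.2],
    [0.1, 0.95, 0.1],
    [0.7, 0.2, 0.6],
])

def cosine_sim(query_vec, matrix):
    """Cosine similarity between one query vector and each row of a matrix."""
    q = query_vec / np.linalg.norm(query_vec)
    m = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    return m @ q

# A user query embedded into the same space (again a toy vector).
query = np.array([1.0, 0.0, 0.3])

scores = cosine_sim(query, song_embs)
ranking = [song_names[i] for i in np.argsort(scores)[::-1]]
# The playlist is then built from the top-ranked songs.
```

The same pattern scales to a full catalogue: embed every song once offline, then each query is a single matrix-vector product followed by a sort.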
Challenges we ran into
One major challenge was the lack of usable metadata. Spotify has deprecated several of its key APIs (recommendations, genres, and sound features), so we had to rely primarily on lyrics via Genius to infer mood and context.
Another challenge came from the embeddings themselves: we noticed that similarity values across songs were almost uniform, making ranking difficult.
We hypothesized this was due to the geometry of high-dimensional embeddings — cosine similarity becomes less discriminative in such spaces. We experimented with higher-order similarity terms to capture subtle semantic differences while maintaining interpretability (i.e., positive semi-definiteness). This only slightly improved differentiation (across 50 songs, similarities varied by only ~5%), which showed that our similarity metric needs more work.
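One concrete example of a higher-order term of this kind — an assumption on our part, not necessarily the exact form we shipped — is an elementwise power of the cosine kernel. By the Schur product theorem, the elementwise product of positive semi-definite matrices is positive semi-definite, so raising a PSD similarity matrix elementwise to an integer power preserves PSD-ness while sharpening the spread between near-uniform scores:

```python
import numpy as np

# Toy pairwise similarities that are nearly uniform -- the problem we hit.
sims = np.array([0.80, 0.82, 0.84, 0.86, 0.88])

# Elementwise integer powers of a PSD kernel remain PSD (Schur product
# theorem), so s**k is one "higher-order" similarity that keeps the
# interpretability property. k = 5 here is an illustrative choice.
k = 5
sharpened = sims ** k

spread_before = sims.max() - sims.min()          # 0.08
spread_after = sharpened.max() - sharpened.min() # ~0.20, noticeably wider
```

The ranking order is unchanged (the power map is monotone), but the gaps between scores widen, which makes thresholding and playlist cutoffs less brittle.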
We also observed that unrelated prompts — for example, a business report — could achieve unexpectedly high similarity scores. This suggests that the embedding space lacks clear separation between unrelated domains, and highlights the limits of lightweight sentence models for nuanced semantic tasks. While we couldn’t fully resolve this within the hackathon, the model still performed well enough to distinguish broad moods (e.g., happy vs. sad) and generate relevant playlists.
What we learned
We learned that embeddings can behave unintuitively — sometimes showing high similarity even between unrelated prompts. This happens because cosine similarity alone isn’t always expressive enough in high-dimensional spaces, where most vectors tend to be nearly equidistant. This highlighted the need for more sophisticated similarity measures or domain-specific fine-tuning to better capture nuanced meaning.
We also realized that working with lyrics instead of raw audio was the right design choice for a language-based model. Extracting semantic meaning directly from audio would require a completely different approach (e.g., spectrogram analysis, multimodal models) and would be significantly harder to implement in a short timeframe.
What's next for Sonance
Scalability and user-friendliness are our first priorities at Sonance. We plan to expand our music database by integrating additional APIs and open-source lyric datasets, and to enhance personalization through user mood tracking and online, reinforcement-learning-based preference learning. We also want real-time user mood detection: analyzing writing style, voice input, or facial expressions to infer mood and deliver content without the need for explicit mood input. Finally, we want to develop a sleek, modern app for iOS/Android, since we want you to be able to use Sonance on the go, whether you are going for a jog or walking your dog. Ultimately, we envision Sonance as a companion that not only plays music but truly understands it, just like you do.
Built With
- genius
- huggingface
- next.js
- python
- spotify
- typescript