Inspiration
As musicians, we’ve spent countless hours refining our musical phrasing but have always found it difficult to get real-time, visual feedback on subtle nuances like timing, dynamics, and articulation. We wanted to create a tool that bridges the gap between auditory and visual learning, helping musicians analyze their performances more intuitively. Many musicians rely on their ears alone, but we saw an opportunity to leverage technology to provide more precise, objective feedback.
What it does
TuneSync allows musicians to upload their recordings and compare them against professional performances through waveform visualization. The web app provides:
- Seamless YouTube clip-to-waveform conversion to visually analyze phrasing, timing, and expression.
- A moving playback cursor that glides through the waveform, making it easy to see shifts in volume and intensity as the audio progresses.
- Recording functionality so users can capture their performance in real time.
- Automated waveform alignment to match recordings for more accurate comparisons.
- Personalized phrasing and expression analysis, giving AI-driven insights into a user’s playing.
- Performance scoring powered by our machine learning model, offering quantitative feedback on key musical elements like pitch accuracy, rhythmic consistency, and dynamic control.
How we built it
Our application is built using a Flask backend and a React frontend, with Express.js handling additional API interactions. We utilize the Web Audio API for processing and visualizing audio data, allowing users to analyze phrasing, timing, and expression in musical performances.
Performance Scoring System: To provide musicians with objective feedback, we developed a machine learning model using a Random Forest Regressor. This system evaluates performances based on audio features extracted with the librosa library. Key features include:
- Pitch: Mean and standard deviation from piptrack analysis.
- Rhythm: Tempo and beat strength derived from onset and beat tracking.
- Dynamic Range: RMS energy range and variability.
- Timbre: 13-dimensional MFCCs capturing tonal qualities.
- Spectral Properties: Spectral centroid and brightness variation.
We optimized the model using GridSearchCV for hyperparameter tuning (number of trees, depth, and split criteria) and evaluated it using an 80/20 train-test split with R² scoring. The model outputs a performance score (0-100) and a confidence interval based on variance in tree predictions, providing nuanced insights into a musician’s phrasing, timing, and expression.
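The tuning and scoring pipeline can be sketched like this; the specific grid values and the 1.96-sigma interval are illustrative assumptions about how a tree-variance confidence interval might be computed:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import r2_score

def train_scoring_model(X, y):
    """Tune a Random Forest on feature vectors X and scores y (0-100)."""
    # 80/20 train-test split, as described above
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42)

    # Grid over number of trees, depth, and split criteria (example values)
    grid = GridSearchCV(
        RandomForestRegressor(random_state=42),
        param_grid={
            "n_estimators": [50, 100],
            "max_depth": [None, 10],
            "min_samples_split": [2, 5],
        },
        scoring="r2",
        cv=3,
    )
    grid.fit(X_train, y_train)
    model = grid.best_estimator_
    print("held-out R²:", r2_score(y_test, model.predict(X_test)))
    return model

def score_performance(model, features):
    """Score one performance, with a confidence interval from the
    variance of the individual trees' predictions."""
    per_tree = np.array([t.predict(features.reshape(1, -1))[0]
                         for t in model.estimators_])
    score = float(np.clip(per_tree.mean(), 0, 100))
    half_width = 1.96 * per_tree.std()  # ~95% CI under a normal assumption
    return score, (score - half_width, score + half_width)
```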
Challenges we ran into
One major challenge was handling real-time audio visualization efficiently while maintaining smooth performance. Synchronizing the moving cursor with playback required fine-tuning the timing logic to prevent lag or misalignment. Additionally, ensuring accurate waveform comparisons between different recordings took significant trial and error, as variations in tempo and articulation made direct alignment difficult. We had to refine our waveform-matching algorithm to make the visual comparison meaningful.
Accomplishments that we're proud of
- Successfully implementing real-time waveform visualization and synchronization.
- Creating an intuitive UI that makes waveform comparison easy and useful for musicians.
- Overcoming technical challenges in audio processing, waveform alignment, and ML model optimization, ensuring smooth and accurate playback.
- Building a functional and scalable full-stack web application that bridges music and technology.
What we learned
Through this project, we deepened our understanding of:
- Audio processing and waveform visualization, especially in a real-time application.
- Frontend-backend integration, ensuring smooth data transfer and performance.
- Machine learning model development, from hyperparameter tuning to performance evaluation.
- User experience design for musicians, optimizing the interface for clarity and ease of use.
- Algorithmic audio alignment, which required research and experimentation to get right.
What's next for TuneSync
We see a lot of potential for TuneSync beyond its current features:
- Expanding the ML model to support a broader range of instruments and musical styles beyond classical music.
- Cloud-based saving and sharing, allowing musicians to store and compare performances over time.
- Collaboration features, enabling teachers and students to review performances remotely.

Our goal is to make TuneSync an essential tool for musicians looking to refine their artistry through both sound and sight.
Built With
- flask
- flask-cors
- javascript
- librosa
- numpy
- python
- react
- sklearn