StringSync

Upload a recording of yourself playing, add the sheet music, and get a note-by-note breakdown of your pitch accuracy and timing. Built for violin, viola, cello, and double bass.

Stack

Layer	Tech
Frontend	React 18 + Vite, TailwindCSS, Framer Motion, Recharts
Backend	FastAPI + uvicorn
ML models	PyTorch (BiLSTM + FFN), trained on the URMP dataset
Audio analysis	librosa (pyin pitch tracking), music21 (MusicXML parsing)
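
As a rough illustration of what "pitch accuracy" can mean for a single note, the sketch below measures how far a detected frequency sits from the score's target note, in cents. It is not the project's scoring code: the A4 = 440 Hz reference, the 25-cent tolerance, and the helper names `midi_to_hz`, `cents_off`, and `is_in_tune` are all assumptions made for this example.

```python
import math

A4_HZ = 440.0    # assumed concert-pitch reference
A4_MIDI = 69     # MIDI note number of A4


def midi_to_hz(midi_note: float) -> float:
    """Convert a MIDI note number to its equal-tempered frequency in Hz."""
    return A4_HZ * 2.0 ** ((midi_note - A4_MIDI) / 12.0)


def cents_off(played_hz: float, target_hz: float) -> float:
    """Signed deviation of played_hz from target_hz (100 cents = 1 semitone)."""
    if played_hz <= 0 or target_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 1200.0 * math.log2(played_hz / target_hz)


def is_in_tune(played_hz: float, target_midi: float,
               tolerance_cents: float = 25.0) -> bool:
    """Hypothetical check: is the note within tolerance of the target pitch?"""
    return abs(cents_off(played_hz, midi_to_hz(target_midi))) <= tolerance_cents
```

A positive result from `cents_off` means the note was sharp and a negative one means it was flat. A frequency estimate from a tracker such as pyin could be passed in as `played_hz`.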

Quick start

1. Backend

pip install -r requirements.txt
uvicorn backend.app:app --host 0.0.0.0 --port 8000 --reload

2. Frontend

cd frontend
npm install
npm run dev

Open http://localhost:5173.

Project layout

StringSync/
├── backend/          ← FastAPI server + inference pipeline
├── frontend/         ← React app
├── training/         ← model architecture, training scripts, saved weights
│   └── outputs/
│       ├── models/   ← model_a_best.pt, model_b_best.pt + calibration artefacts
│       ├── features/ ← pre-extracted train/val/test feature pickles
│       └── logs/     ← training curves, PR/ROC curves, calibration plots
├── scripts/          ← dataset acquisition pipeline (for retraining)
└── tests/            ← inference validation tests

Retraining from scratch

The trained weights are already committed under training/outputs/models/ so you do not need to retrain to run the app. If you want to retrain:

1. Download the data

Dataset	What it is used for	Link
URMP	Isolated string instrument WAV stems (ground-truth performances)	Zenodo 5045435 — click Download all
MAESTRO v3.0.0	Structural reference only (no audio needed)	magenta.tensorflow.org/datasets/maestro
Public-domain MusicXML	Score files for paired audio	Auto-downloaded from the music21 corpus via `scripts/download_musicxml.py`

The full URMP archive is about 12.5 GB. Extract it into string_performance_dataset/raw_downloads/urmp/ before running the pipeline. Alternatively, run python scripts/download_urmp.py to fetch the sample set.
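
Before starting the pipeline, it can help to confirm that the archive landed in the expected place. The following is a minimal sketch, assuming only that the directory named above should exist and contain something. The helper `urmp_ready` is hypothetical and checks nothing about the archive's internal layout.

```python
from pathlib import Path

# Extraction target named in this README, relative to the repository root.
URMP_DIR = Path("string_performance_dataset/raw_downloads/urmp")


def urmp_ready(urmp_dir: Path = URMP_DIR) -> bool:
    """Return True if urmp_dir exists and holds at least one entry."""
    return urmp_dir.is_dir() and any(urmp_dir.iterdir())


if __name__ == "__main__":
    if urmp_ready():
        print(f"Found URMP data in {URMP_DIR}")
    else:
        print(f"No data in {URMP_DIR}; extract the archive there first")
```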

2. Build the dataset

python scripts/run_pipeline.py

Downloads, aligns, and quality-filters everything into string_performance_dataset/.

3. Train

python training/run_training.py

Runs steps 1-4 (split -> features -> train -> evaluate). New weights and calibration artefacts are saved to training/outputs/models/.

Score formula

overall = 0.55 × pitch_accuracy + 0.45 × duration_accuracy

Both accuracy values are percentages (0-100). Because the weights sum to 1, the overall score also falls between 0 and 100.
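
As a worked example, 90% pitch accuracy and 80% duration accuracy give 0.55 × 90 + 0.45 × 80 = 49.5 + 36 = 85.5. The sketch below expresses the same weighted sum in Python. The function name `overall_score` and the range check are assumptions for illustration, not the backend's actual code.

```python
PITCH_WEIGHT = 0.55      # weight on pitch accuracy, from the formula above
DURATION_WEIGHT = 0.45   # weight on duration accuracy, from the formula above


def overall_score(pitch_accuracy: float, duration_accuracy: float) -> float:
    """Combine two 0-100 percentages into a 0-100 overall score."""
    for name, value in (("pitch_accuracy", pitch_accuracy),
                        ("duration_accuracy", duration_accuracy)):
        if not 0.0 <= value <= 100.0:
            raise ValueError(f"{name} must be in [0, 100], got {value}")
    return PITCH_WEIGHT * pitch_accuracy + DURATION_WEIGHT * duration_accuracy


if __name__ == "__main__":
    # Rounded for display, since the float sum may carry a tiny rounding error.
    print(round(overall_score(90.0, 80.0), 2))
```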

Environment

Python 3.11+ recommended. Key packages: torch, librosa, music21, fastapi, uvicorn, numpy, scikit-learn. Full list in requirements.txt.
