Upload a recording of yourself playing, add the sheet music, and get a note-by-note breakdown of your pitch accuracy and timing. Built for violin, viola, cello, and double bass.
| Layer | Tech |
|---|---|
| Frontend | React 18 + Vite, TailwindCSS, Framer Motion, Recharts |
| Backend | FastAPI + uvicorn |
| ML models | PyTorch (BiLSTM + FFN), trained on the URMP dataset |
| Audio analysis | librosa (pyin pitch tracking), music21 (MusicXML parsing) |
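
The stack above pairs a pitch tracker (librosa's pyin) with parsed scores (music21). The README does not say how the two are compared, so the following is only a sketch of one common approach: measuring how far a detected frequency sits from the score's target note, in cents. The A4 = 440 Hz reference and the names `midi_to_hz` and `cents_off` are assumptions for illustration, not part of StringSync's code.

```python
import math

A4_HZ = 440.0  # assumed tuning reference; not stated in the README


def midi_to_hz(midi_note: int) -> float:
    """Equal-temperament frequency of a MIDI note number (A4 = 69)."""
    return A4_HZ * 2.0 ** ((midi_note - 69) / 12.0)


def cents_off(detected_hz: float, target_midi: int) -> float:
    """Signed deviation from the target note in cents (positive = sharp)."""
    if detected_hz <= 0:
        raise ValueError("detected_hz must be positive")
    return 1200.0 * math.log2(detected_hz / midi_to_hz(target_midi))


# A slightly sharp A4: roughly 19.6 cents above the target.
print(f"{cents_off(445.0, 69):+.1f} cents")
```

A semitone is 100 cents, so a per-note tolerance can be expressed in the same unit regardless of register.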

    pip install -r requirements.txt
    uvicorn backend.app:app --host 0.0.0.0 --port 8000 --reload

    cd frontend
    npm install
    npm run dev

Open http://localhost:5173.

    StringSync/
    +-- backend/              # FastAPI server + inference pipeline
    +-- frontend/             # React app
    +-- training/             # model architecture, training scripts, saved weights
    |   +-- outputs/
    |       +-- models/       # model_a_best.pt, model_b_best.pt + calibration artefacts
    |       +-- features/     # pre-extracted train/val/test feature pickles
    |       +-- logs/         # training curves, PR/ROC curves, calibration plots
    +-- scripts/              # dataset acquisition pipeline (for retraining)
    +-- tests/                # inference validation tests

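
The layout above names two weight files, `model_a_best.pt` and `model_b_best.pt`, under `training/outputs/models/`. As a quick sanity check before starting the backend, a small script along these lines can confirm they are present. The helper name `missing_weights` is hypothetical, and the script assumes it is run from the repository root.

```python
from pathlib import Path

# File names taken from the project layout; the helper itself is illustrative.
WEIGHT_FILES = ("model_a_best.pt", "model_b_best.pt")


def missing_weights(repo_root: Path) -> list[Path]:
    """Return the expected weight files that do not exist under repo_root."""
    models_dir = repo_root / "training" / "outputs" / "models"
    return [models_dir / name for name in WEIGHT_FILES if not (models_dir / name).is_file()]


missing = missing_weights(Path("."))
if missing:
    print("Missing weights:", ", ".join(str(p) for p in missing))
else:
    print("All model weights found.")
```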
The trained weights are already committed under training/outputs/models/ so
you do not need to retrain to run the app. If you want to retrain:
| Dataset | What it is used for | Link |
|---|---|---|
| URMP | Isolated string instrument WAV stems (ground-truth performances) | Zenodo 5045435 — click Download all |
| MAESTRO v3.0.0 | Structural reference only (no audio needed) | magenta.tensorflow.org/datasets/maestro |
| Public-domain MusicXML | Score files for paired audio | Auto-downloaded from the music21 corpus via scripts/download_musicxml.py |

The URMP full archive is ~12.5 GB. Extract it into
`string_performance_dataset/raw_downloads/urmp/` before running the pipeline, or run `python scripts/download_urmp.py` for the sample set.

    python scripts/run_pipeline.py

Downloads, aligns, and quality-filters everything into `string_performance_dataset/`.

    python training/run_training.py

Runs steps 1-4 (split -> features -> train -> evaluate). New weights and
calibration artefacts are saved to `training/outputs/models/`.

    overall = 0.55 × pitch_accuracy + 0.45 × duration_accuracy

Both accuracy values are percentages (0-100).
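
The weighted formula above can be written as a small function. This is a minimal sketch, not the project's actual scoring code: the function name `overall_score` and the range check are assumptions added for illustration. Because the weights sum to 1, the result also stays within 0-100.

```python
PITCH_WEIGHT = 0.55
DURATION_WEIGHT = 0.45


def overall_score(pitch_accuracy: float, duration_accuracy: float) -> float:
    """Combine the two accuracy percentages (0-100) into one overall score."""
    for name, value in (
        ("pitch_accuracy", pitch_accuracy),
        ("duration_accuracy", duration_accuracy),
    ):
        # Range check is an illustrative assumption, not documented behaviour.
        if not 0.0 <= value <= 100.0:
            raise ValueError(f"{name} must be between 0 and 100, got {value}")
    return PITCH_WEIGHT * pitch_accuracy + DURATION_WEIGHT * duration_accuracy


# 90% pitch accuracy and 80% duration accuracy.
print(f"{overall_score(90.0, 80.0):.1f}")  # 85.5
```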
Python 3.11+ recommended. Key packages: torch, librosa, music21,
fastapi, uvicorn, numpy, scikit-learn. Full list in requirements.txt.