For our final project in the Technion Deep Learning course (046211), we chose to classify music genres on the GTZAN dataset.
Our approach uses a pre-trained Wav2Vec2 transformer model.
Unlike most existing models, the transformer operates on the raw time-series audio, which is why we expected an improvement over existing methods.
We used the facebook/wav2vec2-large-100k-voxpopuli model from Hugging Face:
Facebook's Wav2Vec2 model pre-trained on the 100k-hour unlabeled subset of the VoxPopuli speech data.
We used the well-known GTZAN dataset.
The dataset consists of 1000 audio tracks each 30 seconds long.
It contains 10 genres, each represented by 100 tracks: blues, classical, country, disco, hiphop, jazz, metal, pop, reggae, and rock.
The tracks are all 22050Hz Mono 16-bit audio files in .wav format.
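Since every track is 30 seconds at a 22050 Hz sample rate, each file decodes to a fixed-length array; a quick sanity check:

```python
SAMPLE_RATE = 22050  # Hz, the GTZAN sample rate
DURATION_S = 30      # seconds per track

n_samples = SAMPLE_RATE * DURATION_S
print(n_samples)  # 661500 samples per track
```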
| File | Purpose |
|---|---|
| img | Contains images for the README.md file |
| train_30s_model.py | Train the model on 30 s tracks |
| train_15s_model.py | Train the model on 15 s tracks |
| train_10s_model.py | Train the model on 10 s tracks |
| eval_model.py | Evaluate the model |
| rolling_stones.wav | Example audio file |
The model was trained on 30 s tracks.

Performance:
- 87% accuracy on the validation set
- 77% accuracy on the test set

The model was trained on 15 s tracks; each 30 s track was divided into two 15 s sub-tracks.

Performance:
- 78.85% accuracy on the validation set
- 75.5% accuracy on the test set

The model was trained on 10 s tracks; each 30 s track was divided into three 10 s sub-tracks.

Performance:
- 78% accuracy on the validation set
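The division into fixed-length sub-tracks described above can be sketched as follows (a minimal NumPy sketch; the project's training scripts may implement it differently):

```python
import numpy as np

SAMPLE_RATE = 22050  # Hz, the GTZAN sample rate

def split_track(audio, segment_seconds, sample_rate=SAMPLE_RATE):
    """Split a 1-D waveform into non-overlapping fixed-length segments,
    dropping any trailing samples shorter than one segment."""
    seg_len = int(segment_seconds * sample_rate)
    n_segments = len(audio) // seg_len
    return [audio[i * seg_len:(i + 1) * seg_len] for i in range(n_segments)]

# A 30 s track yields two 15 s sub-tracks or three 10 s sub-tracks.
track = np.zeros(30 * SAMPLE_RATE)
print(len(split_track(track, 15)))  # 2
print(len(split_track(track, 10)))  # 3
```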
The project is intended to run in the Hugging Face Docker image.
For instructions on how to install Docker, see:
https://docs.docker.com/engine/install/

Replace `train_30s_model.py` with your chosen training script:

```shell
docker run --name gtzan --rm -it --ipc=host --gpus=all -v $PWD:/home huggingface/transformers-pytorch-gpu python3 /home/train_30s_model.py
```

This command spins up a Docker container from the official Hugging Face image, mounts the repository directory, and runs the training script.
Open the model on Hugging Face.
Note that the Hugging Face inference server supports tracks only up to 2-3 minutes long.

To run inference locally, start a container (with GPU support, or CPU-only):

```shell
# With GPU support
docker run --name gtzan --rm -it --ipc=host --gpus=all -v $PWD:/home huggingface/transformers-pytorch-gpu
# CPU only
docker run --name gtzan --rm -it -v $PWD:/home huggingface/transformers-pytorch-gpu
```

In the container, use either a Python script file or the interactive interpreter:
```python
from transformers import pipeline
import torchaudio

MODEL_NAME = 'adamkatav/wav2vec2_100k_gtzan_30s_model'
SONG_IN_REPO_DIR_PATH = '/home/rolling_stones.wav'

pipe = pipeline(model=MODEL_NAME)
audio_array, sample_freq = torchaudio.load(SONG_IN_REPO_DIR_PATH)
# Wav2Vec2 expects 16 kHz mono input: resample, then downmix to mono
resample = torchaudio.transforms.Resample(orig_freq=sample_freq, new_freq=16000)
audio_array = resample(audio_array).mean(axis=0).squeeze().numpy()
output = pipe(audio_array)
print(output)
```
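The pipeline returns a list of label/score dictionaries; picking the top prediction is then straightforward (the scores below are made-up placeholder values, not real model output):

```python
# Hypothetical output shape from the audio-classification pipeline
output = [
    {'label': 'rock', 'score': 0.62},
    {'label': 'blues', 'score': 0.21},
    {'label': 'metal', 'score': 0.17},
]

# Take the entry with the highest score as the predicted genre
top = max(output, key=lambda d: d['score'])
print(top['label'])  # rock
```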





