MambaVision-Music-Genre-Classification

PyTorch implementation of music genre classification using the MambaVision architecture.

Background

Our approach combines the sequential, time-dependent nature of music with the MambaVision architecture for music genre classification. Audio clips are converted to spectrograms, which are then processed by MambaVision, a lightweight hybrid Mamba-Transformer vision backbone that splits its input into patches and extracts features from them. With this setup we obtained better results than the Transformer and CNN models we compared against, including models pre-trained on other data types. We attribute the improvement to MambaVision's efficient feature extraction and its ability to handle the characteristics of musical spectrograms. For detailed results and the theoretical background, please refer to our complete work.
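To make the spectrogram input concrete, here is a minimal sketch of turning a waveform into a log-magnitude spectrogram. It uses a plain NumPy short-time Fourier transform as a stand-in; the repository lists `librosa` as a dependency and may well compute a mel spectrogram with different settings. The function name `log_spectrogram`, the FFT size, the hop length, and the synthetic test tone are all assumptions made for illustration.

```python
import numpy as np

def log_spectrogram(signal, n_fft=1024, hop=512):
    """Return a log-magnitude spectrogram of shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Slice the waveform into overlapping, windowed frames.
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    # Magnitude of the real FFT per frame, transposed to (frequency, time).
    magnitude = np.abs(np.fft.rfft(frames, axis=1)).T
    # Convert to decibels; the small constant avoids log(0).
    return 20.0 * np.log10(magnitude + 1e-10)

# Synthetic 1-second, 440 Hz tone standing in for a real audio clip (assumption).
sr = 22050
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440.0 * t)

spec = log_spectrogram(tone)
# Frequency bin with the most energy, converted back to Hz.
peak_bin = int(spec.mean(axis=1).argmax())
peak_hz = peak_bin * sr / 1024
print(spec.shape, peak_hz)
```

The resulting 2D array can be treated like a single-channel image, which is what lets a vision backbone consume audio.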

Dataset

The GTZAN dataset was used. It consists of 1,000 audio clips, each 30 seconds long, divided into 10 genre classes.
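Because the classes are balanced, a per-genre (stratified) split keeps every subset balanced as well. The sketch below illustrates the idea only: the 80/10/10 ratio, the `stratified_split` helper, and the file-name pattern are assumptions, not the split produced by 'Build Dataset.ipynb'.

```python
import random

# The ten GTZAN genre labels.
GENRES = ["blues", "classical", "country", "disco", "hiphop",
          "jazz", "metal", "pop", "reggae", "rock"]

def stratified_split(clips_per_genre=100, train=0.8, val=0.1, seed=0):
    """Split clip names per genre so every split keeps the class balance."""
    rng = random.Random(seed)
    splits = {"train": [], "val": [], "test": []}
    for label, genre in enumerate(GENRES):
        # Hypothetical file naming; adjust to match the downloaded archive.
        names = ["{}.{:05d}.wav".format(genre, i) for i in range(clips_per_genre)]
        rng.shuffle(names)
        n_train = int(clips_per_genre * train)
        n_val = int(clips_per_genre * val)
        splits["train"] += [(n, label) for n in names[:n_train]]
        splits["val"] += [(n, label) for n in names[n_train:n_train + n_val]]
        splits["test"] += [(n, label) for n in names[n_train + n_val:]]
    return splits

splits = stratified_split()
print({name: len(items) for name, items in splits.items()})
```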

Prerequisites

Library	Version
`Python`	`3.5.5`
`torch`	`2.1.1`
`kornia`	`0.7.3`
`matplotlib`	`3.7.2`
`transformers`	`4.42.3`
`numpy`	`1.23.5`
`h5py`	`3.10.0`
`librosa`	`0.10.2`
`pandas`	`2.1.1`
`seaborn`	`0.13.0`
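A quick way to compare an environment against the table above is to query the installed package versions. This is a small sketch, not part of the repository: the `check_versions` helper is hypothetical, and it assumes Python 3.8 or newer, where `importlib.metadata` is available.

```python
from importlib import metadata

# Versions copied from the Prerequisites table (Python itself is omitted).
REQUIRED = {
    "torch": "2.1.1",
    "kornia": "0.7.3",
    "matplotlib": "3.7.2",
    "transformers": "4.42.3",
    "numpy": "1.23.5",
    "h5py": "3.10.0",
    "librosa": "0.10.2",
    "pandas": "2.1.1",
    "seaborn": "0.13.0",
}

def check_versions(required):
    """Return {package: (required, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, wanted in required.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None  # Package is not installed at all.
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches

for name, (wanted, found) in sorted(check_versions(REQUIRED).items()):
    print("{}: required {}, found {}".format(name, wanted, found or "not installed"))
```

An exact-match comparison is deliberately strict; a nearby patch release may work just as well.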

Files in the repository

File name	Purpose
`data_analysis.ipynb`	notebook for analysing the models' results
`genre_predictor.py`	main script for predicting the genre of a specific song
`models.py`	contains all the models
`Paras.py`	initializes the project parameters
`train_models.ipynb`	notebook for training the different models
`train.py`	helper script for training the different models
`Build Dataset.ipynb`	notebook for step-by-step data preparation
`data_loader.py`	data loading script
`music_dealer.py`	data loading script for your own music
`util.py`	utilities for working with the data

Quick start

Clone the repo:

git clone https://github.com/ovedtal1/MambaVision-Genre-Classification.git

Download the free GTZAN dataset from Kaggle: GTZAN
Place the data in the main folder of the repo
Run the 'Build Dataset.ipynb' notebook step by step to create the dataset
Follow the 'train_models.ipynb' notebook to train the different models (MambaVision-based, Transformer-based, and CNN)

Analysing and testing

Analyze your trained models with the 'data_analysis.ipynb' notebook
Run the 'genre_predictor.py' script with your own music and classify it!
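A common first step when analysing a classifier is a confusion matrix with per-class accuracy. The sketch below shows the computation in plain Python; it is not taken from 'data_analysis.ipynb', the helper names are hypothetical, and the labels are made up for illustration rather than being results from this project.

```python
def confusion_matrix(true_labels, predicted_labels, n_classes):
    """matrix[t][p] counts samples of true class t predicted as class p."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix

def per_class_accuracy(matrix):
    """Fraction of each true class predicted correctly (None if the class is absent)."""
    result = []
    for i, row in enumerate(matrix):
        total = sum(row)
        result.append(row[i] / total if total else None)
    return result

# Made-up labels for three classes; illustration only.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

cm = confusion_matrix(y_true, y_pred, n_classes=3)
acc = per_class_accuracy(cm)
print(cm)
print(acc)
```

Off-diagonal entries show which genres are confused with each other, which overall accuracy alone hides.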

Future Work

Compare MambaVision with more architectures
Search for more custom augmentations
Test the MambaVision architecture on different tasks

References

Ali Hatamizadeh, Jan Kautz. MambaVision: A Hybrid Mamba-Transformer Vision Backbone.
Lianghui Zhu et al. Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model.
Tri Dao, Albert Gu. Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality.
Mathilde Caron et al. Emerging Properties in Self-Supervised Vision Transformers.
PyTorch implementation and pre-trained weights for MambaVision [NVIDIA Research].
Self-Supervised Vision Transformers with DINO, PyTorch implementation [Facebook Research].
Yuval Hoffman, Roee Hadar. Music-Genre-Classification-using-Transformers project.
