
ovedtal1/MambaVision-Genre-Classification


MambaVision-Music-Genre-Classification

PyTorch implementation of music genre classification using MambaVision architecture

Tal Oved, Dean Efraim

LinkedIn, LinkedIn

Video: YouTube

Background

Our approach pairs the sequential, time-dependent nature of music with the MambaVision architecture for music genre classification. Each audio clip is converted to a spectrogram, which is then processed by the MambaVision model, a lightweight, transformer-like architecture tailored for patching and feature extraction. In our experiments, this approach outperformed the traditional Transformer and CNN models we compared against, including models pre-trained on other data types. We attribute the improvement to how well MambaVision handles the characteristics of musical spectrograms. For detailed results and the theoretical background, please refer to our complete work.
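To make the "spectrogram as input" step concrete, here is a minimal NumPy-only sketch of turning a waveform into a log-magnitude spectrogram. It is an illustration, not the repository's preprocessing (that lives in 'Build Dataset.ipynb' and may use different settings). The function name `log_spectrogram`, the 22050 Hz sample rate, and the `n_fft` / `hop_length` values are all assumptions chosen for the example.

```python
import numpy as np


def log_spectrogram(signal, n_fft=1024, hop_length=512, eps=1e-10):
    """Return a log-magnitude spectrogram of shape (n_fft // 2 + 1, n_frames).

    Hypothetical helper for illustration; parameter values are assumptions.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop_length
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop_length: i * hop_length + n_fft] * window
        for i in range(n_frames)
    ])
    # Real FFT per frame, then magnitude: shape (n_frames, n_fft // 2 + 1).
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Transpose to (frequency, time) and convert to decibels.
    return 20.0 * np.log10(magnitude.T + eps)


# Synthetic 2-second, 440 Hz tone standing in for a real audio clip.
sr = 22050  # assumed sample rate
t = np.arange(sr * 2) / sr
tone = np.sin(2 * np.pi * 440.0 * t)

spec = log_spectrogram(tone)
peak_bin = int(spec.mean(axis=1).argmax())
peak_hz = peak_bin * sr / 1024  # centre frequency of the strongest bin
```

The resulting 2-D array (frequency by time) is the kind of image-like input a vision backbone can consume after resizing and normalisation.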

Dataset

We used the GTZAN dataset, which consists of 1000 audio tracks, each 30 seconds long, divided into 10 genre classes.
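As a rough sketch of how such a dataset can be indexed, the snippet below walks a folder with one sub-folder per genre and returns (file, label) pairs. The folder layout, the alphabetical label order in `GENRES`, and the helper name `index_dataset` are assumptions for illustration; the repository's own loaders (data_loader.py, 'Build Dataset.ipynb') may organise things differently.

```python
import tempfile
from pathlib import Path

# The ten GTZAN genres; using alphabetical order as label indices is an assumption.
GENRES = ["blues", "classical", "country", "disco", "hiphop",
          "jazz", "metal", "pop", "reggae", "rock"]


def index_dataset(root):
    """Return (path, label_index) pairs for every .wav under root/<genre>/.

    Hypothetical helper; assumes one sub-folder per genre.
    """
    root = Path(root)
    items = []
    for label, genre in enumerate(GENRES):
        genre_dir = root / genre
        if not genre_dir.is_dir():
            continue  # skip genres that are not present
        for wav in sorted(genre_dir.glob("*.wav")):
            items.append((wav, label))
    return items


# Tiny demo on an empty placeholder tree (no real audio involved).
with tempfile.TemporaryDirectory() as tmp:
    for genre in ("blues", "rock"):
        (Path(tmp) / genre).mkdir()
        (Path(tmp) / genre / f"{genre}.00000.wav").touch()
    demo = [(path.name, label) for path, label in index_dataset(tmp)]
```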

Prerequisites

| Library | Version |
|---|---|
| Python | 3.5.5 |
| torch | 2.1.1 |
| kornia | 0.7.3 |
| matplotlib | 3.7.2 |
| transformers | 4.42.3 |
| numpy | 1.23.5 |
| h5py | 3.10.0 |
| librosa | 0.10.2 |
| pandas | 2.1.1 |
| seaborn | 0.13.0 |

Files in the repository

| File name | Purpose |
|---|---|
| data_analysis.ipynb | notebook for analysing the models' results |
| genre_predictor.py | main script for predicting the genre of a specific song |
| models.py | contains all the models |
| Paras.py | initializes the parameters for the project |
| train_models.ipynb | notebook for training the different models |
| train.py | helper script for training the different models |
| Build Dataset.ipynb | notebook for step-by-step data preparation |
| data_loader.py | data loading script |
| music_dealer.py | data loading script for your own music |
| util.py | utilities for data handling |

Quick start

  • Clone the repo:
git clone https://github.com/ovedtal1/MambaVision-Genre-Classification.git
  • Download the free GTZAN dataset from Kaggle: GTZAN
  • Place the data in the main folder of the repo
  • Run 'Build Dataset.ipynb' step by step to create the dataset
  • Follow 'train_models.ipynb' to train the different models (MambaVision-based, Transformer-based, and CNN)

Analysing and testing

  • Analyze your trained models with the 'data_analysis.ipynb' notebook
  • Run 'genre_predictor.ipynb' with your own music and classify it!
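For readers curious what the final classification step typically looks like, here is a small sketch that turns a vector of 10 raw model scores (logits) into ranked genre predictions with a softmax. This is not the repository's predictor code. The function name `top_genres`, the alphabetical label order, and the example logits are all hypothetical.

```python
import numpy as np

# Assumed label order; the trained models may use a different mapping.
GENRES = ["blues", "classical", "country", "disco", "hiphop",
          "jazz", "metal", "pop", "reggae", "rock"]


def top_genres(logits, k=3):
    """Return the k most probable (genre, probability) pairs from raw logits."""
    logits = np.asarray(logits, dtype=float)
    # Numerically stable softmax: subtract the max before exponentiating.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    order = np.argsort(probs)[::-1][:k]  # indices sorted by descending probability
    return [(GENRES[i], float(probs[i])) for i in order]


# Made-up logits standing in for a model's output on one clip.
example_logits = [0.1, 2.0, 0.3, 0.0, -1.0, 1.5, 0.2, 0.0, -0.5, 0.4]
ranking = top_genres(example_logits)
```

Reporting the top few genres with their probabilities, rather than a single label, makes it easier to see when the model is uncertain between similar genres.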

Future Work

  • Compare the MambaVision with more architectures
  • Search for more custom augmentations
  • Test the MambaVision architecture on different tasks
