audio-mnist-with-person-detection

An MNIST-style audio classification project that recognises the person speaking as well as the digit spoken!

Dataset used: https://github.com/Jakobovski/free-spoken-digit-dataset/tree/master/recordings

Code Description:

data_accumulator.ipynb: This Jupyter notebook processes the .wav files located in the directory recordings/. Files are named in the following format: {digitLabel}_{speakerName}_{index}.wav Example: 7_jackson_32.wav. The project uses librosa to extract MFCC features from each audio clip, which are then fed into a CNN that classifies the digit and recognises the speaker.
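The file-naming scheme and MFCC step described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the helper names `parse_filename` and `extract_mfcc`, the choice of 40 coefficients, and averaging the coefficients over time are all assumptions made for this example.

```python
from pathlib import Path


def parse_filename(path):
    """Split a name like 7_jackson_32.wav into (digit, speaker, index)."""
    parts = Path(path).stem.split("_")
    digit, index = int(parts[0]), int(parts[-1])
    # Everything between the first and last field is treated as the speaker name.
    speaker = "_".join(parts[1:-1])
    return digit, speaker, index


def extract_mfcc(path, n_mfcc=40):
    """Return one fixed-length MFCC vector for a clip (assumed: mean over time)."""
    import librosa  # third-party; imported here so parse_filename works without it

    # sr=None keeps the file's native sampling rate instead of resampling.
    signal, sample_rate = librosa.load(path, sr=None)
    mfcc = librosa.feature.mfcc(y=signal, sr=sample_rate, n_mfcc=n_mfcc)
    # mfcc has shape (n_mfcc, frames); average across frames (assumption).
    return mfcc.mean(axis=1)


if __name__ == "__main__":
    print(parse_filename("recordings/7_jackson_32.wav"))
```

Averaging over time gives every clip the same feature length regardless of duration; a CNN could instead consume the full padded (n_mfcc, frames) matrix, and the README does not say which the notebook does.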

train.ipynb: This Jupyter notebook strategically splits the whole dataset produced by data_accumulator.ipynb into training and testing sets. It uses a relatively shallow CNN architecture written with Keras on top of TensorFlow to classify the spoken digit and recognize the person speaking it at the same time! It achieved an accuracy of 96.25% on the unseen testing data.
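One plausible way to split the data so that every digit and speaker combination appears in both sets is a stratified split. The sketch below is only an assumption about what the split might look like: the function `stratified_split`, the 20% test fraction, the seed, and the speaker name `speaker_b` are hypothetical and not taken from the notebook.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) with each label represented in both sets."""
    by_label = defaultdict(list)
    for i, label in enumerate(labels):
        by_label[label].append(i)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(by_label):
        idx = by_label[label]
        rng.shuffle(idx)
        # Hold out at least one sample per label when the label has more than one.
        n_test = max(1, round(len(idx) * test_fraction)) if len(idx) > 1 else 0
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


if __name__ == "__main__":
    # Toy labels: 10 digits x 2 speakers x 10 clips each (made-up data).
    labels = [(d, s) for d in range(10)
              for s in ("jackson", "speaker_b") for _ in range(10)]
    train, test = stratified_split(labels)
    print(len(train), len(test))
```

Stratifying on the joint (digit, speaker) label keeps the test set balanced for both tasks, which matters when a single accuracy figure covers digit and speaker recognition together.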

Goal:

The goal of the project was to play around with librosa and audio feature extraction in general :)

