13 - Implementing a neural network for music genre classification

In this video, I implement a music genre classifier using TensorFlow. The classifier is trained on MFCC features extracted from the Marsyas music dataset. While building the network, I also introduce a few fundamental deep learning concepts such as binary/multiclass classification, rectified linear units, batching, and overfitting.
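
A minimal sketch of the kind of network built in the video, assuming the MFCCs have already been extracted to a JSON file with "mfcc" and "labels" keys (as in the previous video); the file name, layer sizes, and hyperparameters below are illustrative rather than the exact values used on screen.

import json
import numpy as np
from sklearn.model_selection import train_test_split
import tensorflow as tf

DATA_PATH = "data.json"  # assumed output of the MFCC-extraction script

def load_data(data_path):
    """Load MFCC inputs and genre labels from the JSON file."""
    with open(data_path, "r") as fp:
        data = json.load(fp)
    X = np.array(data["mfcc"])    # (num_segments, num_frames, num_mfcc)
    y = np.array(data["labels"])  # (num_segments,)
    return X, y

if __name__ == "__main__":
    X, y = load_data(DATA_PATH)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

    # Multilayer perceptron: flatten each (frames, mfcc) matrix, stack a few
    # ReLU layers, and finish with a softmax over the 10 genres.
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(X.shape[1], X.shape[2])),
        tf.keras.layers.Dense(512, activation="relu"),
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

    # Mini-batch training; the gap between train and validation accuracy is
    # where overfitting shows up at the end of the video.
    model.fit(X_train, y_train,
              validation_data=(X_test, y_test),
              batch_size=32,
              epochs=50)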

Video slides:

Code:

Interested in hiring me as a consultant/freelancer?

Join The Sound Of AI Slack community:

Follow Valerio on Facebook:

Valerio's Linkedin:

Valerio's Twitter:
Comments

7:19 That's brilliant. It's one thing to be good at memorizing an API, but you're also a genius! This is what makes a good programmer. Super excited to learn more from you; this is what should be trending on YouTube!

geofox

Very nice illustration Valerio, specifically at the end where you showed the overfitting.

mostafahasanian

Perfect recap after the DLS course by Andrew Ng. Your videos are as awesome as Coursera's. Thank you!

chipotle

Great, elegant coding and clear instruction! One potential issue here is data leakage. Since each MFCC array is generated from one of the several segments a track is split into, naive use of train_test_split means the train set and test set can contain different segments of the same track, so test accuracy will be overestimated during the model development phase. The split should be done at the track level, not the segment level.

spkt
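
A rough sketch of the track-level split spkt describes above, assuming each segment can be mapped back to the track it came from; GroupShuffleSplit and the track_ids array are illustrative additions, not part of the video's code.

import numpy as np
from sklearn.model_selection import GroupShuffleSplit

def track_level_split(X, y, track_ids, test_size=0.3, seed=42):
    """Split so that all segments of a track land in the same set.

    X: (num_segments, num_frames, num_mfcc), y: (num_segments,),
    track_ids: (num_segments,) id of the track each segment was cut from.
    """
    splitter = GroupShuffleSplit(n_splits=1, test_size=test_size, random_state=seed)
    train_idx, test_idx = next(splitter.split(X, y, groups=track_ids))
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

# If the JSON stores segments in track order with a fixed number of segments
# per track, the group labels can be reconstructed as:
# track_ids = np.arange(len(X)) // num_segments_per_track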

Super amazing, a fully packed playlist for deep learning on audio! Thanks a lot for this, Valerio!

raghavrawat

Thanks a ton! Your videos are really instructive, Dr. Velardo. Looking forward to more videos/lectures from you.

jainrohit

You're amazing, man. Can't wait for more videos on deep learning for music.

smilebig

I watched both video 12 and video 13 and they worked. So my question is: how can I see the classification of songs? I could not see it here; which of your videos shows the result of the classification?

berkinoztekin
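
One way to inspect the classification of a single segment once the model is trained; the helper below and the genre list are illustrative, and the genre order must match the label mapping stored in your own JSON file.

import numpy as np

def predict_sample(model, X, y, index, genre_names):
    """Run one MFCC segment through the trained model and print the result."""
    x = X[index][np.newaxis, ...]      # model.predict expects a batch dimension
    probabilities = model.predict(x)   # shape: (1, num_genres)
    predicted_index = int(np.argmax(probabilities, axis=1)[0])
    print(f"Expected: {genre_names[y[index]]}, predicted: {genre_names[predicted_index]}")

# Example (order must match the mapping used when building the dataset):
# genres = ["blues", "classical", "country", "disco", "hiphop",
#           "jazz", "metal", "pop", "reggae", "rock"]
# predict_sample(model, X_test, y_test, 0, genres)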

Hey dude,
congratulations on your class and the shared link.
We keep walking. Cheers! :)

i_am-ki_m

What is the error formula when you input a batch or all of the inputs? Thank you.

nhactrutinh
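
For a mini-batch (or the whole training set), the usual convention, and as far as I know the Keras default, is to average the per-sample errors: E_batch = (1/m) * sum_i E_i. A small NumPy illustration of that convention with categorical cross-entropy:

import numpy as np

def batch_cross_entropy(y_true, y_pred_probs):
    """Average per-sample cross-entropy over a batch.

    y_true: integer labels, shape (m,)
    y_pred_probs: predicted class probabilities, shape (m, num_classes)
    """
    m = y_true.shape[0]
    per_sample = -np.log(y_pred_probs[np.arange(m), y_true])  # E_i = -log p(true class)
    return per_sample.mean()                                  # (1/m) * sum_i E_i

# y_true = np.array([0, 2, 1])
# y_pred = np.array([[0.7, 0.1, 0.1, 0.1],
#                    [0.2, 0.2, 0.5, 0.1],
#                    [0.1, 0.6, 0.2, 0.1]])
# print(batch_cross_entropy(y_true, y_pred))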

I replaced sigmoid with ReLU in the simple MLP network (covered in video #9) and the predictions started coming out as 0 in many cases. Not sure what is causing this, but could it be an issue with audio data? MFCCs can have negative values, and that may make h negative.

maulikdave
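
Negative inputs by themselves are not a problem for ReLU (it simply outputs 0 for negative pre-activations), but unscaled MFCCs can push many units into the zero region at once, so the network outputs the same value for everything. One common mitigation, offered here only as a hedged suggestion rather than the video's own approach, is to standardize the features before training:

import numpy as np

def standardize_mfccs(X_train, X_test):
    """Zero-mean / unit-variance scaling per MFCC coefficient.

    X_*: arrays of shape (num_samples, num_frames, num_mfcc).
    Statistics come from the training set only, to avoid leakage.
    """
    mean = X_train.mean(axis=(0, 1), keepdims=True)
    std = X_train.std(axis=(0, 1), keepdims=True) + 1e-8  # avoid division by zero
    return (X_train - mean) / std, (X_test - mean) / std

# X_train, X_test = standardize_mfccs(X_train, X_test)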

Awesome content. Thank you, and by the way, did your machine have a GPU when training in this video?

ngocminhphung
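
For anyone checking their own setup (this says nothing about the machine used in the video), TensorFlow 2.x can report whether it sees a GPU:

import tensorflow as tf

# An empty list means training runs on the CPU.
print("GPUs visible to TensorFlow:", tf.config.list_physical_devices("GPU"))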

Amazing! It trains on just 127 samples on my Mac, but on the complete set of samples on a Windows machine. Help would be highly appreciated.

shams_ad

Hi... if we load one data point at a time and then perform a forward pass and a backward pass, how can this be faster? Essentially we are reading data from RAM sequentially. Loading data into RAM in larger batches should be faster. Am I missing something?

physicsmadness
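
Two different things are being compared here: a single-sample update is cheap per update (that is the sense in which single-sample, stochastic updates are usually called fast), while a mini-batch pushes many samples through one vectorized matrix multiplication, which is far more efficient per sample. A rough NumPy timing sketch of the second point; the sizes are made up to roughly match flattened 130x13 MFCC inputs:

import time
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((1690, 512))     # flattened 130*13 MFCCs -> 512 hidden units
X = rng.standard_normal((3200, 1690))    # 100 mini-batches of 32 samples

# One forward pass per sample: many small matrix-vector products.
start = time.perf_counter()
for x in X:
    _ = x @ W
per_sample = time.perf_counter() - start

# One forward pass per mini-batch of 32: fewer, larger, vectorized products.
start = time.perf_counter()
for batch in X.reshape(100, 32, 1690):
    _ = batch @ W
per_batch = time.perf_counter() - start

print(f"per-sample loop: {per_sample:.3f}s, mini-batches of 32: {per_batch:.3f}s")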

I suppose I can't load the whole JSON file.

When I print(inputs.shape), the output is (200, 130, 13), because there are 2 files in each folder of my reduced dataset.

When I tried the same for the whole dataset, with 100 tracks in each folder, the output is (10000, ). I was expecting to see something like (10000, 130, 13).

So, am I right? Is my data not complete? Could it be because the size of my data.json file is 647 MB? If that is the case, how did Valerio's code read such a big file without error?

Thanks for any help in advance.

SabriCanOkyay
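
If np.array(data["mfcc"]) comes out with shape (10000,) instead of (10000, 130, 13), the usual reason is that the inner lists do not all have the same number of frames, so NumPy falls back to a 1-D object array; the file itself may well be complete. A quick check, assuming the "mfcc"/"labels" layout from the preprocessing video:

import json
from collections import Counter
import numpy as np

with open("data.json", "r") as fp:
    data = json.load(fp)

# How many frames does each segment have, and how often does each length occur?
lengths = Counter(len(m) for m in data["mfcc"])
print(lengths)

# Keep only segments with the expected number of frames (130 here), so that
# np.array can build a proper (N, 130, 13) tensor.
expected = 130
X = np.array([m for m in data["mfcc"] if len(m) == expected])
y = np.array([l for m, l in zip(data["mfcc"], data["labels"]) if len(m) == expected])
print(X.shape, y.shape)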

Why is it that when we, theoretically, use a full batch, we need just one epoch?

amitbenhur
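
A clarifying note: an epoch is one full pass over the training data regardless of batch size, so using a full batch does not mean one epoch is enough; it only changes how many parameter updates happen inside each pass.

import math

def updates_per_epoch(num_samples, batch_size):
    # One epoch = one pass over all samples; each batch gives one gradient update.
    return math.ceil(num_samples / batch_size)

# updates_per_epoch(1000, 1000) -> 1   (full batch: one accurate update per epoch)
# updates_per_epoch(1000, 32)   -> 32  (mini-batches: more, noisier updates per epoch)
# Either way, many epochs are usually needed before the weights converge.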

Hello. I am using the spectral centroid for audio classification, but I have one problem: when I use an MLP for classification, my validation accuracy is constant from epoch 1. What is going on?

bujipaji

I have a question. Previously you said we would have 5 segments, i.e. dividing a 30-second track into 5 segments of 6 seconds each, but here you are saying you made 10 segments.

muntazirmehdi
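
The segment count is just a preprocessing parameter rather than something fixed, so 5 in the earlier explanation and 10 here are both valid choices. The arithmetic for 10 segments, assuming GTZAN's 30-second tracks, a 22050 Hz sample rate, and librosa's default hop length of 512, lines up with the (200, 130, 13) shape quoted in another comment here:

import math

SAMPLE_RATE = 22050     # Hz
TRACK_DURATION = 30     # seconds per track
HOP_LENGTH = 512        # librosa default
NUM_SEGMENTS = 10

samples_per_segment = SAMPLE_RATE * TRACK_DURATION // NUM_SEGMENTS
frames_per_segment = math.ceil(samples_per_segment / HOP_LENGTH)
print(samples_per_segment, frames_per_segment)  # 66150 samples -> 130 MFCC frames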

Hello Valerio, thank you so much for your videos. I can't load all 6997 samples when I train; I only have 219 samples. How can I solve this?

hoangphuc
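
This is very likely the progress bar rather than missing data: newer TensorFlow/Keras versions print the number of batches (steps) per epoch instead of the number of samples, and with the default batch size of 32 all 6997 segments give exactly the figure you are seeing.

import math

num_samples = 6997
batch_size = 32  # Keras default in model.fit
print(math.ceil(num_samples / batch_size))  # -> 219 steps shown per epoch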

Sir, how can I do speaker change detection for diarization?

Liya