Audio Classification with Machine Learning (EuroPython 2019)

A practical introduction to audio classification using deep learning. The example shown is an Environmental Sound Classification task on the UrbanSound8K dataset.

Presented at EuroPython 2019 in Basel. Recording made by the EuroPython team.
Comments

Hi Jon. Great presentation. I am absolutely new to machine learning and found your talk really clear and useful. Thanks for sharing.

jsbisht_

Perfect, bro. Can you share some ideas on how to prepare a dataset?

GadisaGemechu-ju

Hi Jon, I am doing a final-year undergraduate project on bioacoustics, and I am new to signal processing as well as to your channel! I was just wondering: do you have a paper covering some of the stuff you've talked about that I could reference?

Captura

Thank you. A very good presentation. Is the Keras model code you showed (i.e. "block_1", "block_2", etc.) on a couple of your slides available in one of your GitHub repositories?

michaelwirtzfeld

Hey Mr, hope you're doing well!
Can you please help me? How can we use speech recognition to detect falls in elderly people?
One more question: how can audio be combined with images to implement fall detection?
Thank you

sidalibourenane

Interesting talk! In the example you showed, lots of the sounds are quite different from each other, e.g. the children playing, a siren, and a jackhammer. Does it also work for sounds that are very similar? For example, different crow calls or different types of chimpanzee sounds?

weirjwerijrweurhuewhr

Hi... great work! Thank you for uploading this video. If you had the exact frequency vs. time data for a particular sample in text or CSV format, how could it be used to improve the accuracy of a CNN? Can image data be correlated with the corresponding frequency data to get more accurate predictions?

jayshaligram

I'm new to machine learning, and I feel like I've watched so many audio machine learning videos, yet the tips & tricks section toward the end of this one is the most practical and unique stuff I've seen. Thanks! Does the simple audio recognition tutorial by TensorFlow still exist? I can't seem to find it. Also, in the audio augmentation slide you talk about adding noise to your data for the benefit of the model, but in the Q&A you talk about how de-noising is helpful. Could you clarify the different cases where you use each?

peterm.
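
For readers unfamiliar with the noise augmentation mentioned in the comment above, here is a minimal sketch of mixing noise into a training clip at a chosen signal-to-noise ratio. It is not taken from the talk: the function names, the use of Gaussian noise, and the `snr_db` parameter are all assumptions made for illustration.

```python
import math
import random


def rms(signal):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def add_noise(signal, snr_db, rng=None):
    """Return a copy of `signal` with Gaussian noise mixed in at `snr_db` dB.

    Hypothetical helper for illustration; real pipelines often mix in
    recorded background noise rather than synthetic Gaussian noise.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    # Scale the noise so that rms(signal) / rms(scaled noise) matches the SNR.
    target_noise_rms = rms(signal) / (10 ** (snr_db / 20))
    scale = target_noise_rms / rms(noise)
    return [s + scale * n for s, n in zip(signal, noise)]


# Usage: a short sine wave augmented at 20 dB SNR.
clean = [math.sin(2 * math.pi * 5 * t / 200) for t in range(200)]
noisy = add_noise(clean, snr_db=20.0)
```

Augmentation of this kind is applied to training data only; it is a separate step from any de-noising applied to the input at prediction time.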

Thank you very much for your very informative presentation. I have a question regarding one of your slides, specifically the one on aggregating analysis windows. Could you please explain it further (possibly with an example)? For instance, is windows = 6 the number of segments extracted from the audio signal, or is it the length of a window (6*sampling_rate)? And what about bands=32?

Moreover, regarding the base model: is it the model you presented on the previous slide (the 3-layer CNN?)? So the logic is that we take the audio signal, convert it into a sequence of windows, pass them through the SB-CNN, propagate it over time, compute the average pooling, and feed the output of the average pooling to the softmax to make the prediction. Is this logic correct?

Thank you in advance for your consideration.

sadeghmohammadi
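
The window-aggregation logic asked about in the comment above can be sketched roughly as follows. This is a hedged illustration, not the speaker's code: `toy_window_model` is a hypothetical stand-in for a trained CNN, and the sketch averages per-window probabilities (one common variant). Whether the talk's model averages before or after the softmax is not stated in this thread.

```python
import math


def split_into_windows(frames, window_length, hop):
    """Split a list of spectrogram frames into fixed-length windows.

    Windows overlap when hop < window_length.
    """
    last_start = len(frames) - window_length
    return [frames[s:s + window_length] for s in range(0, last_start + 1, hop)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def toy_window_model(window, n_classes=3):
    """Hypothetical stand-in for a trained CNN applied to one window."""
    mean_energy = sum(sum(frame) for frame in window) / len(window)
    scores = [-(mean_energy - c) ** 2 for c in range(n_classes)]
    return softmax(scores)


def predict_clip(frames, window_length, hop, model=toy_window_model):
    """Classify a whole clip by averaging the per-window class probabilities."""
    windows = split_into_windows(frames, window_length, hop)
    if not windows:
        raise ValueError("clip is shorter than one analysis window")
    per_window = [model(w) for w in windows]
    n_classes = len(per_window[0])
    mean_probs = [
        sum(p[c] for p in per_window) / len(per_window) for c in range(n_classes)
    ]
    predicted = max(range(n_classes), key=lambda c: mean_probs[c])
    return predicted, mean_probs, len(windows)


# Usage: 10 frames with a single band each, windows of 4 frames, hop of 2 frames.
frames = [[1.0] for _ in range(10)]
predicted, mean_probs, n_windows = predict_clip(frames, window_length=4, hop=2)
```

In this sketch the number of windows follows from the clip length, window length, and hop; the number of bands is the length of each frame.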

Great stuff.

How's the job market for this type of knowledge and skills? I am an old EE just starting a DS masters and I've turned my attention to audio classification.

chacmool

Hello Jon, you did a great presentation. Thanks for sharing.

I am working on my master's thesis, specifically on lung sound classification using a CNN.

I am using MFCC features and getting about 88% accuracy.

Do you think that a mel spectrogram can give a higher accuracy than 88%?

idrisseahamadiabdallah

Thank you so much for sharing the presentation with us! I'm new to machine learning and I have a question. Where could I download audio datasets to use for my project? Thank you in advance!

cookingcriss

Respected Sir,
My project is to cancel noise from audio. How can I train an ML model for this, and how should I proceed? Please help me.

asirmotivationdoses

I really like your presentation. Thank you very much. Since I'm trying to classify sound for my project now, could I ask you some more questions?

tranthanh

Thank you for the great presentation. I have a question:
how can one person's voice be compared with another's?

sigitpriyohartanto

I was quite surprised that for classification you didn't feed the feature embeddings of the windows to an RNN and instead just used a post-processing trick. Wouldn't an RNN work better? What about a transformer? Also, I know that mel spectrograms work better than just feeding raw audio, but how much better? Is it like +5% accuracy or is it game-changing?

Nvm 😅 both of these questions were answered at the end. Another question that came to mind, though: what about speech recognition models or something similar? Are spectrogram-based models still dominating, or is it a different story?

xXDarQXx

I am here again with one question: why don't you upload audio processing videos weekly? Thanks

doyourealise

Hi, can you please explain how we can convert an MP3 audio file into a WAV file?

Woofawoof_wwooaaf

Fantastic!!!! **O** Great insight! Thank you!

tommygun

Sir, can you share the code for your model?

saleemjamali