How to Implement a CNN for Sound Classification

Learn how to implement a deep learning (CNN) sound classifier using PyTorch and torchaudio.

Code:

===============================

Interested in hiring me as a consultant/freelancer?

Join The Sound Of AI Slack community:

Connect with Valerio on Linkedin:

Follow Valerio on Facebook:

Follow Valerio on Twitter:

===============================

Content:
0:00 Intro
0:31 Implementing CNNNetwork class
9:55 Implementing the forward method
12:43 Network summary with torchsummary
17:19 What's up next?
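
The CNNNetwork built in the video is a small VGG-style stack of convolutional blocks followed by a linear classifier. The sketch below is a minimal version for reference, not a verbatim copy of the video code: the (1, 64, 44) mel-spectrogram input shape and the channel counts are assumptions from the series, and (following a note in the comments) it returns raw logits instead of ending with a Softmax layer.

import torch
from torch import nn


class CNNNetwork(nn.Module):
    """Four conv blocks -> flatten -> linear classifier (10 classes assumed)."""

    def __init__(self):
        super().__init__()
        # Each block: Conv2d -> ReLU -> MaxPool2d, doubling the channel count.
        self.conv1 = self._block(1, 16)
        self.conv2 = self._block(16, 32)
        self.conv3 = self._block(32, 64)
        self.conv4 = self._block(64, 128)
        self.flatten = nn.Flatten()
        # 128 * 5 * 4 is the flattened feature-map size for a (1, 64, 44) input.
        self.linear = nn.Linear(128 * 5 * 4, 10)

    @staticmethod
    def _block(in_channels, out_channels):
        return nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=1, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),
        )

    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        x = self.conv3(x)
        x = self.conv4(x)
        x = self.flatten(x)
        return self.linear(x)  # raw logits, to be paired with nn.CrossEntropyLoss


if __name__ == "__main__":
    cnn = CNNNetwork()
    dummy = torch.rand(1, 1, 64, 44)  # (batch, channel, n_mels, time_frames)
    print(cnn(dummy).shape)           # torch.Size([1, 10])
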
Comments

I have recently discovered your channel and have been watching your videos from the very beginning. Now I see you are still uploading, and I am so excited about all the knowledge you can give to your audience. This is so nice! Could you please show some of your projects as a "before and after", so we can see the evolution of the networks and what we are capable of doing with your courses? This is pure knowledge and I am sure of your channel... Thank you very much for your content!

George.English

Thank you so much for the amazing tutorials! Just a note (per the PyTorch docs): when using nn.CrossEntropyLoss() as the loss_fn, it is important to keep the model output as raw logits (i.e., do not include softmax() as the model's final output layer). I read in a discussion that this matters for numerical stability because of the log-sum-exp computation performed inside the loss. This might be new to the current version of PyTorch (2.0.1).
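
For example, a minimal sketch of the pairing (the tensor shapes are illustrative):

import torch
from torch import nn

# nn.CrossEntropyLoss applies log-softmax internally, so the model should
# return raw logits rather than softmax probabilities.
loss_fn = nn.CrossEntropyLoss()

logits = torch.randn(8, 10)           # e.g. model(inputs): (batch, n_classes) raw scores
targets = torch.randint(0, 10, (8,))  # class indices, not one-hot vectors

loss = loss_fn(logits, targets)

# If probabilities are needed at inference time, apply softmax outside the model:
probs = torch.softmax(logits, dim=1)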

deemo

Why do you multiply 128 by 5 and 4? Where do the latter two numbers come from?

Thanks for all the videos; they're fantastic.
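
One way to see where the 5 and 4 come from is to pass a dummy input through the conv blocks and print the shape just before flattening. A minimal check, assuming the (1, 64, 44) mel-spectrogram input from the series and the attribute names used in the sketch above the comments:

import torch

cnn = CNNNetwork()                 # the network sketched above the comments
x = torch.rand(1, 1, 64, 44)       # (batch, channel, n_mels, time_frames)

for block in (cnn.conv1, cnn.conv2, cnn.conv3, cnn.conv4):
    x = block(x)
    print(x.shape)

# The final feature map is (1, 128, 5, 4): 128 channels with a 5 x 4 spatial
# grid left after four conv + max-pool blocks, hence nn.Linear(128 * 5 * 4, 10).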

ericdemattos

torchsummary is now torchinfo, if I'm not mistaken!
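
If so, the usage is essentially the same; a minimal sketch with torchinfo (the (1, 1, 64, 44) input size is an assumption matching the mel-spectrogram shape above, and it includes the batch dimension):

# pip install torchinfo
from torchinfo import summary

cnn = CNNNetwork()                        # the network sketched above the comments
summary(cnn, input_size=(1, 1, 64, 44))   # prints a layer-by-layer summary table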

michk

At 8:28, in 128 * 5 * 4, where did the 5 and 4 come from?

peterkanini

Thank you for these videos! I follow along step by step, but I get an error in the train_single_epoch function because the targets are not tensor objects but a tuple. Why?

11 def train_single_epoch(model, data_loader, loss_fn, optimiser, device):
12 for inputs, targets in data_loader:
---> 13 inputs, targets = inputs.to(device), targets.to(device)
AttributeError: 'tuple' object has no attribute 'to'
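
One common cause is that the Dataset's __getitem__ returns the label in a form the default collate function cannot stack into a tensor (for example a string, or a tuple left by a stray trailing comma), so the batch of targets stays a plain tuple. A minimal sketch of the fix with a stand-in dataset (DemoDataset and its shapes are illustrative):

import torch
from torch.utils.data import Dataset, DataLoader


class DemoDataset(Dataset):
    """Stand-in dataset; the real one would load audio and annotations."""

    def __len__(self):
        return 4

    def __getitem__(self, index):
        signal = torch.rand(1, 64, 44)
        # Return the label as an int (not a string such as "dog_bark") so the
        # default collate function can stack the targets into a tensor.
        label = index % 2
        return signal, label


loader = DataLoader(DemoDataset(), batch_size=2)
for inputs, targets in loader:
    print(type(targets))   # <class 'torch.Tensor'>, so targets.to(device) works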

saragiovannini

Sir, please consider covering multichannel raw files (like ULA & UCA microphone array recordings) and processing them with CNNs.

amruthgadag

Thanks, sir, for this great video. Just one question: why didn't you normalize the mel-frequency images before feeding them to VGGNet? As far as I know, the values should be between 0 and 1.
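
If per-example min-max scaling is wanted, it can be added as a small step after the mel-spectrogram transform; a minimal sketch (the function name and the dummy dB-scaled values are illustrative):

import torch


def min_max_normalise(spec, eps=1e-8):
    """Scale a (mel) spectrogram tensor to the [0, 1] range, per example."""
    return (spec - spec.min()) / (spec.max() - spec.min() + eps)


spec = torch.rand(1, 64, 44) * 80 - 80   # dummy dB-scaled mel spectrogram
normalised = min_max_normalise(spec)
print(normalised.min().item(), normalised.max().item())   # ~0.0 and 1.0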

EngRiadAlmadani

I have checked everything twice and my model is running on CUDA, but the accuracy is zero from the beginning. Can someone help me?

jaypadia

CNNs are typically used for images. Why are we using CNNs for audio and how does that work?
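
The trick is that the waveform is first converted into a mel spectrogram, a 2D frequency-vs-time array that the CNN treats like a one-channel image. A minimal sketch of that conversion (the parameter values are assumptions matching the series):

import torch
import torchaudio

# A mel spectrogram turns a 1D waveform into a 2D (n_mels x time_frames) array,
# which a Conv2d-based network can process like a grayscale image.
mel_spectrogram = torchaudio.transforms.MelSpectrogram(
    sample_rate=22050,
    n_fft=1024,
    hop_length=512,
    n_mels=64,
)

waveform = torch.rand(1, 22050)   # dummy mono signal, one second at 22.05 kHz
spec = mel_spectrogram(waveform)
print(spec.shape)                 # torch.Size([1, 64, 44]): channel, mels, frames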

Dygit

I got an error:

RuntimeError: stft input and window must be on the same device but got self on cpu and window on cuda:0

How can I solve that?
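
That error means the waveform and the MelSpectrogram transform (whose internal STFT window follows the transform's device) are on different devices; moving both to the same device before applying the transform resolves it. A minimal sketch (parameter values and names are illustrative):

import torch
import torchaudio

device = "cuda" if torch.cuda.is_available() else "cpu"

mel_spectrogram = torchaudio.transforms.MelSpectrogram(
    sample_rate=22050, n_fft=1024, hop_length=512, n_mels=64
).to(device)                       # the transform's window buffer now lives on `device`

waveform = torch.rand(1, 22050)    # e.g. loaded on CPU by torchaudio.load
waveform = waveform.to(device)     # move the signal to the same device as the transform

spec = mel_spectrogram(waveform)   # no device mismatch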

cemayar

Hello, and thanks for your channel. I would like to know if I can use TSFRESH to extract features from sound files. My problem is that I do not know how to do it: I have .OGG files (not .WAV files) at my disposal. Librosa can read them without a problem, and I managed to extract features with it, but I only get 60% precision in sound recognition with Random Forests and 55% with an ANN built from scratch. I was told that TSFRESH can extract hundreds of features from a time series, and that is true, but I would like to know how to make it work with my sound files in .OGG format.
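
A possible route is to load the .OGG files with Librosa and reshape each signal into the long-format DataFrame that TSFRESH expects; the sketch below is only an outline (the file name, the aggressive resampling to keep the series short, and the column names are all assumptions):

import librosa
import pandas as pd
from tsfresh import extract_features

# Librosa reads .OGG directly; resampling to a low rate keeps the time series
# short enough for TSFRESH, at the cost of discarding high-frequency content.
signal, sr = librosa.load("example.ogg", sr=2000, mono=True)

# TSFRESH expects a long-format DataFrame with an id column and a sort column.
df = pd.DataFrame({
    "id": 0,                      # one id per audio clip
    "time": range(len(signal)),
    "value": signal,
})

features = extract_features(df, column_id="id", column_sort="time")
print(features.shape)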

alchimiste

Sir, can you please upload an audio classification tutorial using PyTorch in Google Colab?

saleemjamali

Could you cite other architectures considered "better" for audio classification? Are they always based on image-style processing (i.e., conv layers on mel spectrograms)?

luigibcdefg

Thanks so much for the videos! They're super useful. I was wondering if there is any significant difference between implementing a VGG architecture in PyTorch vs TensorFlow/Keras? I'm relatively new to machine learning, so hopefully my question makes sense.

seewai

Is there a reason why you decided to use a Conv2d over a Conv1d, or is it a matter of preference?
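
The choice follows the input representation: Conv2d slides its kernel over both the frequency and time axes of the mel spectrogram, while Conv1d slides only along time (e.g. over the raw waveform). A minimal sketch of the contrast (shapes are illustrative):

import torch
from torch import nn

spec = torch.rand(1, 1, 64, 44)   # (batch, channel, n_mels, time) mel spectrogram
wave = torch.rand(1, 1, 22050)    # (batch, channel, samples) raw waveform

# Conv2d convolves over frequency and time together.
conv2d = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=3, padding=1)
print(conv2d(spec).shape)         # torch.Size([1, 16, 64, 44])

# Conv1d convolves along the time axis only.
conv1d = nn.Conv1d(in_channels=1, out_channels=16, kernel_size=9, padding=4)
print(conv1d(wave).shape)         # torch.Size([1, 16, 22050])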

stevenhoang

Great!
Can you implement a Continuous Speech Recognition app? I'm having a hard time doing it :D

rog

I love your videos. Are there any good tutorials you recommend on building CNNs in Python?

bobmeyers