Pre-processing Audio for Deep Learning on GPU

Learn how to preprocess audio data directly on the GPU using PyTorch and torchaudio.

Code:

Content:
0:00 Intro
0:33 Selecting a device
1:36 Updating constructor with device
2:07 Registering transformation + tensor with device
4:05 Running the script
5:01 What's up next?
Comments

Hello, there is a bug in your implementation: you must move your resampler object to the device too. Otherwise you will get a runtime error when resampling :)

Thanks for the video mate.

ricardoguevara

Thank you so much for the video. Could you provide a data preprocessing example for beat tracking with a custom dataset?
Input - A song
Ground truth - Librosa beats

Sutirtha

Thanks a lot, Valerio, for the amazing series.
The problem can be solved as:
resampler = torchaudio.transforms.Resample(sr,

egermenful

def _resample_if_necessary(self, signal, sr):
    if sr != self.target_sample_rate:
        if device == "cuda":
            resampler = torchaudio.transforms.Resample(sr,
        else:
            resampler = torchaudio.transforms.Resample(sr, self.target_sample_rate)
        signal = resampler(signal)
    return signal

Fixes the mismatch error for me in Python 3.9

bobdoncom

How can I tell whether I have a GPU capable of running deep learning tasks? I have an AMD Radeon GPU. Is that not suitable for deep learning?
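
One way to check is to ask PyTorch directly. The sketch below assumes a standard PyTorch install; `describe_accelerator` is a hypothetical helper name. NVIDIA GPUs are reached through CUDA, while PyTorch's ROCm builds for supported AMD GPUs expose the same `torch.cuda` interface, so whether a given Radeon card works depends on ROCm support for that model.

```python
import torch


def describe_accelerator():
    """Report which GPU backend, if any, this PyTorch build can use."""
    if torch.cuda.is_available():
        name = torch.cuda.get_device_name(0)
        # torch.version.hip is set on ROCm (AMD) builds and None on CUDA builds.
        backend = "ROCm" if getattr(torch.version, "hip", None) else "CUDA"
        return f"{backend} device available: {name}"
    return "No CUDA/ROCm device visible to PyTorch; using the CPU"


print(describe_accelerator())
```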

BruinChang

Hey Valerio! Thanks a lot for this video series. I get an error when I run this program. The error occurs in the Resampler function. RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
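
This error generally means the input tensor and a module's internal tensors sit on different devices. A small diagnostic sketch, using a plain `Conv1d` as a stand-in for the transform; `devices_in_use` is a hypothetical helper, not part of PyTorch.

```python
import torch


def devices_in_use(module):
    """Return the device types that hold the module's parameters and buffers."""
    tensors = list(module.parameters()) + list(module.buffers())
    return {t.device.type for t in tensors}


# Stand-in module; a torchaudio transform can be inspected the same way.
conv = torch.nn.Conv1d(1, 1, kernel_size=3)
signal = torch.randn(1, 1, 16)

print("module on:", devices_in_use(conv), "| input on:", signal.device.type)

# The call is safe only when both sides agree on the device.
if devices_in_use(conv) == {signal.device.type}:
    output = conv(signal)
```

If the two printed devices differ, calling `.to(device)` on the module brings them in line.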

vaidunakash

You are running well, but how can I modify it

黄传宝-fx

Thank you so much for all your hard work! I'm learning a lot watching these videos <3

Bloom_HD

Great video! Can we get a tutorial on speech recognition using PyTorch and torchaudio? It would be really great!

rog

Thank you very much!! You really helped me!! But I am having a problem: I can't convert the mel spectrogram back to a signal and then save it as audio. I am developing a basic audio GAN and I want to listen to the result. I checked whether torchaudio.transforms has something to reverse the transformation, but I found nothing. I already tried using librosa:

"""
for song, sample_rate in train_loader:
    librosa.feature.inverse.mel_to_stft(song[0][0].numpy(), n_fft=1024)
    song = torch.from_numpy(song)
    song = song.reshape(1, -1)

    torchaudio.save('./data/test_audio.wav', song, SAMPLE_RATE)
"""

But it didn't work :(
Is it right to use mel spectrograms in a GAN model?
Is there a way to convert a mel spectrogram back to a signal (waveform)?
Could you help me, please? :(
Sorry for this long question

SamtapesGamer

Hi Valerio, thank you for your video.
When I use CUDA the same way you did, I get this error. How should I solve it?
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

alinsr

Thank you so much for your videos! Unfortunately, I found something interesting: it took me nearly 300 seconds to process the whole dataset with my CPU (i7-8550U) but only 150 seconds with my GPU (MX130), which means the speed of my CPU is twice that of my GPU. Hahahaha, maybe I should consider buying a new laptop that has a more powerful GPU, like an RTX 3060...

ramboking

Great stuff! Are you planning to do a speech recognition course in PyTorch?!

theartificialguy

First, thanks for such an awesome playlist. But I don't think moving to the GPU at this point is the right decision.
Data should be loaded by the CPU. We can use the PyTorch DataLoader and set num_workers, which tells the CPU how many worker processes should load data in parallel. This takes the burden off the GPU, which is then used only for model training, not for data loading. Loading with the CPU also ensures that the GPU doesn't have to wait for data: the CPU loads data in advance for the GPU to train on.
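
The approach this comment describes can be sketched as follows. `CpuSignalDataset`, `make_loader` and `iterate_on_device` are hypothetical names, and the random tensor stands in for real loading and preprocessing.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class CpuSignalDataset(Dataset):
    """Hypothetical dataset whose items stay on the CPU."""

    def __init__(self, n_items=8, num_samples=22050):
        self.n_items = n_items
        self.num_samples = num_samples

    def __len__(self):
        return self.n_items

    def __getitem__(self, index):
        # Stand-in for loading a file and preprocessing it on the CPU.
        signal = torch.randn(1, self.num_samples)
        label = index % 2
        return signal, label


def make_loader(dataset, batch_size=4, num_workers=2):
    """Build a DataLoader that prepares batches in `num_workers` worker processes."""
    return DataLoader(dataset, batch_size=batch_size, num_workers=num_workers)


def iterate_on_device(loader, device):
    """Yield batches, moving each one to `device` only when it is about to be used."""
    for signals, labels in loader:
        yield signals.to(device), labels.to(device)
```

A training loop would then iterate over `iterate_on_device(make_loader(CpuSignalDataset()), device)`, so the only GPU work in the data path is the final copy of each batch.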

tictac

Hi Valerio, thanks for the great video, quite helpful. I want to check whether the resampler should also be assigned to the device for consistency? If I don't do so, an error occurs.

yitongjin

Hey VV, are you thinking about creating a series on transformer networks in the near future?

riteshpudasaini

Hi Valerio! Thanks for this great video series :) I have a question regarding this: what is the difference between pinning the memory in the DataLoader class instance and assigning the data to a specific device inside the Dataset class? Thanks!
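
A short sketch of the pinned-memory option, for comparison. With `pin_memory=True` the batches stay on the CPU, in page-locked memory, and the copy to the GPU happens later through an explicit `.to(device)` call that can be asynchronous with `non_blocking=True`. Assigning to a device inside the Dataset instead creates each item on the GPU as soon as it is loaded. The tensor shapes and the `TensorDataset` stand-in are assumptions for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

device = "cuda" if torch.cuda.is_available() else "cpu"

# Hypothetical data: 8 mono signals of one second each, with dummy labels.
dataset = TensorDataset(torch.randn(8, 1, 22050), torch.zeros(8, dtype=torch.long))

# Pinning only matters when a GPU is present, so enable it conditionally.
loader = DataLoader(dataset, batch_size=4, pin_memory=(device == "cuda"))

for signals, labels in loader:
    # Batches arrive on the CPU; non_blocking lets the copy overlap with compute
    # when the source tensor is pinned.
    signals = signals.to(device, non_blocking=True)
    labels = labels.to(device, non_blocking=True)

print(signals.shape, signals.device)
```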

antonelse

Hi, you're very talented. It would mean a lot to me if you could filter this scene so I can hear everything the man in the security booth says, which is currently unintelligible on the megaphone and the radio, as well as the people in the background who shout things.

bonnienasiry