Loading Data - Deep Learning for Audio Classification p.2

The data can be found at this GitHub repo:

Kaggle competition the data is taken from:

Comments

I have reworked this series, and the full code is on GitHub. If you want to follow along with these videos, you still can. After cloning the repo:

git checkout

This reverts to the most recent commit before I overhauled everything.

seth

Because the WAV files are in their corresponding label folders, I think the code needs to be adjusted as follows:

from scipy.io import wavfile  # provides wavfile.read

df.set_index('filename', inplace=True)

for f in df.index:
    label = df.at[f, 'label']  # Get the label for the file
    filepath = f'wavfiles/{label}/{f}'  # Construct the file path
    rate, signal = wavfile.read(filepath)  # Read the WAV file
    df.at[f, 'length'] = signal.shape[0] / rate  # Length in seconds

shafagh_projects

One million thanks for creating such interesting and well-explained content!!!

wuzark

The low-frequency bass sound in the background of this video is really distracting; maybe apply a high-pass filter to your audio. The issue happens when you are typing. Great tutorial so far, though.

droy

You can call librosa.load(file, sr=None) to keep the file's original sampling rate instead of resampling.

Niculwmusic

Thank you. Another great video and really practical with the files and all. 👏🏻🏅🔥

KordTaylor

In the loop over the classes (line 70):
instead of using librosa to get the signals, can't we use wavfile.read()?
I notice there is a difference, but what would change if,
instead of: librosa.load('wavfiles/'+wav_file, sr=44100)
we used: np.array([x/rate for x in
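
A hedged note on the question above: as documented, scipy.io.wavfile.read returns the samples of a 16-bit PCM file as raw int16 values at the file's own rate, whereas librosa.load returns floating-point samples scaled to about [-1, 1] and resampled to the requested sr. Dividing by the rate would not reproduce that scaling; dividing by 32768 (the int16 range) is closer. The sketch below is only an illustration: it uses the standard-library wave module as a stand-in for both libraries, and the file name, sample values, and helper names are made up.

```python
import array
import os
import sys
import tempfile
import wave


def read_int16_mono(path):
    """Read a 16-bit mono PCM WAV file; return (rate, list of raw int samples)."""
    with wave.open(path, "rb") as wf:
        assert wf.getsampwidth() == 2 and wf.getnchannels() == 1
        rate = wf.getframerate()
        samples = array.array("h")
        samples.frombytes(wf.readframes(wf.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    return rate, list(samples)


def to_float(samples):
    """Scale int16 samples to floats in [-1.0, 1.0)."""
    return [s / 32768.0 for s in samples]


# Write a tiny hypothetical file so the example is self-contained.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
raw = array.array("h", [0, 16384, -16384, 32767])
if sys.byteorder == "big":
    raw.byteswap()
with wave.open(demo_path, "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)
    wf.setframerate(44100)
    wf.writeframes(raw.tobytes())

rate, ints = read_int16_mono(demo_path)   # integer samples, native rate
floats = to_float(ints)                   # float samples in [-1, 1)
```

The integer and float versions carry the same waveform; the practical differences are the value range, the dtype, and whether the signal has been resampled.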

jugsma

thank you, awesome explanation <3
I am your fan now!!!

jeremynx

When I try to load my dataset with librosa, I get the error that I can only concatenate str (not "numpy.int64") to str. I've checked that wav_file is defined as a string, but I don't understand how the folder name wouldn't be a string. Any ideas? I have a folder in .venv called "All WAV files". Here is my code:

signal, rate = librosa.load('All WAV files/'+wav_file, sr=1000)
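
The error message points at the right-hand operand, so the likely culprit (an assumption) is that wav_file is an integer such as a numeric DataFrame index value, not the folder name. A minimal sketch of the failure and one possible fix; a plain int stands in for numpy.int64, and the value 7 is hypothetical.

```python
import os

folder = "All WAV files"
wav_file = 7  # stand-in for a numpy.int64 index value (assumption)

try:
    path = folder + "/" + wav_file  # str + int raises TypeError
except TypeError as exc:
    error_message = str(exc)

# Converting explicitly to str avoids the error; os.path.join adds the separator.
path = os.path.join(folder, str(wav_file))
```

If the index really is numeric, it may also be worth checking that the DataFrame index holds the filenames at all.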

MathStatsMe

Hey, great video! Where can I find the .csv file?

lucasgrassoramos

Hi, I'm having an issue with this line: "signal, rate = files\audio_train/' +wav_file, sr=44100 )", and this is the error that pops up when I execute it:
RuntimeError Traceback (most recent call last)
in load(path, sr, mono, offset, duration, dtype, res_type)
128 try:
--> 129 with sf.SoundFile(path) as sf_desc:
130 sr_native = sf_desc.samplerate

in __init__(self, file, mode, samplerate, channels, subtype, endian, format, closefd)
628 format, subtype, endian)
--> 629 self._file = self._open(file, mode_int, closefd)
630 if set(mode).issuperset('r+') and self.seekable():

in _open(self, file, mode_int, closefd)
1183 _error_check(_snd.sf_error(file_ptr),
-> 1184 "Error opening {0!r}: ".format(self.name))
1185 if mode_int == _snd.SFM_WRITE:

in _error_check(err, prefix)
1356 err_str = _snd.sf_error_number(err)
-> 1357 raise RuntimeError(prefix + _ffi.string(err_str).decode('utf-8', 'replace'))
1358

RuntimeError: Error opening files\\audio_train/Writing': System error.

During handling of the above exception, another exception occurred:

FileNotFoundError Traceback (most recent call last)
in <module>
----> 1 signal, rate = files\audio_train/' +wav_file, sr=44100 )

in load(path, sr, mono, offset, duration, dtype, res_type)
145 if isinstance(path, six.string_types):
146 warnings.warn('PySoundFile failed. Trying audioread instead.')
--> 147 y, sr_native = __audioread_load(path, offset, duration, dtype)
148 else:
149 six.reraise(*sys.exc_info())

in __audioread_load(path, offset, duration, dtype)
169
170 y = []
--> 171 with audioread.audio_open(path) as input_file:
172 sr_native = input_file.samplerate
173 n_channels = input_file.channels

in audio_open(path, backends)
109 for BackendClass in backends:
110 try:
--> 111 return BackendClass(path)
112 except DecodeError:
113 pass

in __init__(self, filename)
60 """
61 def __init__(self, filename):
---> 62 self._fh = open(filename, 'rb')
63
64 try:

FileNotFoundError: [Errno 2] No such file or directory: files\\audio_train/Writing'

please help
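
The final FileNotFoundError shows a path ending in "Writing" with no .wav extension, which suggests (an assumption, not something the traceback proves) that wav_file holds a label rather than a filename. A small sketch of validating the path before handing it to a loader; resolve_wav and the demo file names are hypothetical.

```python
import tempfile
from pathlib import Path


def resolve_wav(directory, name):
    """Return the path to `name` inside `directory`, or raise a descriptive error."""
    path = Path(directory) / name  # Path picks the right separator for the OS
    if path.suffix.lower() != ".wav":
        raise ValueError(f"{name!r} does not look like a WAV filename")
    if not path.is_file():
        raise FileNotFoundError(f"No such file: {path}")
    return path


# Demo with a throwaway directory containing one empty placeholder file.
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "a.wav").write_bytes(b"")

found = resolve_wav(demo_dir, "a.wav")
try:
    resolve_wav(demo_dir, "Writing")
except ValueError as exc:
    problem = str(exc)
```

Building the path with pathlib also avoids mixing backslashes and forward slashes in one string.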

bouchrabouchra

When I run your code that reads the length of each audio file, I get NaN values for several labels. I think something is wrong. Please help.

jeremynx

Hello, do you know how I can solve the problem that my audio data isn't 16-bit?

oscarzuniga

Hey, can you suggest audio filtering techniques?
I have two files: one contains only noise,
the other contains noise plus signal.
How can I remove the noise?

rekhapandey

Hi Seth, loving this series, thumbs up. I have a question as a newbie: my dataset consists of about 500 audio files in 3 categories, each 17 seconds long. Is a length of 17 seconds normal for creating a spectrogram or MFCCs? All files contain the full audio from beginning to end, and I can't perform any cleaning operation on them. Thanks in advance.

SHADABALAM

Hello!
Thank you for your videos, they're really helpful!

Could you please explain why you don't use the librosa library functions to calculate mel_spectr? Are they not optimized, or is there another reason? I don't know what to choose for my project.

zhannafedorova

Thank you so much, I'm using this for my science project, and would like to ask if this tutorial would be effective to detect notes?

upsbmussy

Man, I've really liked your video, thank you very much!

lmor

Hey Seth, I'm a bit of a beginner. Loving these videos, well done. I'm having an issue at df.set_index('fname', inplace=True). This error pops up:
raise KeyError("None of {} are in the columns".format(missing))
KeyError: "None of ['fname'] are in the columns"
I can't seem to find a solution. Kindly help me out.
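
This KeyError means the DataFrame has no column named 'fname', so a first step is to look at the actual column names (for a DataFrame, df.columns lists them). The sketch below does the same check on the CSV header with the standard-library csv module; the demo file contents and the candidate names are hypothetical, although another comment in this thread does index by 'filename'.

```python
import csv
import io


def header_columns(csv_text):
    """Return the column names from the first row of CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    return next(reader)


def pick_index_column(columns, candidates=("fname", "filename")):
    """Return the first candidate name present in `columns`, or None."""
    for name in candidates:
        if name in columns:
            return name
    return None


demo_csv = "filename,label\nabc.wav,Flute\n"  # hypothetical file contents
columns = header_columns(demo_csv)
index_column = pick_index_column(columns)
```

Whatever name the header actually shows is the one to pass to set_index.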

renarrow

Hello, I want to ask where I can get the instrument.csv file. I already checked your latest video and GitHub, and there is no such file. Thank you.

ahmaddaffa