How to Extract Spectrograms from Audio with Python

Learn how to extract spectrograms from an audio file with Python and Librosa using the Short-Time Fourier Transform. Learn about different types of spectrograms and compare the spectrograms of music in different genres.

Comments

You are going to put so many lecturers, top universities included, out of a job!

angdang

Such a comprehensive yet easy-to-understand series, hats off.

nmirza

The best explanation on the whole internet! Thanks, man.

faramirchevlonski

THANKS BRO, THIS HELPED WITH MY ASSIGNMENT

ryandaputra

Extremely clear explanation!
Thanks a lot!

ourissueanniversary

This helps so much with my final project idea, thank you!

saucyyy

Thank you so much for your fascinating course.
At about 7:33, when you explain how to get #frames, which is 342 here, I cannot reproduce that number with the formula from the last video:
#frames = ((#samples - frame size) / hop size) + 1 = ((174943 - 2048) / 512) + 1 = 338.68, not 342.
Can you please clarify this for me?
Thank you
Zahra

zahraroozbehi
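
One likely explanation for the frame-count question above, assuming the video calls `librosa.stft` with its default `center=True`: the signal is padded by half a frame on each side before framing, so the count becomes 1 + floor(#samples / hop size) instead of the unpadded formula. A minimal sketch of both counts with the numbers from the comment (the function names are illustrative, not from the video):

```python
def frames_without_padding(n_samples, frame_size, hop_size):
    # Unpadded framing: count only frames that fit entirely inside the signal.
    return 1 + (n_samples - frame_size) // hop_size


def frames_with_centering(n_samples, frame_size, hop_size):
    # Centered framing pads frame_size // 2 samples on each side,
    # so the padded signal is n_samples + 2 * (frame_size // 2) long.
    padded = n_samples + 2 * (frame_size // 2)
    return 1 + (padded - frame_size) // hop_size


print(frames_without_padding(174943, 2048, 512))  # 338
print(frames_with_centering(174943, 2048, 512))   # 342
```

The unpadded count is the integer part of the 338.68 computed in the comment; the centered count matches the 342 seen in the video.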

I notice that in the librosa docs they don't square the magnitude but then use amplitude to dB. Does anyone know if this makes any difference to the final results? Guess I'll have to try to understand everything properly later! Great video!

sharonmcgavin
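
On the squaring question above: as far as the decibel values go, it should make no difference, because 10 * log10(|S|^2) equals 20 * log10(|S|). A minimal standard-library sketch of that identity follows. `power_db` and `amplitude_db` are hand-written stand-ins, not librosa's functions, which additionally apply a floor (`amin`) and a dynamic-range clip (`top_db`).

```python
import math


def power_db(power, ref=1.0):
    # Decibels from a power value: 10 * log10(power / ref).
    return 10.0 * math.log10(power / ref)


def amplitude_db(amplitude, ref=1.0):
    # Decibels from a magnitude: 20 * log10(amplitude / ref).
    return 20.0 * math.log10(amplitude / ref)


for magnitude in [0.001, 0.5, 1.0, 3.0]:
    via_power = power_db(magnitude ** 2)      # square first, then power-to-dB
    via_amplitude = amplitude_db(magnitude)   # amplitude-to-dB directly
    assert math.isclose(via_power, via_amplitude, abs_tol=1e-9)
```

So squaring and then converting power to dB, or converting the magnitude to dB directly, are two routes to the same log-scaled spectrogram, up to the clipping details mentioned above.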

Is there any way to recreate the audio after we get the log-amplitude spectrogram? Of course, first we would convert dB back to power, for which there is a function in Librosa, but what then? How do we invert the "np.abs(S_scale) ** 2" part back to audio?

rekreator

Can you please explain the similarity matrix, if possible with Python?

Pianistprogrammer

Hi, great video again. Could anyone explain to me why we square after the np.abs(Y)? Leaving it out doesn't change the result much since we'll use logarithmic scales, but is it correct to actually use it?

Underscore_

Thanks for your video, very helpful and well explained.

lahcenekabour

Why does the time axis of the spectrogram span only a few seconds when the raw audio is a few hours long?

enkaibi

Why doesn't the Debussy sound file look like a copy from the center?

YashGanodiya

Hi, how do we do this if we have 150 audio files?

shubhamkapoor

Quick questions: what is the purpose of doing Y_scale = S_scale ** 2? Why 2 and not a different number? What effect does this power parameter have on the generated spectrogram?

cloudhuang

Sir, is it also possible to save such a spectrogram to an image file?

Trivimania

Hi Valerio, in the above spectrograms there is always a strong and constant low frequency component. What does it depend on? Is it relevant or is it just an artefact? Thank you

riobale

Hey Valerio, I tried this on an audio file of my own but I get an error: it tells me that "figsize" is not defined... Can you please help? :)

habibrekik

Was there a reason why you changed from using a continuous colour scale in the first (non-log) plot to a diverging scale for the log plot?

desryan