Build a Custom ASR Model in TensorFlow: A Step-by-Step Tutorial

Learn the basics of speech recognition with TensorFlow and build practical applications with this tutorial. Discover the history of speech recognition and the challenges that come with dealing with human speech variability, similar-sounding words, and low-quality audio. Explore the various techniques used in speech recognition, such as machine learning algorithms like deep learning, Hidden Markov Models (HMM), Dynamic Time Warping (DTW), and phonetic-based approaches. Discover how transformers have transformed the field of speech recognition and how they can be used to recognize different languages, understand natural language, and distinguish between similar words. Follow along with the tutorial to build a basic speech recognition model using TensorFlow, combining a 2D convolutional neural network (CNN), recurrent neural network (RNN), and Connectionist Temporal Classification (CTC), and apply this knowledge to develop practical applications.

#machinelearning #python #tensorflow #opencv #ASR
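
The description above names CTC as the final stage of the CNN + RNN pipeline. The tutorial's own code is not reproduced on this page, so the following is only a minimal sketch of how greedy CTC decoding typically turns per-frame network outputs into text: take the best class for each frame, collapse consecutive repeats, then drop the blank token. The `VOCAB` string, the choice of the last index as the blank, and the `one_hot` helper are assumptions made for illustration, not the tutorial's actual configuration.

```python
from itertools import groupby

# Hypothetical character set for illustration only.
VOCAB = "abcdefghijklmnopqrstuvwxyz' "
# Assumption: the CTC blank token takes the index right after the last character.
BLANK = len(VOCAB)


def ctc_greedy_decode(frame_scores):
    """Decode a list of per-frame score lists (each of length len(VOCAB) + 1)."""
    # 1. Pick the highest-scoring class for every time frame.
    best = [max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores]
    # 2. Collapse runs of the same class into a single symbol.
    collapsed = [index for index, _ in groupby(best)]
    # 3. Remove blanks; what remains is the transcript.
    return "".join(VOCAB[index] for index in collapsed if index != BLANK)


def one_hot(index, size=len(VOCAB) + 1):
    """Build a fake per-frame score vector that strongly prefers one class."""
    row = [0.0] * size
    row[index] = 1.0
    return row


# Simulated model output for the frames: c c <blank> a a <blank> t
frames = [one_hot(i) for i in [2, 2, BLANK, 0, 0, BLANK, 19]]
decoded = ctc_greedy_decode(frames)
print(decoded)  # cat
```

Because repeats are collapsed before blanks are removed, a genuine double letter only survives when the model emits a blank between the two occurrences; this is why the blank class is part of the output layer in the first place.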
Comments
Author

A good presentation. Thank you for providing this information.

vkrts
Author

Did you use Mel-frequency cepstral coefficients (MFCCs) for feature extraction?
If not, which feature extraction method did you use?

mariamjbani-amer
Author

Thank you so much. Can you also make a video on the TCN model? I am struggling to get results with a TCN.

shrijanregmi
Author

This is so nice. Thank you very much for sharing your knowledge.

omochi
Author

Thank you for your efforts. After training and saving the model, how do I use it to transcribe other audio, not the files I trained on that are listed in the CSV file? Please tell me. Also, how can I tell from the training curves whether training went well?

space_x
Author

That's great, thanks for sharing.
After creating the model, can we use it with OpenAI Whisper?

tringuyen-ivyf
Author

Thank you for the nice tutorial. I think you did it with CTC, which is a sequence-to-sequence approach. I want to do the same project with my own dataset using the Listen, Attend and Spell model, and I could not find any tutorial on that. Can you help me with how to implement it?

GelanaAbdisa
Author

Can I use this to build a model for the Arabic language?

pesworld
Author

Thanks. Fantastic work. Can I run it on my own CPU-only computer?

mustafaaa
Author

Will there be a PyTorch version of this tutorial? It would be great. Thanks for such a helpful video.

kishanbangsi
Author

Why did you choose 1000 as the number of epochs?

mariamjbani-amer
Author

Nice explanation, but could you please add a way for users to have their own voice recognized by repeating the dataset sentences?

navyaanzaheen
Author

When I run your code, I do not get a model.onnx file in the output model folder,
and when I test the .h model I get an error message saying "model, onnx not found".
Can you help me?

mariamjbani-amer
Author

I am looking for resources to learn ASR but could not find good ones. Could you please share some ASR resources with me? Thank you!

ishanpanta
Author

Could you please make a video on a text-to-speech project?

yashkewlani
Author

Can you provide your pretrained model for us to use, since we cannot train on a CPU?

RoshanRawat-gv
Author

Why don't you add microphone input to your model? Just wondering.

melapobia