Audio to Text Converter in Python Tutorial with OpenAI Whisper from Hugging Face Pipeline

In this applied machine learning tutorial in Python, we will learn how to use OpenAI Whisper through the Hugging Face Transformers pipeline for state-of-the-art audio-to-text transcription.

In just 3 lines of Python code, we'll build an audio-to-text converter. It uses OpenAI Whisper for automatic transcription of audio to text.
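
The three lines mentioned in the description can be sketched roughly as follows. This is a minimal sketch, not necessarily the exact code from the video: the checkpoint name `openai/whisper-small`, the helper name `transcribe`, and the file name `sample.wav` are assumptions for illustration. Passing a file path to the pipeline relies on ffmpeg being installed to decode the audio.

```python
def transcribe(audio_path, model_name="openai/whisper-small"):
    """Transcribe one audio file with a Whisper checkpoint (hypothetical helper)."""
    # Imported inside the function so this file loads even before
    # transformers is installed (pip install transformers torch).
    from transformers import pipeline

    # Build a speech-recognition pipeline; the checkpoint name is an assumption.
    asr = pipeline("automatic-speech-recognition", model=model_name)
    # The pipeline returns a dict whose "text" key holds the transcript.
    return asr(audio_path)["text"]
```

Calling `print(transcribe("sample.wav"))` should work with a file on the local machine as well. The first call downloads the checkpoint into the local Hugging Face cache, so later runs can reuse it.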

Comments

Thank you for taking the time to make this. A really good explanation and tutorial.

timwebster

Thanks bro, great job as always. Inspired by your passion to stay up to date.

abhishekchintagunta

Favorite ML channel and amazing English, thank you!

fractalarbitrage

Can you please tell me what two things it downloads each time I try to run my code on my local system?
I have already installed the library, but I am still confused about what it downloads. I want to avoid downloading every time so the run is faster.

sanskari_nomad

Hi, I just want to ask: is it possible to have speaker identification and channel identification with Whisper?

enggm.alimirzashortclipswh

Bro, I want to give input through my microphone instead of an existing audio file. How can I do it?

maheshkumar

Can Hugging Face be used without adding card information? Are there any models on Hugging Face that can be used for free? Are there endpoints that can be used directly (without Python) in no-code platforms?

salemmohammad

I tried the Hugging Face pipeline for ASR using Whisper, but it did not convert the entire audio to text. The video is about 10 minutes long. But when I used the whisper package directly, it could transcribe the entire audio. I also tried setting max_tokens/max_length to 2000, but it did not work. Could you please advise on this?

nayakdonkey
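
Regarding the truncated transcript described above: Whisper works on windows of about 30 seconds, and the Transformers speech-recognition pipeline has a `chunk_length_s` argument that splits longer audio into chunks and stitches the results together. A minimal sketch, assuming that is the cause here; the helper name `transcribe_long` and the checkpoint name are illustrative assumptions, and behaviour may differ between transformers versions.

```python
def transcribe_long(audio_path, model_name="openai/whisper-small", chunk_length_s=30):
    """Transcribe audio longer than 30 seconds by chunking (hypothetical helper)."""
    from transformers import pipeline  # lazy import; needs transformers installed

    # chunk_length_s asks the pipeline to process the file in 30-second pieces
    # instead of stopping after the first window.
    asr = pipeline(
        "automatic-speech-recognition",
        model=model_name,  # assumed checkpoint, swap in the one you use
        chunk_length_s=chunk_length_s,
    )
    return asr(audio_path)["text"]
```

Raising `max_length` alone likely does not help, because the limit in this case comes from the audio window rather than from the number of generated tokens.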

I just learned about Colab and was thinking of building an app for my own convenience, so I can use it on my phone. Is there a way to have this as an API, with OAuth of sorts? Or do I have to run this on my own hardware? This is just for personal use, so very little traffic is expected. If I need to run it on my own infrastructure, do you have an idea what kind of hardware is needed?

Thanks in advance!

nemtii_

Hey bro, can you please tell me how I can do real-time voice translation using Speech_recognition?

ylibizw

Can I make a screenplay format out of this?

gentle-wellness-guidance

Did you figure out a way to work with other languages more accurately? Bengali, Hindi, and Punjabi have a lot of errors. Did you use custom models for those, or did you find a way to custom-train it?

anubhavs

Wasn't there a speech-recognition company a few years back, Dragon NaturallySpeaking? I wonder what will happen to their business model now...

contrarian

Is it possible to load a file from my local machine, instead of downloading it from the web?

novaegregora

Can you do the same task for Telugu speech to English text?

thangellakumar

Can you please find the best text-to-speech model? One that's actually good seems to be even harder to find.

encapsulatio

AssertionError: Torch not compiled with CUDA enabled

blagodaren
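
The error quoted above is typically raised when code asks PyTorch for a GPU (for example `device=0` or `.to("cuda")`) while the installed PyTorch build is CPU-only. One hedged way around it is to pick the device at runtime. The helper names `pick_device` and `build_asr` are hypothetical, and the checkpoint name is an assumption.

```python
def pick_device():
    """Return 0 (first GPU) if CUDA is usable, otherwise -1 (CPU)."""
    import torch  # lazy import so this file loads without torch present

    return 0 if torch.cuda.is_available() else -1


def build_asr(model_name="openai/whisper-small"):
    """Create the speech-recognition pipeline on whichever device is available."""
    from transformers import pipeline

    # device=-1 keeps everything on the CPU, so no CUDA build is required.
    return pipeline(
        "automatic-speech-recognition",
        model=model_name,  # assumed checkpoint
        device=pick_device(),
    )
```

On a CPU-only machine this falls back to the CPU, which is slower but avoids the assertion. Installing a CUDA-enabled PyTorch build is the other option when a compatible GPU is present.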

What if I want a timestamp for each word? 🔴🔴

alihusham
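
For per-word timestamps, the speech-recognition pipeline accepts `return_timestamps="word"` for Whisper checkpoints in recent transformers releases. A minimal sketch under that assumption; the helper name `word_timestamps` and the checkpoint name are illustrative, not taken from the video.

```python
def word_timestamps(audio_path, model_name="openai/whisper-small"):
    """Return a list of (word, (start_seconds, end_seconds)) pairs (hypothetical helper)."""
    from transformers import pipeline  # lazy import; needs transformers installed

    asr = pipeline("automatic-speech-recognition", model=model_name)
    # With return_timestamps="word" the result has a "chunks" list, one entry
    # per word, each with "text" and a "timestamp" (start, end) tuple.
    result = asr(audio_path, return_timestamps="word")
    return [(chunk["text"].strip(), chunk["timestamp"]) for chunk in result["chunks"]]
```

Passing `return_timestamps=True` instead should give timestamps per segment rather than per word.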

I have no idea what you are talking about. I have zero knowledge of Python or any kind of coding tools. I just want to transcribe a YouTube video to text as easily as possible for learning new things. Can you help me solve my problem?

johanaim