OpenAI Whisper - MultiLingual AI Speech Recognition Live App Tutorial

OpenAI Whisper is an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web. Whisper works with many low-resource languages, including Tamil, Hindi, Telugu, and Malayalam.
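
For context, a minimal sketch of how the openai-whisper Python package is typically used for transcription; the model size, file name, and language code below are placeholders, not values from the video:

# pip install -U openai-whisper   (ffmpeg must also be installed on the system)
import whisper

# "small" is a placeholder; larger checkpoints ("medium", "large") are more
# accurate but need more GPU memory.
model = whisper.load_model("small")

# transcribe() works through long audio in 30-second windows; leave language
# unset for auto-detection, or pass a code such as "ta" for Tamil.
result = model.transcribe("sample_audio.mp3", language="ta")
print(result["text"])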

Comments

Golden Content! Just started working on a project and this is a very helpful resource to implement. Thank you!

chaithanyavamshi

Hi and thank you! I find your content so inspiring! Definitely trying this app.

fedahumada

Love the channel, you should have many more subs! ❤

concretecw

Best content! Thanks.
Can we calculate a confidence interval for each transcribed word?

TejasNarola-utci

Kudos to you if you prepared the Colab files!

byGDur

To run the OpenAI Whisper large model, how does the RTX 4090 compare to this AWS setup: an NVIDIA A10G Tensor Core GPU on a g5.xlarge with 16 GB RAM? Can I expect faster or slower transcription with the 4090?

georgepatronus

Bro, really amazing content, hats off to you!

gowthamdora

May I ask: once the web demo with a basic Gradio UI is done, how can we migrate it to a proper standalone web app? Could you please guide us a little?

appstuff
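
On the standalone web app question above: a Gradio demo written in a notebook can also run as a plain Python script that serves its own web UI, which can then sit behind a reverse proxy or be deployed to a host such as Hugging Face Spaces. A minimal sketch, assuming openai-whisper and gradio are installed (component parameter names vary between Gradio versions):

# app.py -- hypothetical standalone version of the notebook demo
import gradio as gr
import whisper

model = whisper.load_model("small")  # placeholder model size

def transcribe(audio_path):
    # Gradio hands the uploaded or recorded audio to this function as a file path
    return model.transcribe(audio_path)["text"]

demo = gr.Interface(
    fn=transcribe,
    inputs=gr.Audio(type="filepath"),
    outputs="text",
)

if __name__ == "__main__":
    # Serves the UI at http://localhost:7860
    demo.launch(server_name="0.0.0.0", server_port=7860)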

Hey bro, awesome! What accuracy does this STT have for Tanglish, i.e. Tamil + English?

ChetanGJ

This is a great demo, thank you!

I am new to programming. Can our local machines handle this, or should we do it in Google Colab?

flawedthoughts

Can it do real-time transcription instead of processing an audio file?

abhignaconscience

Can the RTX 4090 run the OpenAI Whisper large model well on a 12th-gen i9 machine with a 1 TB NVMe SSD and 64 GB of DDR5 RAM?

georgepatronus

Hello, the video is really helpful for me. I am trying to build ASR for Sanskrit, but it is not working for that language. Could you help me with how to train on Sanskrit data, or point me to any videos that will help me build a Sanskrit ASR? I have parallel Sanskrit data.

tapanray

Hello, thank you so much for your tutorial. I am trying to use Whisper for my master's thesis in translation technologies. The only issue I had was that after importing Gradio and recording a short audio clip live for Whisper to transcribe, it doesn't work; it just keeps loading forever, even when it's only a 6-second clip. What do you suggest I do? Thank you again from Spain!

annaacedoortega

I would recommend Streamlit for building the front-end interface.

raydenx
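
A minimal sketch of what that Streamlit front end could look like, assuming openai-whisper and a recent Streamlit (with st.cache_resource) are installed; the model size and accepted file types below are placeholders:

# streamlit_app.py -- run with: streamlit run streamlit_app.py
import tempfile

import streamlit as st
import whisper

st.title("Whisper transcription demo")

@st.cache_resource
def load_model():
    return whisper.load_model("small")  # placeholder model size

uploaded = st.file_uploader("Upload an audio file", type=["mp3", "wav", "m4a"])
if uploaded is not None:
    # Whisper expects a path on disk, so write the upload to a temporary file first
    suffix = "." + uploaded.name.rsplit(".", 1)[-1]
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmp:
        tmp.write(uploaded.read())
        audio_path = tmp.name
    st.write(load_model().transcribe(audio_path)["text"])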

What happens if the audio clip is longer than 30 seconds???

antonkal

How can you have it process multilingual audio?

laylabitar

!pip install gradio -q

This code shows me an error:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
spacy 3.7.4 requires typer<0.10.0,>=0.3.0, but you have typer 0.12.3 which is incompatible.
weasel 0.3.4 requires typer<0.10.0,>=0.3.0, but you have typer 0.12.3 which is incompatible.

Now what do I do? Please reply fast.

ufzomqk
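
On the pip message above: in Colab that resolver output is usually a warning about pre-installed packages (spacy, weasel) rather than a failed gradio install, so the first thing worth checking is simply whether gradio imports. A quick check, under that assumption:

# If this import fails, restarting the Colab runtime after the install often helps.
import gradio as gr
print(gr.__version__)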

Very cool. Can we use OpenAI Whisper for IVR telephony? It would need to address clients in multiple languages, like Hindi, Telugu, Malayalam, Tamil, and English, and respond accordingly.

GeorgeMathew

Thank you for the tutorial.

When I tried to step through your Gradio app, I got errors when trying to import your audio clips.
When I disconnected and copied your code to my own Google Drive, I was able to at least record audio with my own microphone and see Whisper transcribe up to 30 seconds.

chrontexto