SUPER Fast AI Real Time Speech to Text Transcription - Faster Whisper / Python

SUPER Fast AI Real Time Voice to Text Transcription - Faster Whisper / Python

👊 Become a member and get access to GitHub:

Get a FREE 45+ ChatGPT Prompts PDF here:
📧 Join the newsletter:

🌐 My website:

Faster-Whisper:

I created an almost zero-latency, real-time AI voice-to-text transcription script using Faster Whisper and Python. We are going to look at some use cases for the script and a preview of my upcoming video. Enjoy!

00:00 Intro
00:21 Real Time AI Transcription "Mr.Beast"
01:25 Setup / Python Code
03:33 Real Time AI Transcription "Sentiment Analysis"
05:51 Real Time AI Transcription "Secret Project"
08:14 Conclusion
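
The script from the video is not included on this page. As a starting point only, here is a minimal sketch of plain file transcription with the faster-whisper package; it is not real-time and is not the author's code. The model size, `device="cpu"`, `compute_type="int8"`, and the helper names `format_segment` and `transcribe_file` are assumptions chosen for illustration.

```python
def format_segment(start, end, text):
    """Format one transcribed segment as '[start -> end] text'."""
    return f"[{start:.2f}s -> {end:.2f}s] {text.strip()}"


def transcribe_file(audio_path, model_size="tiny"):
    """Transcribe an audio file and return one formatted line per segment."""
    # Imported lazily so format_segment works without the package installed.
    from faster_whisper import WhisperModel

    # Assumption: CPU-only machine; adjust device/compute_type for a GPU.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, info = model.transcribe(audio_path, beam_size=5)
    # segments is a generator, so decoding happens while it is consumed.
    return [format_segment(s.start, s.end, s.text) for s in segments]
```

Calling `transcribe_file("speech.wav")` (a hypothetical file name) would return a list of timestamped lines. A real-time version would feed short microphone recordings through the same `transcribe` call in a loop.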
Comments

Epic! - These videos are some of the best stuff on YouTube - love the idea with the image generation at the end

OliNorwell

Tips: You can transform your device's audio output into a "microphone" on Windows, so you don't need to place your headphones over your microphone.

1. Press Windows key + R -> type "mmsys.cpl"
2. In the Recording tab, enable the Stereo Mix option. Now, "Stereo Mix" is an available microphone option! You can select it as the audio input.

bim-techs
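
The Stereo Mix tip above can be carried over to a Python script by picking the input device by name. This is a minimal sketch, assuming the audio library reports devices as a list of dictionaries with `name` and `max_input_channels` keys (the shape returned by `sounddevice.query_devices()`); the function name `find_input_device` and the example device list are made up for illustration.

```python
def find_input_device(devices, name_fragment="stereo mix"):
    """Return the index of the first input-capable device whose name
    contains name_fragment (case-insensitive), or None if there is none."""
    wanted = name_fragment.lower()
    for index, device in enumerate(devices):
        has_input = device.get("max_input_channels", 0) > 0
        if has_input and wanted in device.get("name", "").lower():
            return index
    return None


# Hypothetical device list; real names depend on the sound card and driver.
example_devices = [
    {"name": "Microphone (USB Audio)", "max_input_channels": 1},
    {"name": "Stereo Mix (Realtek Audio)", "max_input_channels": 2},
    {"name": "Speakers (Realtek Audio)", "max_input_channels": 0},
]
print(find_input_device(example_devices))  # prints 1
```

With sounddevice installed, the returned index could be passed as the `device` argument when opening an input stream. If the function returns `None`, Stereo Mix is probably still disabled in the Recording tab.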

Pulling people in with a flashy thumbnail of working Python code and then trying to monetize code built on a library that is already supposed to be open source is, in my opinion, BS. It is not fair to beginners who might not know Python or Whisper very well. For that I give you a thumbs down!

filipphenderson

This is amazing and inspiring. I love the ending of the video and can’t wait for Wednesday. As a dyslexic person I think you unlocked a new use case for learning.

theraybae

5:51 Neutral = I'm gonna go troll now. Funny stuff, great video! Thanks

jaujud

There is a product for live video transcription there. Live text services are expensive and do not work for many languages. Set up a server/service that ingests an RTMP video source, delays the video, and overlays the text on the video in perfect sync, then offers RTMP output with burned-in live text. :) There is a need for this service.

ReadyMedia-no

It would be good to see transcription plus responses generated as audio in real time for phone calls.

benscottbongiben

Fantastic!!! A bit fast in explaining and showing, but I can always pause!

ArmandoMenicacci

Hey man, this is really cool! I'd like to know:
1) Did you use the Whisper v3 model or v2?
2) Have you seen the demos from GPT-4? They also showed that GPT's ASR is better than Whisper v3. I wonder if it will be open like Whisper.

ferluisch

Amazing and inspiring work! Kris, what about something less powerful but more accessible in terms of hardware?

HammerOnTheNet

I have tried to get this to run on an M1 MacBook. No joy. The CPU maxes out even with the tiny model. But then I tried the whisper.cpp implementation, which is compiled for Apple silicon. I found a whisper-cpp-python wrapper for that library. That actually runs and is far less CPU-bound. It stutters a bit, is not as clean, and misses words between chunks, but you can see that with just a little bit more power it could work.

svenborgers
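
Words dropped at chunk boundaries, as described above, are often reduced by letting consecutive chunks overlap so that a word cut off at the end of one chunk appears whole in the next. This is a general technique, not necessarily what the video's script or whisper.cpp does. A minimal sketch in plain Python, where `overlapping_chunks` is a hypothetical helper and the sizes are counted in samples:

```python
def overlapping_chunks(samples, chunk_size, overlap):
    """Split samples into chunks of chunk_size where each chunk repeats
    the last `overlap` samples of the previous one."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    start = 0
    while start < len(samples):
        chunks.append(samples[start:start + chunk_size])
        if start + chunk_size >= len(samples):
            break  # this chunk already reached the end of the audio
        start += step
    return chunks


# Ten fake samples, chunks of four, two samples shared between neighbours.
chunks = overlapping_chunks(list(range(10)), chunk_size=4, overlap=2)
print(chunks)  # prints [[0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6, 7], [6, 7, 8, 9]]
```

The trade-off is that the overlapping audio is transcribed twice, so repeated words at the seams need to be merged afterwards, and the extra work costs CPU time on a machine that is already struggling.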

Wow!! Great video!!! Thank you for being so generous and teaching this to us. This is epic stuff! I can already see all kinds of use cases, I can't wait to get it running, and I'm really looking forward to Wednesday's video. Thanks again from Canada.

ryanjames

Interesting stuff on the image creation at the end while talking. Not sure if you are taking punctuation in your sentences into consideration? I'm pretty sure this would have to do with something cool, maybe keeping an overview of all the text that has moved out of the "buffer" for style? Looks like something I could have a lot of fun with. I do not have the GPU though :/ Colab, however.

kimsteinhaug

Nice video!! Thanks for your help with these topics!!

cristobalmunoz

Excellent! Thank you so much for sharing!

radudamianov

Hello, and great to see this kind of content.

I actually have a question about speech to text in another language, for example Swedish, and passing it through Llama for correction, maybe for a meeting or conference or something like that. What do you suggest?

JohannaKarlsson

I have been looking for where to start. Fantastic work! Where can I get the code for testing?

hjoseph

Thanks for sharing your knowledge/experience.
I'm a bit perplexed. The description here mentions 45+ prompts in the PDF book, the newsletter website says 40+, and the PDF doc says 35+. Which number is correct?

t-dsai

This will be a good tool for language immersion (Chinese / Japanese / Indonesian), along with the DeepL clipboard tool and the Edge browser's TTS engine.

aoeu

Thanks, this is great! Where can I find the actual code you have on your screen? Struggling to find it on GitHub.

magnoliasphinkter