Best Free Speech-To-Text APIs and Open Source Libraries


In this video, we have a look at the best free speech to text APIs and also at the top open source libraries for speech recognition!

Converting speech to text is an exciting but also challenging task. Luckily, there are existing solutions that we can use: either a speech-to-text API or an existing open source engine. Before we have a look at the best free solutions, we also go over the advantages and disadvantages of both approaches.

APIs:
Google Speech to Text
AssemblyAI
AWS Transcribe

Open Source Libraries:
DeepSpeech
Kaldi
Wav2Letter
SpeechBrain
Coqui

AssemblyAI


Comments

100 beers to the one who can do this:

Compare a new 10 second selfie video against a private encrypted database of similar videos and determine some results/outputs from video and audio.

Elements:
1. The video selfie in portrait mode - person has to be decently placed within the video frame and decently illuminated.
2. The audio from reading a random short phrase (must be readable in 3-5 seconds) in the native language of the person (language can be selected from user input). The phrase can be randomly generated by AI, must never be the same phrase twice, and the phrase prompted must match what the person reads, so the audio must be analyzed.

If the subject head is framed properly and illuminated properly, recording will start automatically.
Within the 10 second recording, the person making the selfie will be prompted to read the random phrase out loud (in native language).

The sound must be analyzed in real time so that the phrase read by the human is converted from speech to text and the output must match the sentence prompted by 90% accuracy or more, or he/she has to start all over again.
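One way the 90% match could be checked is sketched below. This is only an illustration under assumptions: it treats "accuracy" as one minus the word error rate between the prompted phrase and the transcript, and the function names (`normalize`, `phrase_accuracy`, `passes`) are hypothetical, not part of any speech-to-text API. The transcript itself would come from whichever engine is used.

```python
import re


def normalize(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"\w+", text.lower())


def word_edit_distance(ref, hyp):
    """Levenshtein distance between two word lists (insert/delete/substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phrase_accuracy(prompt, transcript):
    """Return 1 - word error rate, clamped to the range [0, 1]."""
    ref = normalize(prompt)
    hyp = normalize(transcript)
    if not ref:
        return 0.0
    errors = word_edit_distance(ref, hyp)
    return max(0.0, 1.0 - errors / len(ref))


def passes(prompt, transcript, threshold=0.9):
    """Assumed rule: accept the reading if accuracy reaches the threshold."""
    return phrase_accuracy(prompt, transcript) >= threshold
```

For short phrases a word-level threshold is strict: with a five-word prompt, a single misrecognized word already drops the score to 80%, so a character-level distance may be a better fit there.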

RESULT:
Each time a new selfie video is taken and compared against this database, the result has to be an answer to these 2 simple cascading questions.

1. Is the subject in the video a human being? true or false - accuracy must be over 90% - cannot be fooled by manikins or by very obvious recordings played on another screen
^ this is required before saving the video to the encrypted database.

2. Is the subject a different human compared to the subjects from all other videos by analyzing both video image for face ID and sound for vocal timbre? if it's not different, must output all matches by @username value.
^ this is also required before saving the video to the encrypted database.

I am open for suggestions to increase accuracy and prevent this system from being fooled/hacked. Also, let's make it open source. I can help with the front-end and hosting.

dyablohunter

Do you provide timestamps of the transcribed audio file?

clearthinking

Thank you for helping developers enhance humanoid robots. The hardware design and construction seems much easier than speech recognition and action generation.

michaeltyborski

Let's say I wanted to integrate a voice-to-text feature into my Next.js application and the voice that is going to be recorded is not in English (it is Amharic). Which of these solutions would fit best for me?

paulos_ned

Thank you this has helped me a lot. I couldn't find a good API anywhere, now I know where to look!

bouzz

Where are the 3 free monthly hours mentioned on the AssemblyAI website? I can't find it anywhere

zRedPlays

Do you know a powerful speech-to-text tool with timestamps and speaker diarization for ICELANDIC? I need it for a class project.

spider

This was a surprisingly helpful video. Thank you very much! If you're looking for suggestions for videos topics, might I suggest looking at different 'use cases' and how you might go about implementing your API in those scenarios.

KallunWillock

I am looking for a simple, basic speech-to-text tool for Windows that mainly dictates numbers and letters and writes to a text file (TXT), e.g. on a new line, on every update. Can anyone tell me whether such a thing exists ready-made, without developing it from scratch?

score

It seems like AssemblyAI is a lot cheaper than Google. How does the performance compare?

viewview

Is there a tool for videos? I downloaded videos and need to have them translated to my elderly family member from English to Ukrainian/Russian.

aperson

I couldn't find a suitable API anywhere until today, but now I know where to go!

mysha

Amazing video, very informative and helpful, thanks!!

luisxd

Thanks for this video.🙏 Speech to text transcribe open source library using python? (Completely open source) if you know please let me know.

balajicmb

Why is the audio intelligence $0.000583 now? It increased from $0.000167. That is too big an increase, right?

ChrisWong

Please make a video on how to train a voice model step by step in PyCharm. Please, sir, make a video on this.

pratugames

I was just searching for something like this! Thank you🙏🏼

jsebastianmunch

Thanks for this .. great we got to see this (i saw this on ad, but in future i hope it won't need)
Wish to get more reach soon 😃🌟✨🙌
Man please make modulations & certain intervals while speaking and you are already doing great 😃
15.09.2022 09:59 pm ist

azhagurajaallinall

Hi, Assembly AI. Would you be interested in having your videos captioned? :-)

diycaptions

Who has timestamps for every word for free??? Python has it, but it imports only the words without timestamps!!!

IndigaVP
