Conformer-1: a new large scale/robust speech recognition model

We're introducing Conformer-1, a state-of-the-art speech recognition model trained on 650K hours of audio data that achieves near human-level performance and robustness across a variety of data.

Our results demonstrate that Conformer-1 is more robust on real-world data than popular ASR models, making up to 43% fewer errors on noisy data and achieving state-of-the-art results on a wide variety of academic and real-world datasets.

#MachineLearning #DeepLearning
Comments
Author

What's the benefit of using this model if it's not free and open source? I can run Whisper on my own machine at no extra cost except electricity.

魏一傑
Author

Could you add a language picker to the dropdown? I uploaded a German video and the transcript was gibberish, even though the documentation says you support German.

Jingizz
Author

This is brilliant! I started following Steven Brunton's work (University of Washington) maybe 7 or 8 years ago, when he started developing the concepts of sparse identification of nonlinear dynamics (SINDy). I think sparsity, as he developed it, is going to make almost everything we do more efficient. I think it will also give us insight into AGI, because our memory and cognition both depend on sparse attention.

michaelzumpano
Author

How does it compare with OpenAI's Whisper? Whisper was trained on 30k more hours than this model.

danielmonge
Author

I really wanted to see the same audio with competitor models lol. That would have been a great juxtaposition.

AlexFord
Author

Nice to know that it can recognize LeBron's speech!!!!

yosolonopuedo
Author

In the playground, can you add a module that lets us listen to the denoised speech that was used to produce the accurate transcript?

basrurkrishna
Author

What's the pricing for Conformer-1?

lazy_iitian
Author

I tried it with a German YouTube video and it got almost everything wrong in the transcription. Unless the speaker talks slowly and as clearly as possible, as in a podcast, you will have a hard time.

For some reason, all AI models specialize in American English with a bit of other languages in between.

kipchickensout
Author

Hello Misra, I have a question. Generally, how are these giant models trained? It requires a lot of GPUs, right? How do they manage the infrastructure to train models this large?

nivasgopi
Author

Awesome! However, I miss an option to export to a subtitle file in the playground. Any plans to implement this?

guepardiez