Conformer-1: a new large scale/robust speech recognition model

We're introducing Conformer-1, a state-of-the-art speech recognition model trained on 650K hours of audio data that achieves near human-level performance and robustness across a variety of data.

Our results demonstrate that Conformer-1 is more robust on real-world data than popular ASR models, making up to 43% fewer errors on noisy data, and that it achieves state-of-the-art results on a wide variety of academic and real-world datasets.

#MachineLearning #DeepLearning

AssemblyAI

Comments

What's the benefit of using this model if it's not a free and open-source one? I can run Whisper on my own machine without any extra cost except for electricity.

魏一傑

Could you add a language picker in the dropdown? I uploaded a German video and the transcript was gibberish, but in the documentation I saw that you guys support it.

Jingizz

This is brilliant! I started following Steven Brunton's work (Univ. Washington) maybe 7 or 8 years ago, when he started developing the concepts of sparse identification of non-linear dynamics (SINDy). I think sparsity, as he developed it, is going to make almost everything we do more efficient. I think it will also give us insight into AGI, because our memory and cognition both depend on sparse attention.

michaelzumpano

How does it compare with OpenAI's Whisper? Whisper was trained on 30k more hours than this model.

danielmonge

I really wanted to see the same audio with competitor models lol. That would have been a great juxtaposition.

AlexFord

Nice to know that it can recognize Lebron's speech!!!!

yosolonopuedo

In the playground, can you add a module where we can listen to the noise-free speech that was used to produce the accurate transcription?

basrurkrishna

What's the pricing for Conformer-1?

lazy_iitian

So I tried it with a German YouTube video and it got almost everything wrong in the transcription. Unless you speak slowly and as clearly as possible, for example in a podcast, you will have a hard time.

All AI models for some reason specialize in American English with a bit of other languages in between.

kipchickensout

Hello Misra, I have a question. Generally, how are these giant models trained? I mean, it requires a lot of GPUs, right? How do they manage the infrastructure to train models of this size?

nivasgopi

Awesome! However, I miss an option to export to a subtitle file in the playground. Any plans to implement this?

guepardiez
