All publications

Simple and Effective Unsupervised Speech Synthesis

PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS

Universal Paralinguistic Speech Representations using Self-Supervised Conformers

Geoff Hinton Saying Deep Neural Network in 8 different languages...

Obama speaking in Geoffrey Hinton's voice [ MADE USING AI ]

Obama speaking in Geoff Hinton's Voice...[ MADE USING AI ]

OBAMA SPEAKING IN LADY VOICE......

A Comparison of Discrete and Soft Speech Units for Improved Voice Conversion

Simple and Effective Zero-Shot Cross-Lingual Phoneme Recognition

Token-Level Supervised Contrastive Learning for Punctuation Restoration

SUPERB: Speech processing Universal PERformance Benchmark

Speech Emotion Recognition with Multi-task Learning

A Light-weight contextual spelling correction model for customising transducer-based ASR systems

Injecting Text in Self-Supervised Speech Pre-Training (TTS4PreTrain)

W2V-BERT: Combining Contrastive Learning and Masked Language Modelling for Self-Supervised Speech

HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units

A Unified Transformer-based Framework for Duplex Text Normalization

Interspeech 2021: Using Large Self-Supervised Models for Low-Resource Speech Recognition

INTERSPEECH 2021: Using Large Self-Supervised Models for Low-Resource Speech Recognition (3 min Intro)

Improved language identification through cross-lingual self-supervised learning

Unsupervised Speech Recognition (Wav2Vec-U)

Obama Says 'EVERY 60 SECONDS IN AFRICA A MINUTE PASSES'

MLP-Mixer: An all-MLP Architecture for Vision + Code

Confidence Estimation for Attention based Sequence to Sequence Models for Speech Recognition
