Introduction to Whisper Fine Tuning Event



Speaker: Sanchit Gandhi, Research Engineer, Hugging Face

In this talk, Sanchit outlines the Whisper Fine Tuning Event and gives tips and tricks on how to train and evaluate Whisper with Transformers, Datasets, and PyTorch.
Comments
Author

Now that this is done and people have experience fine-tuning this model, is there a summary of recommended training hyperparameters to use in different situations?

dgram
Author

Hi, we have 12M names and we would like to fine-tune Whisper on them. We are also happy to share the results with you.

The question is: is it better to fine-tune Whisper using the entire spoken name? Or is it better to fine-tune using individual names and recorded snippets of each name spoken?

callmydoc
Author

Hi, is Whisper an SSL feature that could be used for my downstream task, voice conversion? I am working on low-resource languages. I tried different feature extractors such as WavLM, HuBERT, and ContentVec. It seems like I am losing a bit of content information. I even tried XLSR, which seems to be slightly better but still carries speaker information. Could someone please suggest which SSL feature would be best to use for Indian languages?

lnschandrakanth
Author

It can't be fine-tuned. You can get your CER or WER down to 1, but in practice your fine-tuned model will be hot garbage.

sinpie
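
For readers unfamiliar with the two metrics named in the last comment: word error rate (WER) and character error rate (CER) are usually defined as the edit distance between a reference transcript and the model's hypothesis, divided by the reference length in words or characters. The sketch below is a minimal pure-Python illustration of that definition. The function names are hypothetical, and it applies no text normalization (casing, punctuation), which real evaluations often do, so it is not a reproduction of the evaluation used in the event.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences
    (minimum number of substitutions, insertions, and deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    # One substituted word out of six reference words.
    print(wer("the cat sat on the mat", "the cat sat on a mat"))
    # One substituted character out of three reference characters.
    print(cer("abc", "abd"))
```

Both values are ratios, so they are often reported multiplied by 100 as percentages. Because insertions count as errors, either metric can exceed 1.0 when the hypothesis is much longer than the reference.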