Create your AI digital voice clone locally with Piper TTS | Tutorial

This step-by-step tutorial shows you how to create your own digital text-to-speech voice clone using AI and Piper TTS. Everything is done locally on your desktop computer; no cloud service is needed.

Check my tutorial on Piper TTS usage:

Currently this is supported on Linux with Python versions 3.9 to 3.11.
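A quick way to check a machine against that requirement before installing anything is sketched below. The function name `is_supported` and the constants are made up for this illustration; only the Linux and Python 3.9 to 3.11 range comes from the description.

```python
import sys

# Supported range as stated in the description: Python 3.9 to 3.11 on Linux.
MIN_VERSION = (3, 9)
MAX_VERSION = (3, 11)


def is_supported(version=None, platform=None):
    """Return True if the interpreter version and OS match the stated range.

    Both arguments default to the running interpreter; they can be passed
    explicitly to check another setup.
    """
    version = tuple(version or sys.version_info[:2])[:2]
    platform = platform or sys.platform
    return platform.startswith("linux") and MIN_VERSION <= version <= MAX_VERSION


if __name__ == "__main__":
    print("supported" if is_supported() else "not supported")
```

Running the file inside the virtual environment that will be used for training reports whether that interpreter falls inside the range.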

00:00 Intro
01:03 Prepare and install Piper TTS for training
07:19 Prepare Dataset for TTS Training
11:14 TTS Training
16:25 Monitor training progress
19:37 Synthesize test audios while training
27:09 Outro

If you have problems with CUDA-based training on Windows, @ei23de shared a solution in the comments. Thank you 😊.

Please subscribe to my channel 😊.

---
Comments

Guude, Thorsten! Thank you for making this video :) I've added it to the Piper training guide already.
I had a previous comment, but I guess YouTube didn't like it containing a link.

synesthesiam

Thank you for demonstrating this. Now I admit I will need to watch it a couple more times to get my head around it all, but I certainly appreciate your efforts here. Thank you for making the video and sharing your knowledge.

Joe

MyHeap

Great video! I'll need to rewatch it a few times to nail down the steps, but your effort in sharing this knowledge is much appreciated. Thanks! 👍

OpinionatedReviewer

Adding my +1 to all the feedback in the comments below - THANK YOU for creating a step-by-step guide for this. I work with computers and am no stranger to Python programming, but I still would have found setting this up and getting it working unwieldy and off-putting. I really appreciate you going through this step by step - I (and I'm sure others) need this kind of help! Also, can I just say that I wasn't even aware this was a *possibility* until I saw this vid? :)

NoDebut

Hi Thorsten, thank you for all of these. The experience of hearing TTS in my own voice was awesome.

QuokkaBff

Why have I never thought to hide the venv dir with a "."? Ha! Thanks for making this tutorial!

truszko

Big thanks to you, Thorsten. My personal home assistant AI, ToniaBot, finally has a wonderful voice. ❤

juntiamores

Thank you Thorsten - useful and enlightening. 🙂🖖

whalemonstre

Great video! I'm learning a lot! Thanks!

renan

How many hours of voice recordings would you recommend as a minimum for fine-tuning a small or large model to get good results?

TillF

Thank you very much for the hidden env tip and the tutorial!

cloudsystem

ERROR: Could not find a version that satisfies the requirement piper-phonemize~=1.1.0 (from piper-train) (from versions: none)
Why does this error occur?

rokifromhk
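A "from versions: none" message usually means pip found no release it considers installable for the running interpreter and platform. The description lists Linux with Python 3.9 to 3.11 as the supported setup, so the environment is a plausible place to look first; that is an assumption about the cause, not a confirmed diagnosis. The sketch below only prints what the current environment looks like, and the helper name `describe_environment` is made up for this illustration.

```python
import platform
import sys
import sysconfig


def describe_environment():
    """Collect interpreter and platform details that influence which
    prebuilt packages pip will accept."""
    return {
        "implementation": platform.python_implementation(),
        "python_version": "{}.{}".format(*sys.version_info[:2]),
        "platform_tag": sysconfig.get_platform(),
        "machine": platform.machine(),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Run it with the same interpreter that runs pip (inside the virtual environment, if one is used) and compare the output with the supported setup named in the description.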

Thank you so much for your guide. This helped me get past some of the problems that I was having. I’m curious if you have a video, or know of one, that goes over what to look for during training, such as recognizing issues in the generative loss or weights and which settings or dataset changes counteract them. Thank you!

broketechenthusiast

This was great. It would be useful if you could recommend where to get an appropriate model for English speakers to use in place of the German model.

domesticatedviking

Hello, Sir. Thank you so much for all of these wonderful tutorials. Since Kaggle has recently improved its free tier, could you create a tutorial explaining how to make a notebook on Kaggle for Piper's voice training? Thank you again.

razvanab

Great stuff! Thanks a lot. By the way, I was wondering if Piper can be trained to speak mixed languages, for example a mix of English and Chinese.

audreylin

Can't get enough of your superb explanations. Just one more step, if you do not mind: Piper loads the model on every run. Is there a way to load the model into memory once and generate audio via an API, like TorchServe does for PyTorch models?

tesfa

I guess this is great, and thank you for your effort, but I don't understand how you get to where you are at the beginning. Should I open some command prompt windows and just type whatever you type, the same way? It would be even better to have a "don't think, do this + this + this" version for people who understand nothing about how it works :)

JUKEBOXANIMATION

Great video. How much audio is required for decent training, though? I have a few hours; is that enough?

RedDread_
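To know how much audio a dataset actually contains before asking whether it is enough, the clip lengths can be summed with Python's standard `wave` module. This is a minimal sketch that assumes all clips are uncompressed WAV files in a single folder; the function names are made up for this illustration, and it makes no claim about how many hours are needed.

```python
import wave
from pathlib import Path


def wav_seconds(path):
    """Return the length of one WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def total_hours(folder):
    """Sum the length of all *.wav files directly inside `folder`, in hours."""
    seconds = sum(wav_seconds(p) for p in sorted(Path(folder).glob("*.wav")))
    return seconds / 3600


if __name__ == "__main__":
    # "wav" is a hypothetical folder name; point it at the real dataset.
    print(f"{total_hours('wav'):.2f} hours of audio")
```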

Are there any tools for creating a structured training data set from existing voice clips vs recording new ones by reading specific text?

YKSGuy
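For existing clips that already have transcripts, the metadata file can be generated rather than typed by hand. The sketch below makes several assumptions that should be checked against the Piper training guide: each clip `name.wav` has a transcript `name.txt` next to it, and the training tools accept a pipe-delimited `metadata.csv` with one `id|text` line per clip. The function name `build_metadata` is made up for this illustration.

```python
from pathlib import Path


def build_metadata(wav_dir, out_file="metadata.csv"):
    """Write one `id|text` line per clip that has a matching .txt transcript.

    Returns the number of clips written. Clips without a transcript, or with
    an empty one, are skipped.
    """
    wav_dir = Path(wav_dir)
    lines = []
    for wav in sorted(wav_dir.glob("*.wav")):
        transcript = wav.with_suffix(".txt")
        if not transcript.exists():
            continue
        raw = transcript.read_text(encoding="utf-8")
        # Collapse whitespace and drop "|" so the text cannot break the format.
        text = " ".join(raw.replace("|", " ").split())
        if text:
            lines.append(f"{wav.stem}|{text}")
    (wav_dir / out_file).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)


if __name__ == "__main__":
    # "wav" is a hypothetical folder name; point it at the real clips.
    print(build_metadata("wav"), "clips written")
```

Clips that have no transcript yet would still need to be transcribed first, by hand or with a speech-to-text tool.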