How to Clone Any Voice With AI 🔊 | Tutorial | Tortoise-TTS

Deepfake speech is created by using a text-to-speech model to generate speech from text. Once a model is trained, it can be used to generate speech with any voice. Such models are usually split into three components: a voice encoder, a synthesizer, and a vocoder. The voice encoder learns to create a latent, fixed-dimensional embedding (vector) that captures various features of a particular human voice. The synthesizer learns to create a mel-spectrogram from a text transcript for a specific voice. The vocoder generates an audio waveform from the mel-spectrogram.
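
To make the data flow between the three components concrete, here is a minimal sketch in plain Python. It is not Tortoise-TTS code and contains no trained model: the function names (`encode_voice`, `synthesize`, `vocode`, `clone_voice`) and the sizes (`EMBED_DIM`, `N_MELS`, `HOP`) are hypothetical stand-ins chosen for illustration. It only shows which kind of data each stage takes in and hands on.

```python
import math
from typing import List

# All sizes below are assumed toy values; real models use far larger ones.
EMBED_DIM = 8   # length of the fixed-dimensional speaker embedding
N_MELS = 4      # number of bands per mel-spectrogram frame
HOP = 16        # audio samples produced per spectrogram frame


def encode_voice(waveform: List[float]) -> List[float]:
    """Stand-in voice encoder: audio of any length -> vector of EMBED_DIM values."""
    chunk = max(1, len(waveform) // EMBED_DIM)
    embedding = []
    for i in range(EMBED_DIM):
        part = waveform[i * chunk:(i + 1) * chunk] or [0.0]
        # Mean absolute amplitude of each chunk, a crude "voice feature".
        embedding.append(sum(abs(x) for x in part) / len(part))
    return embedding


def synthesize(text: str, speaker: List[float]) -> List[List[float]]:
    """Stand-in synthesizer: text + speaker embedding -> one mel-like frame per character."""
    bias = sum(speaker) / len(speaker)
    return [
        [(ord(ch) % 32) / 32.0 + bias * (band + 1) / N_MELS for band in range(N_MELS)]
        for ch in text
    ]


def vocode(mel: List[List[float]]) -> List[float]:
    """Stand-in vocoder: spectrogram frames -> waveform samples (HOP per frame)."""
    samples = []
    for frame in mel:
        level = sum(frame) / len(frame)
        for n in range(HOP):
            samples.append(level * math.sin(2 * math.pi * n / HOP))
    return samples


def clone_voice(text: str, reference: List[float]) -> List[float]:
    """Chain the three stages: reference audio conditions the speech for `text`."""
    return vocode(synthesize(text, encode_voice(reference)))


if __name__ == "__main__":
    reference_audio = [0.5] * 64  # placeholder for a recorded voice sample
    audio = clone_voice("hi", reference_audio)
    print(len(audio), "samples generated")
```

The point of the split is that only the embedding depends on the speaker: the same synthesizer and vocoder are reused, and swapping the reference audio changes the voice of the output.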

In this video, I introduce you to the theoretical background of text-to-speech synthesis and show you how you can create speech yourself with any voice you have access to.

My Medium Article for This Video:

00:00:00 Intro
00:01:25 Single-Speaker vs. Multi-Speaker
00:02:14 Multi-Speaker Approach
00:02:31 Speaker Encoder
00:03:55 Synthesizer
00:04:25 Mel Spectrogram
00:05:31 Vocoder
00:06:26 Model Summary
00:07:29 Hands-On Voice Cloning
00:09:36 Speech Generation
00:15:03 Outro

I'm happy about any feedback I can get. :) So feel free to share it with me in the comment section, thanks. :)

Please make a copy of it so you can make changes to it. Feel free to share your results with me or comment below on how you liked the quality of the generated speech. 🙂


Very impressive. The voice cloning result is more accurate compared to other colab notebooks. The intonation sounds even less robotic than the Descript Overdub.


Really??? you did this good on your first video!!! keep up the good work, looking forward to your second video


Did he just say he's new to YouTube? Damn, I just thought he's so damn experienced because of the quality video provided 😍 Thank youuu so much


Did great man! very thorough, thank you. I developed Squamous Cell Cancer and lost 90% of my tongue, my lymph nodes and much of my throat. My speech is heavily impaired now and will never be the same. I used to be a singer/songwriter and spoke very well. Tools like this will help me regain some of my former abilities, even if in a virtual setting. Cheers


Many thanks for the tutorial, dude (Also, did anyone tell you you look like a younger version of Matt Fradd?)


Thank you for this program, Martin! Whenever I run it, everyone has a British accent. Even when I upload five audio clips, it sounds like I'm from the outskirts of London.


Thanks for the full tutorial. As an artist very interested in the topic, I've been looking for an article or video just like this. Kudos to YT for recommending this video and to you for spending time and effort to make this. Thanks m8


This is your first!? It's really well done.


Hey, loved the way you briefly explained all the blocks. Keep it up!!


I genuinely thought that I could make my own AI voice generator using this. I recorded an AI voice from a website and put it into this model. It works great, but each sentence takes almost 2 minutes to process. 😂 Thanks for the video! Keep it up.


Congrats on putting out your first video, I think you did a great job. You explained everything without going off topic, you walked us through step by step and made something that looked complicated look very easy. Thank you for this. My only recommendation would be to change your tone every now and then or add "passion" to your voice, rather than keeping the same tone. I could go to sleep with your voice :D


Great walkthrough. Good luck on your channel!


Hi Martin, Very valuable content. Congratulations for your first video. In my opinion, you are using the right format for this kind of videos/tutorials. Maybe make introduction shorter and give a thriller about the final result you will get so that you incentivize people to stay to the end. You can also include a call to follow the channel.


This is much better than voice cloning but you have to read some weird sentences first to generate audio samples.


Hi Martin! Awesome first video on YT 😄 I think you explained the whole topic really well, and at first I didn't even notice that you have an accent until you mentioned it 😅 The video is exactly what I was looking for: a free alternative to all the other AI voice cloners, which are either paid or part of a larger product.
And honestly, my best tip of all for YouTube: don't listen to everything people recommend to you. Do your thing, your style, and please stay away from all the pointless YouTuber tips & trends à la "you have to show something new every 2 minutes, a new perspective, because people get bored quickly these days; a mid-video call to subscribe with an unbearable bell SFX & animated icons" BS. Make your videos the way you think is good and are happy with yourself!


Great video. Thanks for putting in the time to share in a clear and organized manner. The guide was easy to follow. Don't sweat the accent, you did a great job.


Hi Martin - thank you so much for this video! It is a useful intro to the subject. The chapter titles are especially handy! It would also be great if you could mention how much time the different sections in the Colab notebook took to run, and what kind of hardware you are running it on.


that was professional, im surprised that you don't have like a million subs


Good presentation, delivered freely without notes ☺️😋

Just kidding 😝😂

Awesome video, looking forward to more ☺️🙌🏻
