DEEPFAKE Tutorial: Beginner guide: cloning voice and lip-synching in video (tortoise TTS & wav2lip)

Hi, I am Alex Borg and in this tutorial, I will guide you step by step on how to generate a deepfake voice using Tortoise TTS and Elon Musk's voice, and how to synchronize it to a video spoken by Elon Musk.

I am a virtual YouTube AI and my body and voice are virtual.

We will look at how to create a deepfake by changing the words spoken by a person in a video. For example, let's use Elon Musk as a celebrity. Here is the text for Elon Musk to read: "I had said that the Doge cryptocurrency was cool, but I just changed my mind, I'm going to sell everything and instead buy Elrond Gold (EGLD), it's a very promising cryptocurrency with low transaction fees, just like Ethereum."

Follow me on:

For cloning and generating voice, we use Tortoise TTS.
Source code and explanations can be found here :

To run it directly online in Google Colab, you can use this :

For generating videos with lip-synching, we use wav2lip.

You have to add a model in model folder, here is the
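
For reference, here is a minimal sketch of how the lip-sync step is usually launched from the command line. It assumes the layout of the upstream Wav2Lip repository (its `inference.py` script with the `--checkpoint_path`, `--face`, `--audio` and `--outfile` options). The file names and the checkpoint path are placeholders, and the fork shown in the video may be started differently.

```python
import shlex


def build_wav2lip_command(face_video, audio_file,
                          checkpoint="checkpoints/wav2lip_gan.pth",
                          outfile=None):
    """Return the argument list for upstream Wav2Lip's inference.py.

    The default checkpoint path is an assumption: point it at the model
    file you placed in the model folder.
    """
    cmd = [
        "python", "inference.py",
        "--checkpoint_path", checkpoint,  # model file added by hand
        "--face", face_video,             # video of the person speaking
        "--audio", audio_file,            # cloned voice to lip-sync to
    ]
    if outfile is not None:
        cmd += ["--outfile", outfile]     # where to write the result
    return cmd


if __name__ == "__main__":
    # Placeholder file names, for illustration only.
    print(shlex.join(build_wav2lip_command("elon.mp4", "elon_cloned.wav")))
```

To actually run it, pass the returned list to `subprocess.run(cmd, check=True)` from inside the Wav2Lip folder.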

- - - - - - - - - - - - - - - - - -
The solutions I show in my video are forks of Tortoise TTS and wav2lip.
If you are interested in these modified versions, which work standalone without any other installation, here are the direct links below :

Comments
Author


Steps to follow if you have trouble using it after installing it on your computer :
- Install CUDA 11.3 (2.7 GB). The installer you have to download is: Windows -> x86_64 -> V10 -> .exe (local)
- Run "start.bat" and wait for the models to download automatically.
- Type your text (Enter), choose the voice "halle" (Enter), then enter 2 (to get 2 versions of the result).
The WAV files can then be found in the subfolder : (halle is the name of the voice I chose)
PS : if you want to choose another voice after entering your text, the voice names are the names of the folders in "C:\conda3\tortoise\voices"
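
The steps above can be sketched in Python. This is only an illustration: `list_voices` lists the folder names under the voices directory, and `build_tortoise_command` builds the equivalent call for the upstream Tortoise TTS script `tortoise/do_tts.py` (its `--text`, `--voice` and `--candidates` options). Both function names are hypothetical, and the packaged fork is started through "start.bat" instead, so treat the command as an assumption about the upstream project rather than what the fork runs.

```python
from pathlib import Path


def list_voices(voices_dir):
    """Return the available voice names: one per subfolder of voices_dir."""
    root = Path(voices_dir)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_dir())


def build_tortoise_command(text, voice, candidates=2):
    """Argument list for upstream tortoise/do_tts.py (assumed layout)."""
    return [
        "python", "tortoise/do_tts.py",
        "--text", text,
        "--voice", voice,                 # must match a folder name
        "--candidates", str(candidates),  # number of result versions
    ]


if __name__ == "__main__":
    # Path taken from the note above; adjust it to your installation.
    print(list_voices(r"C:\conda3\tortoise\voices"))
    print(build_tortoise_command("Hello there.", "halle"))
```

Adding a new voice then amounts to creating a new folder with reference recordings in it; the folder name becomes the voice name.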



Wav2lip + GUI modified by Romain Baker: a complete, ready-to-use folder with "start.bat" to launch the app under Windows 10 64-bit (tested OK with an RTX 2060, but not working with an RTX 3060) :

alexborg
Author

"I won't dwell on this solution, which may put most of you off." Literally the only reason I clicked this video.

WhimKR
Author

Really awesome to see how far deepfakes have come. I am sure they will master it one day, so that you can't see the difference between real and fake. Great video!

SpicedFuture
Author

I am interested in a link to these modified versions that work standalone without any other installation, please. I would like to see how you used them. I am stuck on one point.

yvankoabiloa
Author

Hello, cool video and powerful program. But is it possible to create your own voice package so that it takes only fractions of a second to play text-to-speech, as in pyttsx3?

johannex.
Author

How did you generate the human avatar for this video? I’m currently using Synthesia but it’s quite expensive. Wondering if there’s a better tool.

fabianschierz
Author

That was insane, and thoroughly entertaining. Thank you.

alephd
Author

Merci, thank you Alex Borg!

Is there any way to make it sound more realistic, by any chance?
For example, if you train it with more .wav files? What do you think?
Subbed btw

lBioHaZarDl
Author

Hi! Thank you for that great Tutorial! And thank you for providing that downloadable Folder. <3

Ilovepapayas
Author

What an awesome program! I installed it in Anaconda on Windows and it’s FANTASTIC! :)

lazerusmfh
Author

I always get this error: ['utf-8' codec can't decode byte 0xa0 in position 20: invalid start byte] when I try wav2lip. Could you please give me some advice?

gaborjuhasz
Author

I want to train a voice model in Hindi. How do I do this? Please help.

pratugames
Author

Please provide the minimum specifications of a laptop that can run this tool.

hengkyyudhiwijaya
Author

the audio is so out of sync with the lip movements in this entire video

ekkamailax
Author

very useful.. is there a faster voice clone process? this is good but too slow

hengkyyudhiwijaya
Author

Got this error..

nvrtc: error: invalid value for --gpu-architecture (-arch)

stevecommand
Author

An error came up with the pre-setup version from the download link when running the voice generator: failed to open nvrtc-builtins64_113.dll. Not sure how to resolve this.

michaelbishop
Author

I just downloaded the two links, but I do not understand which app to open from them. Can anyone walk me through it? No matter how hard I try, I still can’t do it, even after watching the tutorial video over and over again.

adexkadex
Author

How can I make it read from a .txt file instead of typing my text?

hellasenpai
Author

I keep getting this when I run the start.bat.
Traceback (most recent call last):
File "tortoise/read_ask.py", line 30, in <module>
tts =
File "C:\Users\willw\Desktop\VoiceClone\tortoiseTTS\conda3\tortoise\api.py", line 246, in __init__
self.vocoder.load_state_dict(torch.load(get_model_path('vocoder.pth', models_dir),
File "C:\Users\willw\Desktop\VoiceClone\tortoiseTTS\conda3\lib\site-packages\torch\serialization.py", line 705, in load
with as opened_zipfile:
File "C:\Users\willw\Desktop\VoiceClone\tortoiseTTS\conda3\lib\site-packages\torch\serialization.py", line 242, in __init__
super(_open_zipfile_reader,
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

atruebiblicalchurch
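
A note on the `failed finding central directory` traceback above: this error often means that a model file was only partially downloaded. Model files saved in PyTorch's current format are zip archives, so a quick way to spot a truncated one is to check whether each `.pth` file is still a readable zip. The sketch below does that with the standard library only; the function name and the `models` folder are assumptions, and a file saved in PyTorch's older non-zip format would also be flagged, so treat the result as a hint rather than proof.

```python
import zipfile
from pathlib import Path


def find_broken_models(models_dir):
    """Return the names of .pth files that are not complete zip archives."""
    root = Path(models_dir)
    if not root.is_dir():
        return []
    broken = []
    for path in sorted(root.glob("*.pth")):
        # is_zipfile looks for the zip end-of-central-directory record,
        # which is missing when a download was cut short.
        if not zipfile.is_zipfile(path):
            broken.append(path.name)
    return broken


if __name__ == "__main__":
    # Hypothetical folder name; use the folder where the models were saved.
    for name in find_broken_models("models"):
        print("possibly incomplete:", name)
```

Deleting a flagged file and running "start.bat" again should trigger a fresh download, since the models are fetched automatically on start.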