A BETTER RVC Training Method for AI Voices?

Links referenced in the video:

Hardware for my PC:

Alternative prebuilds to my PC:

Cheapest recommended PC:

Come join The Learning Journey!

If you found anything helpful, please consider supporting me and the content I am trying to produce!
Comments

Do you mean that for training you provided a dataset directory with just one large audio file for the process data step in RVC?
Or did you have multiple audio files, each from a YouTube video etc., that you fed into the process data step?

TheNoobyworld

I was thinking this is the default way to train 😂. I didn't really feed in multiple snippets or anything like that; I just used 1/2/3 giant files as datasets and everything works well. I also have a small question: if you were to buy a GPU to train and use, which one would be decent and also affordable (mid-range to upper entry level)? I would like to buy a new GPU and I don't know which one to get yet. I've seen that having a lot of VRAM seems to help a lot, but I've also seen some comparison videos and it sometimes depends. Another question: how does an AMD Radeon 7800 XT do when training?

mirage_zoe

Hello my friend, I need your help. I have a trained model and 500 audio clips, and I want to convert them all at once. How can that be done? None of the clips is longer than 10 seconds.

aji

What if I want an AI voice that can speak calmly, like reading a book out loud, and can also scream, like a football coach? Should I feed both calm and screaming recordings into one training session?

HR-zgci

Unfortunately, I'm not sure what's being discussed here, because I just manually split and export selected parts of the dataset in Audacity as clips shorter than 10 seconds, and I avoid selecting silent parts.

AlexanderKuznetsovAKASergei
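
The manual approach described in the comment above (cutting the dataset into clips shorter than 10 seconds and skipping silent parts) could be roughly automated as follows. This is only a sketch using the Python standard library, not what RVC or Audacity does internally. It assumes 16-bit mono PCM WAV input, and the names `split_non_silent`, `frame_ms` and `threshold` as well as the threshold value of 500 are made up for illustration. A real tool would probably require a minimum silence length before cutting.

```python
import array
import sys
import wave


def read_wav_mono16(path):
    """Read a 16-bit mono PCM WAV file and return (samples, sample_rate)."""
    with wave.open(path, "rb") as wf:
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("this sketch expects 16-bit mono PCM")
        rate = wf.getframerate()
        samples = array.array("h", wf.readframes(wf.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    return list(samples), rate


def write_wav_mono16(path, samples, rate):
    """Write a list of 16-bit samples as a mono PCM WAV file."""
    data = array.array("h", samples)
    if sys.byteorder == "big":
        data.byteswap()
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(rate)
        wf.writeframes(data.tobytes())


def split_non_silent(samples, rate, max_seconds=10.0, frame_ms=20, threshold=500):
    """Cut samples into chunks of at most max_seconds, dropping silent frames.

    A frame counts as silent when its peak amplitude is below `threshold`
    (an assumed value; tune it for your recordings).
    """
    frame_len = max(1, rate * frame_ms // 1000)
    max_len = int(rate * max_seconds)
    chunks, current = [], []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        if max(abs(s) for s in frame) < threshold:
            # Silent frame: end the current chunk here and skip the frame.
            if current:
                chunks.append(current)
                current = []
            continue
        if current and len(current) + len(frame) > max_len:
            # Adding this frame would exceed the length limit: start a new chunk.
            chunks.append(current)
            current = []
        current.extend(frame)
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk could then be saved with `write_wav_mono16` under a numbered file name and placed in the training dataset folder.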

Hey Jarods, please make a video on how to fine-tune the model. Hoping for a positive response from your side.

prateekkumarsingh

And what is the advantage of having more tensor cores? I have 384 on my 4080; does that affect my batch size? In other words, if I could upgrade my VRAM, would I also need more tensor cores to reach a higher batch size?

denblindedjaligator

Thanks for your videos 🙏🏻 Quick question: is it possible to train a model using, say, 2 files of ~1 hour each (voice only)?
Is the workflow the same?

Sebax

How can I auto-detect the index file? I am using a screen reader and I have to browse the combo box. What are the singer and speaker IDs?

denblindedjaligator

I have found a model that was trained with the setting False. When I change the pitch, it still affects the model. How did you do it? If I make a model and set it to False, I can't do the same. Imagine an autotuner here.

denblindedjaligator

Hi Mike. If I make a model that doesn't respect pitch, how can I get it to change pitch when I transpose it? I have a Daft Punk vocoder model, and when I transpose, the model works even though it has no pitch. I can make a recording of it if you like. Do you have Dropbox?

denblindedjaligator

Hey Jarods, do you have a good recommendation for a proper Google Colab link? I still haven't been able to install RVC locally on my Mac, and no one over at Discord seems able to help, so I guess I'll have to go with the online option instead. There are quite a few different links out there, and many of them don't do a good job: they either hang or time out halfway, and error messages keep showing up every now and then. I just can't train a model properly!

macdoctorsg

If I make a model that doesn't respect pitch, how can I get it to change pitch when I transpose it? I have a Daft Punk vocoder model, and when I transpose, the model works even though it has no pitch. Also, what batch size should I train on?

denblindedjaligator

What settings do you recommend for a GTX 1060?
I don't have the money for a brand new card, so I'm going to use my current one.

KuletXCore

What happens if you clump all the small splits into one big audio clip and then train the model from there?

Matchstickn
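
For anyone who wants to try the experiment from the comment above, joining many small splits back into one big clip can be sketched with Python's standard `wave` module. This is a minimal sketch that assumes all inputs are uncompressed PCM WAV files sharing the same channel count, sample width and sample rate. The function name `concat_wavs` is made up for illustration, and the sketch says nothing about how the resulting model will sound.

```python
import wave


def concat_wavs(paths, out_path):
    """Append the audio of every file in `paths`, in order, into one WAV file."""
    if not paths:
        raise ValueError("no input files given")
    expected = None
    with wave.open(out_path, "wb") as out:
        for path in paths:
            with wave.open(path, "rb") as src:
                fmt = (src.getnchannels(), src.getsampwidth(), src.getframerate())
                if expected is None:
                    # The first file decides the output format.
                    expected = fmt
                    out.setnchannels(fmt[0])
                    out.setsampwidth(fmt[1])
                    out.setframerate(fmt[2])
                elif fmt != expected:
                    raise ValueError(f"{path} has format {fmt}, expected {expected}")
                out.writeframes(src.readframes(src.getnframes()))
```

A call such as `concat_wavs(sorted_clip_paths, "combined.wav")` (both names hypothetical) would then produce a single file to place in the dataset folder.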

Hello, when I use Colab, the connection suddenly drops during the most time-consuming part, and when I try to connect again, it does not work at all. This has been happening to me for several days; please give me a solution so that I can use it again.

eventfakt

Hi! Thank you for sharing your knowledge. Do you know which scientific paper supports RVC? I'm working on a paper where I need to cite RVC, but I can't find the correct paper to cite. Thank you!

JonathanSantosDeveloper

Hey, I am struggling to make the voice sound clear; there's always artifacting or whatever it's called (distortions in the vocals, not on every word but on some words). I usually use about 3 files of 5-10 minutes each in my training folder. The audio is fairly clear and there's not much background noise; maybe the mic quality isn't perfect, but it's still really scuffed. Any tips to make it sound clearer? Or some videos? Should I separate it into smaller audio files? Should I do more audio processing before I start training? Should I train for longer than 400 epochs? I'm lost :c

deadvesu

How much space should I expect to use if I’m new to all this? I’m getting a new computer but want to make sure I get the right amount of space on it… if you don’t mind me asking

HappyHostages

How do I work with more than one speaker or more than one model? I'm trying to keep the model I've already trained but start a new model or speaker and be able to select between the two and maybe add more later. Do I need to clear out any files from the first training before I begin the new one, etc.? I tried to add a second speaker but it seems that the "training" went way too fast to have actually worked and the vocals.out.wav is the voice from the first speaker.

magickey