Kokoro Local TTS + Custom Voices

Kokoro is a small, high-quality TTS model that runs easily both in Colab and locally.

For more tutorials on using LLMs and building agents, check out my Patreon

🕵️ Interested in building LLM Agents? Fill out the form below

👨‍💻Github:

⏱️Time Stamps:
00:00 Intro
01:04 TTS Arena: Benchmarking TTS Models
02:06 Kokoro Model Card
03:05 Kokoro Onnx Github
03:50 Colab Demo
07:27 Blending Custom Voices
10:58 Kokoro Onnx Demo (Installing Locally)

Related videos

Comments

Thanks. I have realized over the last few days that what we need (as a start) for many models is simple guides like this to get started.

KrullMaestaren

hmm Tiny TTs is definitely an interesting name

andherium

XTTS v2 is still the best so far. I've been using XTTS v2 since last year and I'm surprised there isn't another TTS that can compete with it. Not only does it have a lot of voices that sound really good, but they are also multilingual, so they sound good in most languages, including Spanish and others that most models don't have. Oh, and it's fast even on CPU only.

ElChapoDel

I've been waiting for this for so long. Being able to turn any PDF/text file into an audio book should have been possible so long ago.

kevin.malone

Would love to see a video on conversations with local agents.

mageshyt

Very cool idea! I made a branch of hexgrad's current repo that incorporates a weights option natively, and allows mixing an arbitrary number of voices. Pull request submitted.

In any case, thanks! I like Kokoro a lot and wanted any ability to slightly tweak the voices given the limited set available. With this I was able to dial a couple in just a little bit more to my liking, and it's super simple.

timm
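
The voice mixing described in the comment above (a weights option over an arbitrary number of voices) can be sketched as a weighted average of voice vectors. This is only a minimal illustration, assuming each voice is available as an equal-length vector of floats. The function name blend_voices, the voice names, and the toy numbers are hypothetical; they are not taken from Kokoro or from the pull request mentioned above, whose actual implementation may differ.

```python
from typing import Dict, List


def blend_voices(voices: Dict[str, List[float]],
                 weights: Dict[str, float]) -> List[float]:
    """Return the weighted average of the named voice vectors.

    Weights are normalized to sum to 1, so {"a": 3, "b": 1} means
    75% of voice "a" and 25% of voice "b".
    """
    if not weights:
        raise ValueError("at least one voice weight is required")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")

    # All selected voices must share one vector length.
    sizes = {len(voices[name]) for name in weights}
    if len(sizes) != 1:
        raise ValueError("all voice vectors must have the same length")

    blended = [0.0] * sizes.pop()
    for name, weight in weights.items():
        share = weight / total
        for i, value in enumerate(voices[name]):
            blended[i] += share * value
    return blended


# Toy 3-dimensional vectors standing in for real voice embeddings (hypothetical).
toy_voices = {
    "voice_a": [1.0, 0.0, 2.0],
    "voice_b": [0.0, 1.0, 4.0],
}
mix = blend_voices(toy_voices, {"voice_a": 3, "voice_b": 1})
print(mix)
```

In practice the vectors would be tensors or arrays loaded from the model's voice files, and the same normalization lets you nudge a voice slightly by giving a second voice a small weight.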

What we need is a model that gives precise control over the emotion, intonation, cadence, pacing, volume, timing and pitch of the voices, not more monotone models.

jmg

One potentially cool application of blending would be to blend between voice styles like laughing, crying, angry, etc., based on what's being said (maybe with a small LLM) and other things.

CapsAdmin

I really liked this video! Any plans to also make a video about "training" your own embeddings for the model with your own data? Would love to see an easy tutorial for that 😉

bastothemax

This would be good for people that want to run something like Alexa locally at home. I know some people have been putting together systems for home assistant. While maybe the OpenAI integration might sound slightly better I'd consider this more than good enough to replace that and not have to send your data to OpenAI.

pin

interesting, the interpolation part shocked me, thanks

sajjaddehghani

Great overview! I was curious if anyone has used this in a local voice chatbot and if the processing time is fast enough to use in real time.

mikew

Thanks.
You have given me another reason to buy a Mac mini M4 😉

khangvutien

Should I train this model for a local language, Assamese?

mehdiaslam

Would be great if you could just clone a voice from reference MP3s, like the other TTS models.

ScriptGurus

Is there any way to get it to pronounce words correctly? It's not able to pronounce "live" in "I live in a house" any differently from "live" in "that is a live wire". I am sure this isn't the only problem, but it is common enough to make it a show stopper for articles and ebooks.

gibsononbooks

Would love to see you host the whole project locally and use it.

Zyphorix

Please help. How can we deploy and run it on Windows?

XITIJTHOOL

Is it possible to train your own model from scratch for a language other than US English?

helloworld

Hey there. Is your colab link still working? It's not for me. Thanks!!

ChatSites_io

How to setup the BEST Free Text-to-Speech Locally 💥 Kokoro TTS Local Installation 💥

'Kokoro v0.19: #1 Ranked TTS Model with Realistic Voices | Easy Local Installation!'

Kokoro TTS in ComfyUI - A Lightweight Text To Speech AI Model Running Locally

AI Voice Mixer Studio - Kokoro TTS - Install Locally

Most Realistic AI Text to speech! 100% Free, No Copyright, Offline | Kokoro TTS v1 - Easy Setup

Kokoro TTS with custom voices | Best open source TTS model

Kokoro TTS Current Best Open Source TTS - Better Than ElevenLabs?

ComfyUI Tutorial Series Ep 33: How to Use Free & Local Text-to-Speech for AI Voiceovers

FREE Local Elevenlabs Alternative Text To Speech App With Adam Voice

ChatTTS - Best Quality Open Source Text-to-Speech Model? | Tutorial + Ollama Setup

Run Text-to-Speech Locally: Step-by-Step Guide

TTS with FREE Voice CLONING!!! 💥 Full Text-to-Speech with Voice Cloning Tutorial 💥

My Top 5 Open-Source AI Text-to-Speech Models

My Top 5 Open Source Text to Speech Softwares Starting off in 2024

F5-TTS! They DID IT! Perfect voice clone with Emotion with a 10-second sample!

Build a Talking Smarter-Than-You AI Girlfriend (DeepSeek R1 Tutorial)

This free AI Text-to-Speech is insane! Add emotions & make podcasts

【Demo】AI Avatar - Lip-sync, Text to Speech #Live2D

The Free Open-Source Text-to-Speech Model That Rivals Premium Options | Simplify AI

Not ElevenLabs, This new #1 Text to Speech AI is FREE!!!!

Generate an Hour of Audio Per Minute with Kokoro 82M Locally - FastKoko

This FREE AI Voice is Better than ElevenLabs (Kokoro-82M)

How to Train & Install F5 TTS - New Language and Single Speaker Voice Clone