How Distributed Training Will Revive Open Source AI

Check out my newsletter:

Petals: Collaborative Inference and Fine-tuning of Large Models

A Preliminary Report on DisTrO

INTELLECT-1

DiLoCo: Distributed Low-Communication Training of Language Models

OpenDiLoCo

DeMo
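
Of the papers above, DiLoCo is perhaps the easiest to summarize in code. Below is a minimal sketch of its two-level loop, assuming PyTorch; the `worker.loss` helper and the hyperparameter values are illustrative assumptions, not the authors' reference implementation:

```python
# Minimal DiLoCo sketch: H local AdamW steps per worker, then one outer
# Nesterov-momentum step on the averaged pseudo-gradient. `worker.loss`
# is a hypothetical per-shard loss function.
import copy
import torch

def diloco_round(global_model, workers, inner_steps=500, outer_lr=0.7):
    snapshot = [p.detach().clone() for p in global_model.parameters()]

    local_params = []
    for worker in workers:
        replica = copy.deepcopy(global_model)            # start from global weights
        inner_opt = torch.optim.AdamW(replica.parameters())
        for _ in range(inner_steps):                     # no communication here
            loss = worker.loss(replica)
            inner_opt.zero_grad()
            loss.backward()
            inner_opt.step()
        local_params.append([p.detach() for p in replica.parameters()])

    # Pseudo-gradient = old weights minus the average of the workers' new
    # weights; the paper applies it with Nesterov momentum SGD.
    outer_opt = torch.optim.SGD(global_model.parameters(),
                                lr=outer_lr, momentum=0.9, nesterov=True)
    outer_opt.zero_grad()
    for i, p in enumerate(global_model.parameters()):
        avg = torch.stack([lp[i] for lp in local_params]).mean(dim=0)
        p.grad = snapshot[i] - avg
    outer_opt.step()
```

The key property: workers only talk to each other once every `inner_steps` optimizer steps, which is what lets this style of training tolerate slow, internet-grade links.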

This video is supported by the kind Patrons & YouTube Members:
🙏Andrew Lescelius, Ben Shaener, Chris LeDoux, Miguilim, Deagan, FiFaŁ, Robert Zawiasa, Marcelo Ferreira, Owen Ingraham, Daddy Wen, Tony Jimenez, Panther Modern, Jake Disco, Demilson Quintao, Penumbraa, Shuhong Chen, Hongbo Men, happi nyuu nyaa, Carol Lo, Mose Sakashita, Miguel, Bandera, Gennaro Schiano, gunwoo, Ravid Freedman, Mert Seftali, Mrityunjay, Richárd Nagyfi, Timo Steiner, Henrik G Sundt, projectAnthony, Brigham Hall, Kyle Hudson, Kalila, Jef Come, Jvari Williams, Tien Tien, BIll Mangrum, owned, Janne Kytölä, SO, Hector, Drexon, Claxvii 177th, Inferencer, Michael Brenner, Akkusativ, Oleg Wock, FantomBloth, Thipok Tham, Clayton Ford, Theo, Handenon, Diego Silva, mayssam, Kadhai Pesalam, Tim Schulz, jiye, Anushka, Henrik Sundt, Julian Aßmann, Raffay Rana, Thomas Lin, Sid_Cypher, Mark Buckler, Kevin Tai, NO U, Gonzalo Fidalgo, Igor Alvarez, Alon Pluda, Clément Veyssière, Sander Zwaenepoel, etrotta, Binnie Yiu,
Matej Macak, c zhou

[Music] Massobeats - floral
[Music] Massobeats - daydream
[Video Editor] Silas
[3D Animation] @F_E_U

[Bitcoin (BTC)] 3JFMJQVGXNA2HJE5V9qCwLiqy6wHY9Vhdx
[Ethereum (ETH)] 0x3d784F55E0bE5f35c1566B2E014598C0f354f190
[Litecoin (LTC)] MGHnqALjyU2W6NuJSSW9fTWV4dcHfwHZd7
[Bitcoin Cash (BCH)] 1LkyGfzHxnSfqMF8tN7ZGDwUTyBB6vcii9
[Solana (SOL)] 6XyMCEdVhtxJQRjMKgUJaySL8cGoBPzzA2NPDMPfVkKN

Comments

Thanks for the great overview. The importance of distributed training and inference really can't be overstated. The o1 family of models seems to get more expensive to run with each new generation. Soon, only large corporations will be able to afford state-of-the-art LLMs. We need open-source distributed models if we want to keep up.

procedurallygeneratedhuman

I remember distributed.net in the '90s and early 2000s, cracking RC5 and folding proteins... It would be interesting if something like that were possible for training models globally.

vertigoz

I got bamboozled by that sponsor ad; I thought it was part of the video.

jaydeep-p

There was also some signal processing mentioned in some of the grokking papers, where they segregate the fast gradients and slow gradients as the reason for grokking, somewhat similar to what Nous is doing.

KshitijKaushik-wg
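
As a rough illustration of the fast/slow split the comment above describes, here is a minimal sketch using an exponential moving average. This shows only the general flavor of the idea; Nous's DeMo actually decouples momentum with a DCT-based decomposition, and all names here are made up:

```python
# Illustrative fast/slow gradient split via an exponential moving average:
# the EMA tracks the slowly varying component; the residual is the fast one.
# This is NOT DeMo itself, just the general shape of the idea.
import torch

class FastSlowSplit:
    def __init__(self, beta: float = 0.99):
        self.beta = beta
        self.slow = None  # running EMA of past gradients

    def split(self, grad: torch.Tensor):
        if self.slow is None:
            self.slow = torch.zeros_like(grad)
        self.slow.mul_(self.beta).add_(grad, alpha=1 - self.beta)
        fast = grad - self.slow  # rapidly changing residual
        return self.slow.clone(), fast

# e.g. keep the slow part local and only communicate the fast components
splitter = FastSlowSplit()
slow, fast = splitter.split(torch.randn(10))
```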

Nice video! I did my undergrad thesis on federated learning, and as soon as I heard about it I saw the potential of distributed training for open-source models. Let's hope for the best.

Darkon
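
Since the comment above brings up federated learning, here is the classic FedAvg aggregation step as a minimal sketch (weighted averaging of client weights, per McMahan et al., 2017; function and variable names are illustrative):

```python
# Classic FedAvg aggregation: average client model weights, weighted by how
# much data each client trained on.
import torch

def fedavg(client_states: list[dict], client_sizes: list[int]) -> dict:
    total = sum(client_sizes)
    merged = {}
    for key in client_states[0]:
        merged[key] = sum(state[key] * (n / total)
                          for state, n in zip(client_states, client_sizes))
    return merged

# merged_state = fedavg([m.state_dict() for m in client_models], [500, 1200])
# global_model.load_state_dict(merged_state)
```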

The release of the GB10 (DIGITS) is gonna be massive for open source AI

connoraustin

I quite like the new 3D-rendered representations in this video, but for your avatar I would highly recommend sticking with your original flat avatar.

le

2:57 I like how the AI even sounds pissed off and resentful of its job as you keep messing with it.

Amipotsophspond

❔ How does this prevent malicious actors from sharing 'bad' weights?

catfyrr
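
The question above is usually answered with Byzantine-robust aggregation. One common scheme is a coordinate-wise trimmed mean, sketched below; note this is a generic defense, not necessarily what any of the systems in the video implements:

```python
# Coordinate-wise trimmed mean, a generic Byzantine-robust aggregation rule:
# drop the `trim` highest and lowest values per coordinate before averaging,
# so a handful of malicious updates can't drag the result arbitrarily far.
import torch

def trimmed_mean(updates: list[torch.Tensor], trim: int = 1) -> torch.Tensor:
    assert len(updates) > 2 * trim, "need more honest workers than trimmed ones"
    stacked = torch.stack(updates)        # shape: (num_workers, *param_shape)
    ranked, _ = stacked.sort(dim=0)       # sort each coordinate across workers
    kept = ranked[trim:len(updates) - trim]
    return kept.mean(dim=0)
```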

Imagine a continuously updated open-source model where, by helping train it, your computer also installs the updates to your machine. Amazing.

Kopp

How do you find these papers? Do you just review them one by one on arXiv?

yudatriananda

Your contribution to OSS AI is invaluable. Godspeed, dear friend 🔥

setop

Nous researchers picking the worst SEO names so we have to search for them alongside _their_ name (big brain moment)

Beryesa.

I literally implemented a PETALS private swarm yesterday 😆

It's an older paper, but I'm more interested in distributed inference than training.

thomasstahura
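
For anyone wanting to try what the comment above describes, client-side Petals usage looks roughly like this. This is a sketch from memory of the Petals README; the checkpoint name and the `initial_peers` argument for private swarms should be verified against the current docs, and the peer address shown is a made-up placeholder:

```python
# Rough client-side Petals usage, reconstructed from memory of its README.
# INITIAL_PEERS is a placeholder multiaddr for a private swarm; omit the
# argument entirely to join the public swarm instead.
from transformers import AutoTokenizer
from petals import AutoDistributedModelForCausalLM

INITIAL_PEERS = ["/ip4/10.0.0.1/tcp/31337/p2p/QmExamplePeerID"]

model_name = "petals-team/StableBeluga2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoDistributedModelForCausalLM.from_pretrained(
    model_name, initial_peers=INITIAL_PEERS  # transformer blocks run on peers
)

inputs = tokenizer("Distributed inference means", return_tensors="pt")["input_ids"]
outputs = model.generate(inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0]))
```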

How does this stand up now that DeepSeek has described their efficiencies?

pippok

OpenDiLoCo is such a masterfully crafted name

marioornot

I tried making a video a month ago, and holy moly, there is a ton of work to do even with just 5-6 images (it took me like 20-30 hours for a 5-minute video). I don't know how this guy and others manage to find and edit so many memes and references. Man, I enjoy their videos and I'd love to know how they do it.

ChanhDucTuong

There's also Distributed Path Composition (DiPaCo), proposed by the same DeepMind research team, which would've been cool to see mentioned.

I still struggle to differentiate it from DiLoCo, though.

aymanbenbaha

By the end of 2025 will I be speaking to any human beings over the phone ever again?

watsonwrote

Google talked about federated learning 2 years ago. It's why I disabled their app.

JTient