1 Million Tiny Experts in an AI? Fine-Grained MoE Explained


Mixture of Experts explained, well, re-explained. We are in the fine-grained era of Mixture of Experts, and it's about to get even more interesting as we scale it up further.
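For anyone who wants the core idea in code: the sketch below shows a minimal fine-grained MoE layer, i.e. a router that activates only the top-k of many small experts per token. The class name and sizes (FineGrainedMoE, d_expert, top_k, and so on) are made up for illustration and don't reproduce the exact formulation of any paper linked below.

```python
# Illustrative sketch only: many small experts, a router that keeps the top-k per token.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FineGrainedMoE(nn.Module):
    def __init__(self, d_model=512, n_experts=256, d_expert=64, top_k=8):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)                     # scores every expert
        self.w_in = nn.Parameter(torch.randn(n_experts, d_model, d_expert) * 0.02)
        self.w_out = nn.Parameter(torch.randn(n_experts, d_expert, d_model) * 0.02)

    def forward(self, x):                                               # x: (tokens, d_model)
        scores = self.router(x)                                         # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)                  # keep only top-k experts
        weights = F.softmax(weights, dim=-1)                            # gate weights for those k
        out = torch.zeros_like(x)
        for t in range(x.shape[0]):                                     # naive loop for clarity
            for w, e in zip(weights[t], idx[t]):
                h = F.gelu(x[t] @ self.w_in[e])                         # tiny expert FFN
                out[t] += w * (h @ self.w_out[e])
        return out

moe = FineGrainedMoE()
tokens = torch.randn(4, 512)
print(moe(tokens).shape)  # torch.Size([4, 512])
```

The naive per-token loop is only for readability; real implementations batch the expert computation and usually add a load-balancing loss so the router doesn't collapse onto a handful of experts.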

This video was sponsored by Brilliant

Check out my newsletter:

Special thanks to LDJ for helping me with this video

Mixtral 8x7B Paper

Sparse MoE (2017)

Adaptive Mixtures of Local Experts (1991)

GShard

Branch-Train-MiX

DeepSeekMoE

MoWE (from the meme at 7:51)

Mixture of A Million Experts

This video is supported by the kind Patrons & YouTube Members:
🙏Andrew Lescelius, alex j, Chris LeDoux, Alex Maurice, Miguilim, Deagan, FiFaŁ, Robert Zawiasa, Daddy Wen, Tony Jimenez, Panther Modern, Jake Disco, Demilson Quintao, Shuhong Chen, Hongbo Men, happi nyuu nyaa, Carol Lo, Mose Sakashita, Miguel, Bandera, Gennaro Schiano, gunwoo, Ravid Freedman, Mert Seftali, Mrityunjay, Richárd Nagyfi, Timo Steiner, Henrik G Sundt, projectAnthony, Brigham Hall, Kyle Hudson, Kalila, Jef Come, Jvari Williams, Tien Tien, BIll Mangrum, owned, Janne Kytölä, SO, Hector, Drexon, Claxvii 177th, Inferencer, Michael Brenner, Akkusativ, Oleg Wock, FantomBloth

[Music] massobeats - daydream
[Video Editor] @Askejm
Comments

Like this comment if you wanna see more MoE-related content, I have quite a good list for a video ;)

bycloudAI

Imagine assembling 1 million PhD students together to discuss someone's request like "write a poem about cooking eggs with C++". That's MoE irl

progameryt

The only thing in my mind is "MoE moe"

maickelvieira

to some extent this seems closer to how brains work

gemstone

i see what you did there with "catastrophic forgetting" lmao 🤣

randomlettersqzkebkw

1991... We are standing on the shoulders of giants.

ChristophBackhaus

It's crazy how Meta's 8B-parameter Llama 3 model has nearly the same performance as the original GPT-4 with its rumored 1.8T parameters.

That's a 225x reduction in parameter count in just 2 years.

GeoMeridium

I watch your videos yet I have no idea what you are explaining 99% of the time. 🙃

Saphir__

Now I'm really excited for an 800B fine-grained MoE model to surface that I can run on basically any device.

Quantum_Nebula

This video format is GOLD 🏆 such specific and nerdy topics produced as memes 😄

AkysChannel

3:37 wasn't it just yesterday that they released their model? 😭

lazyalpaca

Thank u for linking the papers in the description ❤

farechildd

I watch you so that I feel smart, it really works!

cdkw

Damn... You blew my mind with the 1 million experts and forever-learning thing

simeonnnnn

In a very real sense, the MoME concept is similar to diffusion networks. On their own, the tiny expert units are but grains of noise in an ocean, and the routing itself is the thing being trained. Whether or not it's more efficient than having a monolithic neural net with simpler computation units, I dunno. I suspect, like most things in ML, there is probably a point of diminishing returns.
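For anyone wondering what "the routing itself is the thing being trained" could look like concretely, here is a rough, hypothetical sketch: a huge pool of single-neuron experts selected by a learned dot-product lookup over expert keys. It is only the bare idea, not the product-key retrieval used in the actual Mixture of A Million Experts paper, and all sizes are invented for illustration.

```python
# Rough sketch: a big pool of single-neuron "experts"; the learned routing
# (a dot-product lookup over expert keys) does the heavy lifting.
import torch
import torch.nn.functional as F

d_model, n_experts, top_k = 256, 100_000, 16

keys  = torch.randn(n_experts, d_model) * 0.02   # learned routing keys
w_in  = torch.randn(n_experts, d_model) * 0.02   # each expert: one input neuron...
w_out = torch.randn(n_experts, d_model) * 0.02   # ...and one output neuron

def tiny_expert_layer(x):                        # x: (d_model,)
    scores = keys @ x                            # score every tiny expert
    top_scores, idx = scores.topk(top_k)         # keep only the top-k of them
    gates = F.softmax(top_scores, dim=-1)
    acts = F.gelu(w_in[idx] @ x)                 # (top_k,) single-neuron activations
    return (gates * acts) @ w_out[idx]           # weighted sum of output neurons

x = torch.randn(d_model)
print(tiny_expert_layer(x).shape)  # torch.Size([256])
```

With experts this small, almost all of the interesting learned structure ends up in the keys, which is the point the comment above is making.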

KCMNJL

Was hoping someone would make a video on this! Thank you! Would love to see you cover Google's new Diffusion Augmented Agents paper.

pathaleyguitar

Yo dog, I heard you liked AI so we put an AI inside your AI which has an AI in the AI which can AI another AI so that you can AI while you AI.

shApYT

Mixture of a million experts just sounds like a sarcastic description of Reddit

NIkolla

Bro, did you read about Lory? It merges experts with soft merging, building on several papers. Lory is a fresh coat of paint on a method developed for vision AI, making soft merging possible for LLMs. ❤
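To sketch what "soft merging" is gesturing at: instead of hard-routing each token to a few experts, blend the expert weight matrices themselves with the router's probabilities, so the whole layer stays differentiable. The snippet below is only that bare idea with made-up sizes, not Lory's actual segment-level merging procedure.

```python
# Generic soft-merging sketch: the router's probabilities blend expert weights
# into one temporary FFN, so no hard, non-differentiable expert selection happens.
import torch
import torch.nn.functional as F

d_model, d_hidden, n_experts = 128, 512, 4

router = torch.randn(n_experts, d_model) * 0.02
w_in   = torch.randn(n_experts, d_model, d_hidden) * 0.02
w_out  = torch.randn(n_experts, d_hidden, d_model) * 0.02

def soft_merged_ffn(x):                                     # x: (d_model,)
    probs = F.softmax(router @ x, dim=-1)                   # soft routing weights
    merged_in  = torch.einsum("e,eij->ij", probs, w_in)     # one blended expert
    merged_out = torch.einsum("e,eij->ij", probs, w_out)
    return F.gelu(x @ merged_in) @ merged_out

x = torch.randn(d_model)
print(soft_merged_ffn(x).shape)  # torch.Size([128])
```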

ickorling

Actually a really cool idea, I liked the DeepSeekMoE version too, it's so clever

soraygoularssm