Mamba Might Just Make LLMs 1000x Cheaper...

Will Mamba bring a revolution to LLMs and challenge the status quo? Or is it just a cope that won't last in the long term? Looking at the current trajectory, we might not need transformers if Mamba can actually scale, but attention is probably here to stay.

Special thanks to
- Gifted Gummy Bee
for helping with this video!

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

Transformer: Attention Is All You Need

Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

Efficiently Modeling Long Sequences with Structured State Spaces

Flash Attention

Flash Attention 2

VMamba: Visual State Space Model

MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts

MambaByte: Token-free Selective State Space Model

Repeat After Me: Transformers are Better than State Space Models at Copying

This video is supported by the kind Patrons & YouTube Members:
🙏Andrew Lescelius, alex j, Chris LeDoux, Alex Maurice, Miguilim, Deagan, FiFaŁ, Daddy Wen, Tony Jimenez, Panther Modern, Jake Disco, Demilson Quintao, Shuhong Chen, Hongbo Men, happi nyuu nyaa, Carol Lo, Mose Sakashita, Miguel, Bandera, Gennaro Schiano, gunwoo, Ravid Freedman, Mert Seftali, Mrityunjay, Richárd Nagyfi, Timo Steiner, Henrik G Sundt, projectAnthony, Brigham Hall, Kyle Hudson, Kalila, Jef Come, Jvari Williams, Tien Tien, BIll Mangrum, owned, Janne Kytölä, SO, Richárd Nagyfi

[Music] massobeats - midnight
[Video Editor] @askejm, Lunie
Comments

Bro dropped the hardest LLM anime edit and thought we wouldn't notice

xthesayuri

7:20 the greatest LLM anime of all time begins (JJK is within 2 letters of LLM)


😂 I did not expect The JJK EDIT and died laughing

sleepingArtist

Good so I guess OpenAI no longer needs 7 Trillion dollars for chip factories.

itzhexen

That anime edit was one of the sickest media pieces I've seen, but unfortunately I have no friends in the intersection of jujutsu enjoyers and AI research connoisseurs who would appreciate it wholly

OxidoPEZON

"Exponentially" should stop being misused for everything that is bigger than linear... Quadratic != exponential

lelouch

In the early 2000s, here in Russia, Mamba was a very popular dating site. Good to hear they are now at the frontier of AI development!!!

awesomebearaudiobooks

didn’t expect lobotomy kaisen to make its way to the LLM and AI space😭😭 best thing ever

JoshuaEworo

I love seeing this trend of LLMs getting quicker and using fewer resources. I think we are only a few breakthroughs away from a point where LLMs can begin running on mobile devices locally at reasonable speeds. Right now companies are spending major resources on making the models smarter through the models themselves. However, make the model small and quick enough, and you could run it multiple times, prompted by hard-coded logic, to possibly accomplish the same things as the larger models without the need for as much power or space (at the cost of time). This could allow an LLM to exist on robots without being connected to a service. The technology is in the works for quick instruction following for robots, so an LLM being able to feed the robot instructions makes the robot self-guiding, which would be a sight.

Spyblox

"Stand proud Transformer, you are strong."
- Mamba

wlockuz

throughout youtube clickbait and interesting facts you are the honored one

lordm

naaah lobotomy kaisen is taking over everything i swear 💀💀💀💀

nawabifaissal

it scales quadratically, not exponentially

xthesayuri

this is crazy, every time I click I think I'm going to watch a Fireship vid

razieren

The Gojo reference made me shout out loud like a little fangirl :')

ambinintsoahasina

The last thing I expected was a jjk edit

avizi_

awesome breakdown. When the other AI hype channels asked bycloud if he could go head to head with their surface level analysis, bycloud responded "Nah, I'd win" (DEEPFRIED BASS)

Artorias

Ngl the thumbnail tricked me, thought it was a fireship video, but it worked lol and I’m still watching.

dualasus

i have some hope for byte mamba but the architecture has drawbacks and seems more like an intermediary step before something greater that builds on it

JazevoAudiosurf

I was not expecting to get a lobotomy while watching an LLM news video today...

david_n_nettey