Mamba Might Just Make LLMs 1000x Cheaper...

Will Mamba bring a revolution to LLMs and challenge the status quo? Or is it just a cope that won't last in the long term? The core appeal is that Mamba replaces quadratic self-attention with a linear-time selective state-space scan, which is where the potential cost savings come from. Looking at the trajectories right now, we might not need transformers if Mamba can actually scale, but attention is probably still here to stay.
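As a rough illustration of the cost argument, here is a minimal sketch of the linear-time recurrence at the heart of SSMs. It assumes a toy fixed-parameter diagonal state-space model; the names and shapes are illustrative, not the paper's implementation, and Mamba's actual "selective" version makes the A, B, C parameters input-dependent and computes the scan with a hardware-aware kernel.

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Toy diagonal state-space recurrence: h_t = A*h_{t-1} + B*x_t, y_t = C.h_t.

    One pass over the sequence: O(length) time with O(1) state per step,
    unlike self-attention's O(length^2) pairwise score matrix.
    """
    h = np.zeros(A.shape[0])       # hidden state carried across time steps
    ys = []
    for x_t in x:                  # single linear-time sweep
        h = A * h + B * x_t        # update state from previous state + input
        ys.append(C @ h)           # read out a scalar from the state
    return np.array(ys)

# Usage: a length-8 scalar input run through a 4-dimensional state.
rng = np.random.default_rng(0)
x = rng.normal(size=8)
A = np.full(4, 0.9)               # stable diagonal transition
B = rng.normal(size=4)
C = rng.normal(size=4)
print(ssm_scan(x, A, B, C))
```

Because the state h is a fixed-size summary of everything seen so far, generation cost per token stays constant no matter how long the context grows, which is the scaling behavior the video's title is pointing at.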
Special thanks to Gifted Gummy Bee for helping with this video!
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Transformer: Attention Is All You Need
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
Efficiently Modeling Long Sequences with Structured State Spaces
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
VMamba: Visual State Space Model
MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts
MambaByte: Token-free Selective State Space Model
Repeat After Me: Transformers are Better than State Space Models at Copying
This video is supported by the kind Patrons & YouTube Members:
🙏Andrew Lescelius, alex j, Chris LeDoux, Alex Maurice, Miguilim, Deagan, FiFaŁ, Daddy Wen, Tony Jimenez, Panther Modern, Jake Disco, Demilson Quintao, Shuhong Chen, Hongbo Men, happi nyuu nyaa, Carol Lo, Mose Sakashita, Miguel, Bandera, Gennaro Schiano, gunwoo, Ravid Freedman, Mert Seftali, Mrityunjay, Richárd Nagyfi, Timo Steiner, Henrik G Sundt, projectAnthony, Brigham Hall, Kyle Hudson, Kalila, Jef Come, Jvari Williams, Tien Tien, BIll Mangrum, owned, Janne Kytölä, SO
[Music] massobeats - midnight
[Video Editor] @askejm, Lunie