MAMBA from Scratch: Neural Nets Better and Faster than Transformers
![preview_player](https://i.ytimg.com/vi/N6Piou4oYx8/maxresdefault.jpg)
Mamba is a new neural network architecture that came out this year, and it performs better than transformers at language modelling! This is probably the most exciting development in AI since the transformer in 2017. In this video I explain how to derive Mamba from the perspective of linear RNNs. And don't worry, there's no state space model theory needed!
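The key idea behind parallelizing a linear RNN (covered in the "Parallelizing Linear RNNs" chapter) is that each step h_t = a_t * h_{t-1} + b_t * x_t is an affine map of the state, and affine maps compose associatively, so a prefix scan applies. A minimal sketch, assuming a scalar state and with precomputed c_t = b_t * x_t (all names here are illustrative, not from the video):

```python
# Sketch: a linear recurrence h_t = a_t * h_{t-1} + c_t is an affine
# map h -> a*h + c, and composing two affine maps is associative.
# That associativity is what lets real implementations compute the
# recurrence with a parallel prefix scan instead of a sequential loop.

def combine(f, g):
    """Compose two affine maps (a, c): apply f first, then g."""
    a1, c1 = f
    a2, c2 = g
    return (a1 * a2, a2 * c1 + c2)

def sequential(steps, h0=0.0):
    """Plain sequential RNN loop over (a_t, c_t) pairs."""
    h, out = h0, []
    for a, c in steps:
        h = a * h + c
        out.append(h)
    return out

def prefix_scan(steps, h0=0.0):
    """Inclusive scan under `combine`; written left-to-right for
    clarity, but associativity allows splitting across chunks."""
    out, acc = [], None
    for s in steps:
        acc = s if acc is None else combine(acc, s)
        a, c = acc
        out.append(a * h0 + c)  # apply the composed map to h0
    return out

steps = [(0.9, 1.0), (0.5, -2.0), (1.1, 0.3), (0.7, 0.0)]
assert sequential(steps) == prefix_scan(steps)
```

A real implementation (e.g. on GPU) would run the scan tree-style over chunks in parallel; the point of the sketch is only that `combine` is associative, so the order of grouping doesn't change the result.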
#mamba
#deeplearning
#largelanguagemodels
00:00 Intro
01:33 Recurrent Neural Networks
05:24 Linear Recurrent Neural Networks
06:57 Parallelizing Linear RNNs
15:33 Vanishing and Exploding Gradients
19:08 Stable initialization
21:53 State Space Models
24:33 Mamba
25:26 The High Performance Memory Trick
27:35 The Mamba Drama
Mamba: Linear-Time Sequence Modeling with Selective State Spaces (Paper Explained)
Mamba Might Just Make LLMs 1000x Cheaper...
Mamba, SSMs & S4s Explained in 16 Minutes
Introduction to Mamba SSM in PyTorch 🤖 🐍
MAMBA and State Space Models explained | SSM explained
Mamba - a replacement for Transformers?
Mamba and S4 Explained: Architecture, Parallel Scan, Kernel Fusion, Recurrent, Convolution, Math
Mamba Language Model Simplified In JUST 5 MINUTES!
The genius of Andrej Karpathy | John Carmack and Lex Fridman
State Space Models (SSMs) and Mamba
JAMBA MoE: Open Source MAMBA w/ Transformer: CODE
Deep dive into how Mamba works - Linear-Time Sequence Modeling with SSMs - Arxiv Dives
MAMBA AI (S6): Better than Transformers?
Mamba, Mamba-2 and Post-Transformer Architectures for Generative AI with Albert Gu - 693
LSTM Top Mistake In Price Movement Predictions For Trading
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Vision Mamba BEATS Transformers!!!
Transformers: The best idea in AI | Andrej Karpathy and Lex Fridman
Mamba sequence model - part 1
Do we need Attention? A Mamba Primer
Create a Large Language Model from Scratch with Python – Tutorial
Super Simple Neural Network Explanation | Machine Learning Science Project
Mamba vs. Transformers: The Future of LLMs? | Paper Overview & Google Colab Code & Mamba Cha...
Comments