How do LLMs work? Next Word Prediction with the Transformer Architecture Explained

From the podcast episode with Jay Alammar: Building LLM Apps & the Challenges That Come with It. The What's AI Podcast, Episode 16: @arp_ai!

How to start in AI/ML - A Complete Guide:

Become a member of the YouTube community, support my work, and get a cool Discord role:

#transformers #gpt #llm
Comments

So great you could make this interview! I've been really inspired by The Illustrated Transformer guide in the past. Keep up the good work!

fulcrumthewhite

Thank you for this.
Wow! 90-100 encoders and decoders.

vincent_hall

How has this guy never seen Shawshank Redemption?!

seannews

Nice explanation. I understand the attention mechanism a bit better now. Thank you.
Makes me wonder: given enough compute, what would such a model be capable of with a trillion layers? The sum of all human knowledge would flop around like a BB in a boxcar, and yet I'd bet 2 + 2 would still = yes (occasionally).

thomasgoodwin
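
Since the attention mechanism comes up in the comments above, here is a minimal sketch of scaled dot-product attention in NumPy. It implements the standard formula softmax(QK^T / sqrt(d_k)) V from the original Transformer paper; the array shapes and variable names are illustrative only, not taken from the video.

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Compare each query against every key, scaled by sqrt of the key dimension
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax over keys (subtracting the row max for numerical stability)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output position is a weighted average of the value vectors
    return weights @ V

# Toy example: 4 token positions, one 8-dimensional attention head
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8): one context-aware vector per token

In a full Transformer, many of these attention layers are stacked (the large models mentioned above use on the order of a hundred such blocks), and the final layer's output is projected to a probability over the vocabulary to predict the next word.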