Large Language Models in Five Formulas

Tutorial on building intuition about LLMs.

00:00 - Intro
02:15 - 1: Generation (Perplexity)
15:40 - 2: Memory (Attention)
28:00 - 3: Efficiency (GEMM)
38:40 - 4: Scaling (Chinchilla)
46:37 - 5: Reasoning (RASP)
55:33 - Conclusion
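
For reference, the standard textbook forms of the first four formulas (my own summary; the video's exact notation may differ, and the fifth item, RASP, is a small programming language for describing transformer computations rather than a single equation):

```latex
\mathrm{Perplexity:}\quad \mathrm{PPL}(x_{1:N}) = \exp\!\Big(-\tfrac{1}{N}\sum_{i=1}^{N}\log p(x_i \mid x_{<i})\Big)
\mathrm{Attention:}\quad \mathrm{Attn}(Q,K,V) = \mathrm{softmax}\!\Big(\tfrac{QK^\top}{\sqrt{d}}\Big)\,V
\mathrm{GEMM:}\quad C \leftarrow \alpha\,AB + \beta\,C
\mathrm{Chinchilla:}\quad L(N,D) = E + \tfrac{A}{N^{\alpha}} + \tfrac{B}{D^{\beta}}
```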

Developed for an invited tutorial at the Harvard Data Science Initiative.

Note: This tutorial is rather high-level and leaves out much of the scientific and citation history. There are other great guides that provide this in detail. My goal here was chalkboard-level intuition.
Comments

This appears to be a distillation of the most important concepts in large language models today. Thanks for the exposition.

nintishia

Extremely high-entropy video. Amazing clarity, delivery, content, and flow. Pure genius!

muhannadobeidat

This is a great modern supplement to Karpathy's guide to language models! Thanks, Sasha! Just subbed.

DistortedV

I found this to be an incredibly unique and interesting approach to explaining LLMs and an excellent introduction. Thank you so much for the video!

sarthak-ti

Thank you for making this video so interesting with those nice graphics and examples. I need to sit down and watch it attentively.

sheikhakbar

Excellent presentation! Easy to follow, with tons of great material, including the links to the slides.

joedigiovanni

Knowledge/sec in this video is off the charts, and the info is cutting edge!

icriou

Amazing content, thanks for putting this together!

donatocapitella

Thanks a lot Prof. Rush for this material.

syedmostofamonsur

For someone like me who is new to this field and wants to understand the nitty-gritty of language models, it's necessary to watch each part separately, understand it first, and then move on to the next part. But I can still sense how fantastically it is explained for those who have a basic understanding of deep learning.

arkaprovobhattacharjee

Thanks for the video, a good high-level overview. I also like the Excalidraw slides.

ItzGanked

Hey Sasha, what tools do you use to make your presentations? They're so different from the typical academic presentations :)

shubhamtoshniwal

This is very insightful. Thanks for posting!

pebre

This was a wonderful video. Thanks so much for this!

ChinaTalkMedia

Great complement to Karpathy's video.

FabienFabienB

Thanks for this awesome explanation! Can someone explain one point to me? The issue with argmax at 22:15 is that it has no derivative, so neural network parameters cannot be trained through it. If I understand correctly, the argmax selects the word that should be "attended to" when predicting the next word (park). Why is argmax the desired function here? What if the prediction of the next word depends not on the single most important word, but on the two most important words in the context? In that case, doesn't softmax have an additional benefit over the "naive" argmax, in that it can also produce distributions with more than one mode?

benjaminsteenhoek
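
A minimal numpy sketch of the softmax-vs-argmax contrast raised in the question above (the scores are made up for illustration, not taken from the video):

```python
import numpy as np

def softmax(scores):
    # Shift by the max for numerical stability before exponentiating.
    e = np.exp(scores - scores.max())
    return e / e.sum()

# Hypothetical attention scores for one query against four context words.
scores = np.array([2.0, 1.9, -1.0, -1.2])

# argmax: a hard one-hot selection. It is piecewise constant in the
# scores, so its gradient is zero almost everywhere and backpropagation
# cannot train through it.
hard = np.zeros_like(scores)
hard[np.argmax(scores)] = 1.0
print(hard)             # [1. 0. 0. 0.]

# softmax: a smooth, differentiable relaxation. When two scores are
# close, it spreads weight over both words -- exactly the multi-modal
# behaviour the question asks about.
print(softmax(scores))  # ~[0.50, 0.45, 0.02, 0.02]
```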

Was the narration generated? I would love to use the same technique for narrating text.

AllNightLearner

At 32:41, isn't each element of AB the product of a row of A with a column of B? Waiting for your answer.

ZylinTeo
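
A quick numpy check of the definition stated in the question above: element (i, j) of AB is indeed the dot product of row i of A with column j of B (the matrices here are made up for illustration):

```python
import numpy as np

# Element (i, j) of AB is the dot product of row i of A with column j of B.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[5.0, 6.0],
              [7.0, 8.0]])
AB = A @ B  # [[19. 22.]
            #  [43. 50.]]

i, j = 0, 1
assert AB[i, j] == A[i, :] @ B[:, j]  # 1*6 + 2*8 == 22
```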

Well, every output must be mathematically derivable from what the model ingests, so can we not build a formula for every pattern of output? Say, the human sense and the grammatical sense of each word it constructs. And while it constructs an output, can it not also output how it did it?

martiancoders

WOOHOO! Just found this channel; it is almost better than porn. How do we give you our money so you keep making videos? Please tell us :o

Tubernameu