GPT (nanoGPT) from a beginner’s perspective (Part 2 Final)

In this video I went through Karpathy's nanoGPT codebase, explaining the final part: self-attention.

nanoGPT is a character-level implementation of a GPT (Generative Pre-trained Transformer) based on the Transformer architecture from the "Attention Is All You Need" paper.
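The core of the final part is causal (masked) self-attention. A minimal single-head NumPy sketch of the idea (my own illustration, not Karpathy's actual code; `Wq`, `Wk`, `Wv` are hypothetical projection matrices):

```python
import numpy as np

def causal_self_attention(x, Wq, Wk, Wv):
    """Single-head causal self-attention over a (T, C) sequence."""
    T, C = x.shape
    q, k, v = x @ Wq, x @ Wk, x @ Wv           # project to queries, keys, values
    scores = (q @ k.T) / np.sqrt(k.shape[-1])  # (T, T) scaled dot-product affinities
    mask = np.tril(np.ones((T, T), dtype=bool))
    scores = np.where(mask, scores, -np.inf)   # triangular mask: no peeking at future tokens
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v                         # weighted sum of value vectors

rng = np.random.default_rng(0)
T, C, H = 4, 8, 16
x = rng.normal(size=(T, C))
Wq, Wk, Wv = (rng.normal(size=(C, H)) for _ in range(3))
out = causal_self_attention(x, Wq, Wk, Wv)
print(out.shape)  # (4, 16)
```

Because of the triangular mask, position 0 can only attend to itself, so its output is exactly its own value vector `x[0] @ Wv`.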

Karpathy is my role model in the field of AI research; he is a cofounder of OpenAI and former Director of AI at Tesla.

My GPT Practice Repo

References

#gpt #nanogpt #karpathy #ai #nlp #llm #openai #google
Comments

For optimizing computation: during inference, shouldn't we drop the triangular mask if we only want to predict the single last token? It seems like unnecessary computation, since in non-training mode we take only the last token's logits, which predict the next token after the given input sequence.

madragonse

Where can I find your gpt_dev.ipynb?

rpraver