Coding the entire LLM Transformer Block
In this lecture, we code the entire Transformer block in Python based on its 5 components:
(1) Multi-head attention
(2) Layer normalization
(3) Dropout layer
(4) Feedforward neural network with GELU activation
(5) Shortcut connections
We cover the theory and mathematical intuition behind each component and then code the entire implementation (a minimal sketch is included below).
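For orientation, here is a minimal sketch of such a transformer block in PyTorch. It is an illustrative approximation, not the exact code from the lecture: it uses PyTorch's built-in nn.MultiheadAttention rather than a from-scratch attention class, omits the causal mask, and the class and parameter names (emb_dim, n_heads, drop_rate) are assumptions.

```python
import torch
import torch.nn as nn

class FeedForward(nn.Module):
    """Position-wise feedforward network with GELU activation."""
    def __init__(self, emb_dim):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(emb_dim, 4 * emb_dim),  # expand
            nn.GELU(),
            nn.Linear(4 * emb_dim, emb_dim),  # project back
        )

    def forward(self, x):
        return self.layers(x)

class TransformerBlock(nn.Module):
    """One transformer block: attention + feedforward sub-blocks, each with
    LayerNorm, dropout, and a shortcut (residual) connection."""
    def __init__(self, emb_dim, n_heads, drop_rate):
        super().__init__()
        # Built-in multi-head attention used here for brevity (causal mask omitted)
        self.att = nn.MultiheadAttention(emb_dim, n_heads,
                                         dropout=drop_rate, batch_first=True)
        self.ff = FeedForward(emb_dim)
        self.norm1 = nn.LayerNorm(emb_dim)
        self.norm2 = nn.LayerNorm(emb_dim)
        self.drop = nn.Dropout(drop_rate)

    def forward(self, x):
        # Attention sub-block: pre-LayerNorm -> attention -> dropout -> shortcut
        shortcut = x
        x = self.norm1(x)
        x, _ = self.att(x, x, x, need_weights=False)
        x = self.drop(x)
        x = x + shortcut

        # Feedforward sub-block: pre-LayerNorm -> FFN -> dropout -> shortcut
        shortcut = x
        x = self.norm2(x)
        x = self.ff(x)
        x = self.drop(x)
        x = x + shortcut
        return x
```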
0:00 Transformer block visualised
3:56 5 components of the transformer block
16:28 Transformer block shape preservation
19:34 Let us jump into code!
21:14 Coding LayerNorm and FeedForward Neural Network class
25:40 Coding the transformer block class in Python
33:57 Transformer block code summary
35:12 Testing the transformer class using a simple example (see the sketch after this list)
41:09 Lecture summary and next steps
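And a hedged example of the kind of shape-preservation test discussed in the lecture, continuing from the sketch above; the GPT-2-like dimensions (batch of 2, 4 tokens, 768-dim embeddings, 12 heads) are illustrative assumptions, not necessarily the values used in the video.

```python
import torch

# Continuing from the TransformerBlock sketch above:
# the block should preserve the input shape
# (batch_size, num_tokens, emb_dim) -> same shape out.
torch.manual_seed(123)
block = TransformerBlock(emb_dim=768, n_heads=12, drop_rate=0.1)
x = torch.rand(2, 4, 768)           # 2 sequences, 4 tokens, 768-dim embeddings
out = block(x)
print("Input shape: ", x.shape)     # torch.Size([2, 4, 768])
print("Output shape:", out.shape)   # torch.Size([2, 4, 768])
```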
=================================================
=================================================
Vizuara philosophy:
As we learn the AI/ML/DL material, we will share thoughts on what is actually useful in industry and what has become irrelevant. We will also point out which subjects contain open areas of research, so interested students can start their research journey there.
If you are confused or stuck in your ML journey, perhaps courses and offline videos are not inspiring enough. What might inspire you is watching someone else learn and implement machine learning from scratch.
No cost. No hidden charges. Pure old school teaching and learning.
=================================================
🌟 Meet Our Team: 🌟
🎓 Dr. Raj Dandekar (MIT PhD, IIT Madras department topper)
🎓 Dr. Rajat Dandekar (Purdue PhD, IIT Madras department gold medalist)
🎓 Dr. Sreedath Panat (MIT PhD, IIT Madras department gold medalist)
🎓 Sahil Pocker (Machine Learning Engineer at Vizuara)
🎓 Abhijeet Singh (Software Developer at Vizuara, GSOC 24, SOB 23)
🎓 Sourav Jana (Software Developer at Vizuara)