Transformer and BERT Pre-training

In this lecture we look at the Transformer architecture for sequence contextualization, how the BERT model pre-trains Transformers, and how the community shares models through the HuggingFace ecosystem. We also give an overview of an exemplary task for BERT: Extractive Question Answering (QA).
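
As a companion to the HuggingFace and Extractive QA parts of the lecture, here is a minimal sketch of running extractive QA with the `transformers` library. The checkpoint name and the example question/context are illustrative choices, not taken from the lecture.

```python
# Minimal extractive QA sketch with the HuggingFace transformers library.
# The checkpoint below is an illustrative example, not necessarily the model
# used in the lecture.
from transformers import pipeline

qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")

result = qa(
    question="What does BERT pre-train?",
    context="BERT pre-trains a Transformer encoder with masked language modelling "
            "and can then be fine-tuned for tasks such as extractive question answering.",
)

# The pipeline returns the predicted answer span, its confidence score, and the
# start/end character positions of the span inside the context.
print(result["answer"], result["score"], result["start"], result["end"])
```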

📖 Check out YouTube's CC - we added our high-quality (human-corrected) transcripts here as well.

Slide Timestamps:
0:00:00 1 - Welcome
0:00:15 2 - Today
0:01:01 3 - Another Versatile Building Block
0:02:45 4 - Transformer
0:02:55 5 - Contextualization via Self-Attention
0:05:03 6 - Transformer
0:06:33 7 - Transformer – Architecture
0:10:53 8 - Self-Attention Definition (see the sketch after the timestamps)
0:12:28 9 - Transformer in PyTorch
0:13:52 10 - Transformer – Positional Encoding
0:14:52 11 - Transformer - Variations
0:16:21 12 - In-Depth Resources for Transformers
0:16:55 13 - Pre-Training
0:17:20 14 - Pre-Training Motivation
0:18:25 15 - Masked Language Modelling
0:19:36 16 - Masked Language Modelling
0:20:58 17 - BERT
0:26:13 18 - BERT - Input
0:27:27 19 - BERT - Model
0:30:00 20 - BERT - Workflow
0:31:24 21 - BERT++
0:32:48 22 - Pre-Training Ecosystem
0:34:21 23 - HuggingFace: Transformers Library
0:35:43 24 - HuggingFace: Model Hub
0:36:41 25 - HuggingFace: Model Hub
0:37:15 26 - HuggingFace: Getting Started
0:38:06 27 - Extractive QA
0:38:23 28 - Soooo many tasks are solvable with BERT
0:39:24 29 - Extractive Question Answering
0:41:04 30 - Extractive QA: Datasets
0:42:12 31 - Extractive QA: Training
0:44:27 32 - IR + QA = Open Domain QA
0:46:08 33 - Summary: Transformers & BERT
0:46:58 34 - Thank You
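
For reference alongside slides 8 ("Self-Attention Definition") and 9 ("Transformer in PyTorch"), a minimal sketch of single-head scaled dot-product self-attention in PyTorch; the shapes and variable names are illustrative assumptions, not taken from the lecture slides.

```python
# Minimal single-head scaled dot-product self-attention sketch in PyTorch.
# Dimensions and variable names are illustrative only.
import math
import torch

def self_attention(x: torch.Tensor, w_q, w_k, w_v) -> torch.Tensor:
    """x: (batch, seq_len, d_model); w_q/w_k/w_v: (d_model, d_k) projection matrices."""
    q = x @ w_q                                                 # queries
    k = x @ w_k                                                 # keys
    v = x @ w_v                                                 # values
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))    # (batch, seq, seq)
    weights = torch.softmax(scores, dim=-1)                     # attention distribution
    return weights @ v                                          # contextualized vectors

batch, seq_len, d_model, d_k = 2, 5, 16, 16
x = torch.randn(batch, seq_len, d_model)
w_q, w_k, w_v = (torch.randn(d_model, d_k) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)
print(out.shape)  # torch.Size([2, 5, 16])
```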