Transformer and BERT Pre-training

In this lecture we look at the Transformer architecture for sequence contextualization, how the BERT model pre-trains Transformers, and how the community shares models through the HuggingFace ecosystem. We also give an overview of an exemplary task for BERT: Extractive Question Answering (QA).
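
As a companion to the HuggingFace and Extractive QA parts of the lecture, here is a minimal sketch of running extractive QA with the `transformers` library. The checkpoint name and the example question/context are illustrative choices, not taken from the lecture.

```python
# Minimal extractive QA sketch with the HuggingFace transformers library.
# The checkpoint below is an illustrative example, not necessarily the model
# used in the lecture.
from transformers import pipeline

qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")

result = qa(
    question="What does BERT pre-train?",
    context="BERT pre-trains a Transformer encoder with masked language modelling "
            "and can then be fine-tuned for tasks such as extractive question answering.",
)

# The pipeline returns the predicted answer span, its confidence score, and the
# start/end character positions of the span inside the context.
print(result["answer"], result["score"], result["start"], result["end"])
```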

📖 Check out YouTube's CC - we added our high-quality (human-corrected) transcripts here as well.

Slide Timestamps:
0:00:00 1 - Welcome
0:00:15 2 - Today
0:01:01 3 - Another Versatile Building Block
0:02:45 4 - Transformer
0:02:55 5 - Contextualization via Self-Attention
0:05:03 6 - Transformer
0:06:33 7 - Transformer – Architecture
0:10:53 8 - Self-Attention Definition (see the sketch after the timestamps)
0:12:28 9 - Transformer in PyTorch
0:13:52 10 - Transformer – Positional Encoding
0:14:52 11 - Transformer - Variations
0:16:21 12 - In-Depth Resources for Transformers
0:16:55 13 - Pre-Training
0:17:20 14 - Pre-Training Motivation
0:18:25 15 - Masked Language Modelling
0:19:36 16 - Masked Language Modelling
0:20:58 17 - BERT
0:26:13 18 - BERT - Input
0:27:27 19 - BERT - Model
0:30:00 20 - BERT - Workflow
0:31:24 21 - BERT++
0:32:48 22 - Pre-Training Ecosystem
0:34:21 23 - HuggingFace: Transformers Library
0:35:43 24 - HuggingFace: Model Hub
0:36:41 25 - HuggingFace: Model Hub
0:37:15 26 - HuggingFace: Getting Started
0:38:06 27 - Extractive QA
0:38:23 28 - Soooo many tasks are solvable with BERT
0:39:24 29 - Extractive Question Answering
0:41:04 30 - Extractive QA: Datasets
0:42:12 31 - Extractive QA: Training
0:44:27 32 - IR + QA = Open Domain QA
0:46:08 33 - Summary: Transformers & BERT
0:46:58 34 - Thank You
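
For reference alongside slides 8 ("Self-Attention Definition") and 9 ("Transformer in PyTorch"), a minimal sketch of single-head scaled dot-product self-attention in PyTorch; the shapes and variable names are illustrative assumptions, not taken from the lecture slides.

```python
# Minimal single-head scaled dot-product self-attention sketch in PyTorch.
# Dimensions and variable names are illustrative only.
import math
import torch

def self_attention(x: torch.Tensor, w_q, w_k, w_v) -> torch.Tensor:
    """x: (batch, seq_len, d_model); w_q/w_k/w_v: (d_model, d_k) projection matrices."""
    q = x @ w_q                                                 # queries
    k = x @ w_k                                                 # keys
    v = x @ w_v                                                 # values
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))    # (batch, seq, seq)
    weights = torch.softmax(scores, dim=-1)                     # attention distribution
    return weights @ v                                          # contextualized vectors

batch, seq_len, d_model, d_k = 2, 5, 16, 16
x = torch.randn(batch, seq_len, d_model)
w_q, w_k, w_v = (torch.randn(d_model, d_k) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)
print(out.shape)  # torch.Size([2, 5, 16])
```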