Lesson 8 - Deep Learning for Coders (2020)


We finish this course with a full lesson on natural language processing (NLP). Modern NLP depends heavily on *self-supervised learning*, and in particular the use of *language models*.

Pretrained language models are fine-tuned in order to benefit from transfer learning. Unlike in computer vision, fine-tuning in NLP can take advantage of an extra step: applying self-supervised learning to the target dataset itself.
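
As a rough sketch of that two-stage pipeline in fastai (not the lesson's exact notebook, and assuming the small IMDb sample dataset): first fine-tune a pretrained AWD_LSTM language model on the target texts themselves, which is self-supervised because the labels are just the next words, then reuse its encoder inside a classifier.

```python
from fastai.text.all import *

# Stage 1: fine-tune a pretrained language model on the target corpus itself.
# Self-supervised: the "labels" are simply the next word of each text.
path = untar_data(URLs.IMDB_SAMPLE)
dls_lm = TextDataLoaders.from_csv(path, 'texts.csv', text_col='text', is_lm=True)
learn_lm = language_model_learner(dls_lm, AWD_LSTM, metrics=accuracy)
learn_lm.fine_tune(1)
learn_lm.save_encoder('ft_enc')          # keep the fine-tuned encoder weights

# Stage 2: reuse that encoder in a classifier and fine-tune on the labelled data.
dls_clas = TextDataLoaders.from_csv(path, 'texts.csv', text_col='text',
                                     label_col='label', text_vocab=dls_lm.vocab)
learn_clas = text_classifier_learner(dls_clas, AWD_LSTM, metrics=accuracy)
learn_clas.load_encoder('ft_enc')
learn_clas.fine_tune(1)
```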

Before we can do any modeling with text data, we first have to tokenize and numericalize it. There are a number of approaches to tokenization, and which you choose will depend on your language and dataset.
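
A minimal sketch of those two steps using fastai's helpers, roughly as shown in the lesson; the sentence and `min_freq=1` are toy choices so this one-line "corpus" keeps every token in its vocabulary.

```python
from fastai.text.all import *

txt = "This movie was absolutely wonderful!"

# Tokenisation: wrap spaCy's word tokenizer in fastai's Tokenizer, which also adds
# special tokens such as xxbos (beginning of stream) and xxmaj (capitalised word).
tok = Tokenizer(WordTokenizer())
tokens = tok(txt)
print(tokens)    # e.g. ['xxbos', 'xxmaj', 'this', 'movie', 'was', 'absolutely', 'wonderful', '!']

# Numericalisation: map each token to its index in a vocabulary built from the corpus.
num = Numericalize(min_freq=1)               # min_freq=1 only because the toy corpus is one sentence
num.setup(L([tokens]))
ids = num(tokens)
print(ids)                                   # tensor of vocabulary indices
print(' '.join(num.vocab[int(i)] for i in ids))   # maps back to the tokens above
```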

NLP models use the same basic approach of *entity embedding* that we've seen before, except that for text data it's called *word embedding*. The method, however, is nearly identical.
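
For illustration, a word embedding is just a lookup table with one learned vector per token index, which is mathematically the same as multiplying a one-hot vector by a weight matrix, only computed much more efficiently.

```python
import torch
import torch.nn as nn

vocab_size, emb_dim = 10_000, 64
emb = nn.Embedding(vocab_size, emb_dim)      # one learnable 64-dim vector per token

ids = torch.tensor([[2, 5, 9]])              # a batch holding one sequence of token indices
vectors = emb(ids)                           # shape: (1, 3, 64)

# The same computation written as one-hot vectors times a weight matrix;
# an embedding layer is just a faster way of indexing into that matrix.
one_hot = nn.functional.one_hot(ids, vocab_size).float()
assert torch.allclose(one_hot @ emb.weight, vectors)
```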

NLP models have to handle documents of varying sizes, so they require a somewhat different architecture, such as a *recurrent neural network* (RNN). It turns out that an RNN is basically just a regular deep net, which has been refactored using a loop.
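
A minimal sketch of that idea in PyTorch (hypothetical class name, along the lines of the simple models built later in the lesson): the same layers are reused at every position in the sequence, with a hidden state carried through the loop.

```python
import torch
import torch.nn as nn

class SimpleRNN(nn.Module):
    "A recurrent net is a deep net refactored into a loop: the same layers reused at every step."
    def __init__(self, vocab_sz, n_hidden):
        super().__init__()
        self.i_h = nn.Embedding(vocab_sz, n_hidden)   # input (token) -> hidden
        self.h_h = nn.Linear(n_hidden, n_hidden)      # hidden -> hidden, reused each step
        self.h_o = nn.Linear(n_hidden, vocab_sz)      # hidden -> output (next-token scores)

    def forward(self, x):                             # x: (batch, seq_len) of token ids
        h = torch.zeros(x.shape[0], self.h_h.out_features, device=x.device)
        for i in range(x.shape[1]):                   # the loop that replaces stacked layers
            h = h + self.i_h(x[:, i])
            h = torch.relu(self.h_h(h))
        return self.h_o(h)                            # predict the token that comes next

preds = SimpleRNN(vocab_sz=30, n_hidden=64)(torch.randint(0, 30, (8, 3)))
print(preds.shape)                                    # torch.Size([8, 30])
```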

However, simple RNNs suffer from exploding and vanishing gradients, so we have to use methods such as the LSTM cell to avoid these problems.
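
As a sketch of what swapping in an LSTM looks like, using PyTorch's built-in `nn.LSTM` rather than the hand-written cell from the lesson (the class name here is hypothetical):

```python
import torch
import torch.nn as nn

class LSTMLanguageModel(nn.Module):
    "Same idea as the simple RNN above, but the recurrence is an LSTM with a gated cell state."
    def __init__(self, vocab_sz, n_hidden, n_layers=2):
        super().__init__()
        self.i_h = nn.Embedding(vocab_sz, n_hidden)
        self.rnn = nn.LSTM(n_hidden, n_hidden, n_layers, batch_first=True)
        self.h_o = nn.Linear(n_hidden, vocab_sz)

    def forward(self, x, state=None):
        out, state = self.rnn(self.i_h(x), state)     # out: (batch, seq_len, n_hidden)
        return self.h_o(out), state                   # a prediction for every position

logits, state = LSTMLanguageModel(vocab_sz=30, n_hidden=64)(torch.randint(0, 30, (8, 16)))
print(logits.shape)                                   # torch.Size([8, 16, 30])
```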

Finally, we look at some tricks to improve the results of our NLP models: additional regularization approaches, including various types of *dropout* and activation regularization, as well as weight tying.
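
A simplified sketch of what these tricks look like in code (hypothetical class name, loosely modelled on the AWD-LSTM discussed in the lesson): dropout on the recurrent outputs, weight tying between the input embedding and the output layer, and the AR/TAR penalties added to the loss.

```python
import torch
import torch.nn as nn

class RegularisedLM(nn.Module):
    "Dropout on the recurrent outputs, plus weight tying between the embedding and the output layer."
    def __init__(self, vocab_sz, n_hidden, p=0.4):
        super().__init__()
        self.i_h  = nn.Embedding(vocab_sz, n_hidden)
        self.rnn  = nn.LSTM(n_hidden, n_hidden, num_layers=2, batch_first=True)
        self.drop = nn.Dropout(p)                 # randomly zeroes activations during training
        self.h_o  = nn.Linear(n_hidden, vocab_sz)
        self.h_o.weight = self.i_h.weight         # weight tying: one matrix shared both ways

    def forward(self, x):
        raw, _ = self.rnn(self.i_h(x))
        out = self.drop(raw)
        return self.h_o(out), raw, out            # raw and dropped outputs, kept for AR/TAR

# Activation regularisation (AR) and temporal AR (TAR) are extra penalties on the loss:
#   loss += alpha * out.pow(2).mean()                            # keep activations small
#   loss += beta  * (raw[:, 1:] - raw[:, :-1]).pow(2).mean()     # keep them changing smoothly
```
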
Comments

00:00:00 - Intro and NLP Review
00:01:31 - Language models for NLP
00:04:36 - Review of text classifier in Lesson 1
00:05:08 - Improving results with a domain-specific language model
00:05:58 - Language model from scratch
00:10:27 - Tokenisation
00:12:19 - Word tokeniser
00:17:38 - Subword tokeniser
00:21:21 - Question: how can we determine if pre-trained model is suitable for downstream task?
00:23:25 - Numericalization
00:25:43 - Creating batches for language model
00:29:24 - LMDataLoader
00:31:07 - Creating language model data with DataBlock
00:33:23 - Fine-tuning a language model
00:35:07 - Saving and loading models
00:36:44 - Question: Can language models learn meaning?
00:37:56 - Text generation with language model
00:39:51 - Creating classification model
00:41:04 - Question: Is stemming and lemmatisation still used in practice?
00:42:21 - Handling different sequence lengths
00:45:30 - Fine-tuning classifier
00:48:54 - Questions
00:51:52 - Ethics and risks associated with text generation language models
00:56:22 - Language model from scratch
00:56:52 - Question: are there model interpretability tools for language models?
00:58:11 - Preparing the dataset for RNN: tokenisation and numericalization
01:03:35 - Defining a simple language model
01:04:49 - Question: can you speed up fine-tuning the NLP model?
01:05:44 - Simple language model continued
01:14:41 - Recurrent neural networks (RNN)
01:18:39 - Improving our RNN
01:19:41 - Backpropagation through time
01:22:19 - Ordered sequences and callbacks
01:25:00 - Creating more signal for model
01:28:29 - Multilayer RNN
01:32:39 - Exploding and vanishing gradients
01:36:29 - LSTM
01:40:00 - Questions
01:42:23 - Regularisation using Dropout
01:47:16 - AR and TAR regularisation
01:49:09 - Weight tying
01:51:00 - TextLearner
01:52:48 - Conclusion

lextmb

Big thanks to Jeremy, Rachel, Sylvain, and Alexis for creating such a well-made book/video series that introduces deep learning to those with a coding background! You all make the content so interesting and straightforward; I'm excited to learn more!

davidbyron

To the whole FastAI team, thank you so much for creating such quality material. I love how practical the course's approach of "playing the game first" is. I will adopt this in my future learning.

Bestietvcute

Thanks Jeremy, Rachel, Sylvain, and Alexis. Brilliantly put together.

snowwhitei

I can't believe I actually made it to the end! It took me a year but I made it! Thanks so much, fastai team. Great course.

jmac

Amazing lesson :) A perfect introduction and dive into NLP; I've wanted to get started with it for a while.

johanneslaute

Love the content! When will the second half of the course be held?

daviddeng

Awesome stuff, thanks Jeremy! All in all, this course is amazing, but bear in mind that it takes more self-discipline and engagement than Coursera's course.
I'd also say that Coursera is better suited to high-school kids and students. But for anybody switching careers who is already experienced, I'll be recommending this course from now on. Great teaching methodology - it resonated perfectly with how I learn. Lots of examples, contextual and top-down.

TheAIEpiphany

Amazing course. Thanks for all the thoughtful content and looking forward to part 2!

SP-dszw

Mistake regarding 41:00 - stemming is not something that removes the stem; it's a technique that reduces a word to its base form by removing suffixes and prefixes.

mindasb

Will there be a 2020 version of the image segmentation lesson this year?

cullenharris

Thanks for this great course! What would be covered in Part 2?

LiangyueLi

Correction (11:00): in Polish, words are generally not glued together into one, at least not to the same extreme as in Dutch or German. Please compare:

English: electricity production company
Polish: firma produkująca energię elektryczną
vs.
Dutch:
German: Stromerzeugungsunternehmen

I don't know about Turkish, but from what I googled it also seems like a wrong example.

adrianstaniec

The language model has seen and consumed the text of the test set (and also the validation set) during training. Can that be a factor affecting the accuracy of the movie review classifier later on? Shouldn't the language model be trained only on the text of the train split?

bikashg

A good course to learn the basics of ML.🍠🌍🌝🎰🥇

dsm

I don't really think you've beaten what you got in lesson one. Validation loss being lower than training loss basically means that your validation set and training set are either too similar to each other or not distributed in a similar way. That's a data leak which results in overfitting, and this model would 100% perform worse in production.

michapodlaszuk