Lesson 8 - Deep Learning for Coders (2020)

We finish this course with a full lesson on natural language processing (NLP). Modern NLP depends heavily on *self-supervised learning*, and in particular the use of *language models*.
Pretrained language models are fine-tuned in order to benefit from transfer learning. Unlike in computer vision, fine-tuning in NLP can take advantage of an extra step: self-supervised fine-tuning of the language model on the target dataset itself, before training the final task model.
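A minimal sketch of this two-stage workflow with fastai's high-level text API, using the small IMDB_SAMPLE dataset as a stand-in for the target corpus (the dataset choice and the one-epoch schedules are assumptions for illustration, not the lesson's exact setup):

```python
from fastai.text.all import *

# Stage 1: fine-tune a pretrained language model on the target corpus.
# This is self-supervised: the "label" is simply the next word of each text.
path = untar_data(URLs.IMDB_SAMPLE)          # small sample dataset, assumed here for brevity
dls_lm = TextDataLoaders.from_csv(path, 'texts.csv', text_col='text', is_lm=True)
learn_lm = language_model_learner(dls_lm, AWD_LSTM, metrics=accuracy)
learn_lm.fine_tune(1)
learn_lm.save_encoder('finetuned_encoder')   # keep everything except the next-word prediction head

# Stage 2: train a classifier that starts from the fine-tuned encoder.
dls_clas = TextDataLoaders.from_csv(path, 'texts.csv', text_col='text',
                                    label_col='label', text_vocab=dls_lm.vocab)
learn_clas = text_classifier_learner(dls_clas, AWD_LSTM, metrics=accuracy)
learn_clas.load_encoder('finetuned_encoder')
learn_clas.fine_tune(1)
```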
Before we can do any modeling with text data, we first have to tokenize and numericalize it. There are a number of approaches to tokenization, and which you choose will depend on your language and dataset.
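As an illustration of what tokenization and numericalization mean, here is a deliberately naive whitespace tokenizer and vocabulary lookup (the helper names are hypothetical; real pipelines typically use word, subword, or character tokenizers such as spaCy or SentencePiece):

```python
from collections import Counter

texts = ["the movie was great", "the movie was terrible"]

def tokenize(text):
    return text.lower().split()               # naive whitespace tokenization

tokens = [tokenize(t) for t in texts]

# Build a vocabulary from token frequencies, reserving index 0 for unknown tokens.
counts = Counter(tok for doc in tokens for tok in doc)
vocab = ["<unk>"] + [tok for tok, _ in counts.most_common()]
stoi = {tok: i for i, tok in enumerate(vocab)}

def numericalize(doc):
    return [stoi.get(tok, 0) for tok in doc]  # map each token to its integer id

ids = [numericalize(doc) for doc in tokens]
print(ids)                                    # [[1, 2, 3, 4], [1, 2, 3, 5]]
```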
NLP models use the same basic approach of *entity embedding* that we've seen before, except that for text data it's called a *word embedding*; the method itself is nearly identical.
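For example, in PyTorch a word embedding is just an `nn.Embedding` lookup table that maps each token id to a learned vector (the tiny sizes below are purely illustrative):

```python
import torch
import torch.nn as nn

# A word embedding is a lookup table: one learned vector per vocabulary item.
vocab_size, emb_dim = 6, 4
embedding = nn.Embedding(vocab_size, emb_dim)

token_ids = torch.tensor([[1, 2, 3, 4]])      # a batch of one numericalized sentence
vectors = embedding(token_ids)
print(vectors.shape)                          # torch.Size([1, 4, 4]) -> (batch, seq_len, emb_dim)
```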
NLP models have to handle documents of varying sizes, so they require a somewhat different architecture, such as a *recurrent neural network* (RNN). It turns out that an RNN is basically just a regular deep net, which has been refactored using a loop.
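Here is a minimal sketch of that idea: a hypothetical module whose forward pass loops over the sequence, reusing the same hidden-to-hidden layer at every step (loosely modelled on the lesson's simple language model, not a copy of it):

```python
import torch
import torch.nn as nn

class SimpleRNN(nn.Module):
    def __init__(self, vocab_size, hidden_dim):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, hidden_dim)   # input -> hidden
        self.h_h = nn.Linear(hidden_dim, hidden_dim)      # hidden -> hidden, reused every step
        self.out = nn.Linear(hidden_dim, vocab_size)      # hidden -> output

    def forward(self, x):
        h = torch.zeros(x.shape[0], self.h_h.out_features, device=x.device)
        for t in range(x.shape[1]):                       # the loop that replaces stacked layers
            h = torch.relu(self.h_h(h + self.emb(x[:, t])))
        return self.out(h)                                # predict the next token

model = SimpleRNN(vocab_size=100, hidden_dim=32)
logits = model(torch.randint(0, 100, (8, 5)))             # batch of 8 sequences, 5 tokens each
print(logits.shape)                                       # torch.Size([8, 100])
```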
However, simple RNNs suffer from exploding gradients, so we have to use methods such as the LSTM cell to avoid this problem.
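A sketch of the same model with the plain recurrence swapped for PyTorch's built-in LSTM layer (an illustrative substitution, not the lesson's exact implementation):

```python
import torch
import torch.nn as nn

class SimpleLSTM(nn.Module):
    def __init__(self, vocab_size, hidden_dim):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, hidden_dim)
        self.lstm = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, x):
        h, _ = self.lstm(self.emb(x))          # the LSTM's gates manage the recurrent state
        return self.out(h[:, -1])              # predict the token after the last position

model = SimpleLSTM(vocab_size=100, hidden_dim=32)
print(model(torch.randint(0, 100, (8, 5))).shape)   # torch.Size([8, 100])
```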
Finally, we look at some tricks to improve the results of our NLP models: additional regularization approaches, including various types of *dropout* and activation regularization, as well as weight tying.
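A rough sketch of two of these tricks, weight tying plus AR/TAR-style activation regularization, in PyTorch; the coefficients, model shape, and dummy targets are illustrative assumptions rather than the lesson's exact values:

```python
import torch
import torch.nn as nn

class TiedLSTM(nn.Module):
    def __init__(self, vocab_size, hidden_dim, p=0.4):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, hidden_dim)
        self.lstm = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        self.drop = nn.Dropout(p)                     # one of several possible dropout variants
        self.out = nn.Linear(hidden_dim, vocab_size)
        self.out.weight = self.emb.weight             # weight tying: decoder reuses the embedding matrix

    def forward(self, x):
        raw, _ = self.lstm(self.emb(x))
        dropped = self.drop(raw)
        return self.out(dropped), raw, dropped

model = TiedLSTM(vocab_size=100, hidden_dim=32)
x = torch.randint(0, 100, (8, 5))
logits, raw, dropped = model(x)

ce = nn.functional.cross_entropy(logits[:, -1], torch.randint(0, 100, (8,)))
ar  = 2.0 * dropped.pow(2).mean()                     # activation regularization (AR)
tar = 1.0 * (raw[:, 1:] - raw[:, :-1]).pow(2).mean()  # temporal AR: penalize big step-to-step changes
loss = ce + ar + tar
loss.backward()
```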