Module 6, Part 3 - Natural Language Processing (NLP): How to prepare text data correctly?

Instructor: Pedram Jahangiry

All of the slides and notebooks used in this series are available on my GitHub page, so you can follow along and experiment with the code on your own.

Lecture Outline:
0:00 Roadmap and recap
1:20 Different kinds of sequence data and different RNN architectures
2:59 Human language vs. machine language
4:45 What is NLP?
9:05 Preparing text data (TextVectorization)
10:56 Standardization
13:24 Tokenization (word level, N-gram, character level)
18:49 Indexing (integer indexing, OOV, and masking)
21:01 Encoding (multi-hot, one-hot, TF-IDF)
27:56 Word embedding
30:38 Bag-of-words vs sequential modeling approach
34:49 IMDB movie review classification example
35:45 Unigram with binary encoding
39:00 Bigram with binary encoding
40:13 Bigram with TF-IDF
41:42 Sequence modeling approach (one-hot encoding vs word embedding)
53:37 Pretrained word embedding (Word2Vec and GloVe)
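The text-preparation pipeline covered in the outline (standardization, tokenization, indexing, and encoding) can be sketched with Keras's TextVectorization layer. This is a minimal illustration on a toy two-review corpus, not the notebook from the lecture; the corpus strings and parameter values are my own assumptions.

```python
import tensorflow as tf
from tensorflow.keras.layers import TextVectorization

# Toy corpus (assumed for illustration; the lecture uses IMDB reviews).
corpus = [
    "The movie was great!",
    "The movie was terrible...",
]

# Integer indexing: standardization (lowercasing, punctuation stripping)
# and word-level tokenization happen by default; index 0 is reserved for
# masking/padding and index 1 for out-of-vocabulary (OOV) tokens.
int_vectorizer = TextVectorization(output_mode="int", output_sequence_length=6)
int_vectorizer.adapt(corpus)
int_ids = int_vectorizer(["the movie was great"])      # shape (1, 6)

# Multi-hot (bag-of-words) encoding: token order is discarded.
multihot_vectorizer = TextVectorization(output_mode="multi_hot")
multihot_vectorizer.adapt(corpus)
multihot = multihot_vectorizer(["the movie was great"])

# TF-IDF encoding with bigrams: ngrams=2 produces unigrams and bigrams,
# and each one is weighted by its inverse document frequency.
tfidf_vectorizer = TextVectorization(ngrams=2, output_mode="tf_idf")
tfidf_vectorizer.adapt(corpus)
tfidf = tfidf_vectorizer(["the movie was great"])
```

The same layer thus covers the unigram-binary, bigram-binary, and bigram-TF-IDF variants listed in the outline by changing only `ngrams` and `output_mode`.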
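For the sequence-modeling comparison at the end of the outline, the contrast between one-hot encoding and a learned word embedding can be sketched as follows (a minimal example assuming TensorFlow/Keras; the vocabulary size, embedding dimension, and token ids are illustrative, not from the lecture):

```python
import tensorflow as tf

vocab_size, embed_dim = 1000, 16
ids = tf.constant([[2, 5, 9, 0]])  # integer-indexed tokens; 0 = padding

# One-hot: sparse and high-dimensional; every pair of distinct words is
# equally far apart, so no similarity structure can be expressed.
one_hot = tf.one_hot(ids, depth=vocab_size)  # shape (1, 4, 1000)

# Embedding: dense and low-dimensional; the vectors are learned during
# training, and mask_zero=True lets downstream RNN layers skip padding.
embedding = tf.keras.layers.Embedding(vocab_size, embed_dim, mask_zero=True)
embedded = embedding(ids)                    # shape (1, 4, 16)
```

A pretrained embedding such as GloVe or Word2Vec replaces the randomly initialized embedding matrix with precomputed vectors, typically by passing them in via the layer's initializer and optionally freezing the weights.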