Topic Modeling with SVD & NMF (NLP video 2)

In order to organize posts (from the newsgroups data set) by topic, we learn about 2 different matrix decompositions: singular value decomposition (SVD) and non-negative matrix factorization (NMF). Along the way, we learn about stop words, stemming, & lemmatization.

Comments
Author

00:00 - Topic Modeling: Problem
01:40 - Topic Modeling: Motivation
02:43 - Getting Started
03:10 - Additional Resources
03:55 - Look at our data
07:14 - Stop words
10:21 - Stemming and Lemmatization
22:18 - Data Processing
24:04 - Singular Value Decomposition
45:44 - Non-negative Matrix Factorization
58:35 - TF-IDF
01:00:41 - Truncated SVD
01:03:23 - Timing comparison

PokeballmasterInc
Author

When someone enjoys doing their work, the outcome always turns out best. Thank you Rachel :)

Sandy____
Author

Rachel, the matrix multiplication part you discussed was a revelation! Thanks.

pacmadman
Author

Thanks, Rachel. I appreciate that you are sharing this for free.

eliegakuba
Author

Good video about SVD, and you also talked about Gilbert Strang, which was even better.

Decomposition of a matrix into three matrices: one with orthogonal columns, one diagonal, and one with orthogonal rows. Good one!
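
The three-matrix structure described in this comment can be checked numerically. The following is a minimal sketch, assuming NumPy and a small random matrix as a stand-in for the newsgroups data; the names `A`, `cols_orthonormal`, `rows_orthonormal`, and `reconstructs` are illustrative and do not come from the video.

```python
import numpy as np

# Small random stand-in matrix (assumption: not the newsgroups data).
rng = np.random.default_rng(0)
A = rng.random((6, 4))

# Reduced SVD: A = U @ diag(s) @ Vh
U, s, Vh = np.linalg.svd(A, full_matrices=False)

# U has orthonormal columns, so U.T @ U is the identity.
cols_orthonormal = np.allclose(U.T @ U, np.eye(U.shape[1]))

# Vh has orthonormal rows, so Vh @ Vh.T is the identity.
rows_orthonormal = np.allclose(Vh @ Vh.T, np.eye(Vh.shape[0]))

# Multiplying the three factors back together recovers A.
reconstructs = np.allclose(U @ np.diag(s) @ Vh, A)

print(cols_orthonormal, rows_orthonormal, reconstructs)
```

The singular values come back as a 1-D array `s`, so `np.diag(s)` is needed to form the diagonal matrix explicitly.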

subashchandrapakhrin
Author

Thank you, this is a really clear way of breaking down the topic and explaining the differences between SVD and NMF.

concert_music
Author

Thank you for sharing this code-first approach to NLP.

ruis
Author

The important and interesting takeaway:

If your model is more complex and can handle more complexity (like a neural network), you probably don't remove stop words or use stemming or lemmatization, because you would lose information.

ml-simplified
Author

Great explanation, thanks teacher Rachel.


Good thing that the microphone box is really padded and soft haha.

robertue
Author

Can you please update your code so that it works with today's library versions? E.g., the spaCy lemmatizer does not work in spaCy v3, whereas spaCy version 2 is no longer available.

NobodyIikesyouduh
Author

What do the eigenvalues mean in practice? Does the highest eigenvalue mean it holds the best description of the topic?

souvikdatta
Author

This is the first real video in the series where you are diving in, and it seems you have done almost nothing explaining what you are doing. It almost seems like there was another class or something where you covered the topics and somehow all the youtube watchers here are missing some key info that was presented somewhere else.

zsternable
Author

How does SVD know how many topics it should classify?
Can you please explain?
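
On the question above: SVD itself does not pick a number of topics. A reduced SVD returns one component per singular value, and it is up to the user to keep only the first `k`. The following is a minimal sketch, assuming NumPy and a small random matrix as a stand-in for the document-term matrix; `A`, `k`, and `A_k` are illustrative names and values, not taken from the video.

```python
import numpy as np

# Small random stand-in for a document-term matrix (assumption).
rng = np.random.default_rng(0)
A = rng.random((8, 5))

U, s, Vh = np.linalg.svd(A, full_matrices=False)
print(len(s))  # number of components returned: min(A.shape)

# The user decides how many components ("topics") to keep.
k = 2  # hypothetical choice
A_k = U[:, :k] @ np.diag(s[:k]) @ Vh[:k, :]

# The Frobenius-norm error of the rank-k approximation equals the
# root of the sum of squares of the discarded singular values.
error = np.linalg.norm(A - A_k)
discarded = np.sqrt(np.sum(s[k:] ** 2))
print(np.isclose(error, discarded))
```

Because the singular values are sorted from largest to smallest, keeping the first `k` components keeps the ones that account for the most of the matrix.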

mustafasidhpuri
Author

Hey Rachel,
What is the purpose of these being taught at the beginning of an NLP course?

lifeisarace
Author

To be honest, I see relatively little direction in this video. I get that it's more targeted towards a class environment, but it's hard to follow as a watcher on Youtube.

starllama
Author

%time U, s, Vh = linalg.svd(vectors, full_matrices=False)

After this command I am getting a memory error.
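
One possible workaround for a memory error like this, assuming only the top few components are needed, is to keep the matrix sparse and compute a truncated SVD with `scipy.sparse.linalg.svds` instead of a full decomposition. This is a sketch under that assumption, not the approach used in the video; the matrix sizes, the random data, and the value of `k` below are hypothetical.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import svds

# Hypothetical stand-in for the document-term matrix: mostly zeros.
rng = np.random.default_rng(0)
dense = rng.random((200, 1000))
dense[dense < 0.99] = 0.0
vectors = csr_matrix(dense)  # sparse storage keeps only the nonzeros

k = 5  # number of components to keep (assumption; must be < min(vectors.shape))
U, s, Vh = svds(vectors, k=k)

# Sort explicitly so the largest singular value comes first.
order = np.argsort(s)[::-1]
U, s, Vh = U[:, order], s[order], Vh[order, :]

print(U.shape, s.shape, Vh.shape)
```

Only `k` columns of `U` and `k` rows of `Vh` are ever allocated, which is much smaller than the output of a full decomposition when the vocabulary is large.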

DEBASHISDASGAU-C-
Author

*from spacy.lemmatizer import Lemmatizer* gives the error: *ModuleNotFoundError: No module named* 'spacy.lemmatizer'. Do you know the reason for this problem? Is there something I can do?

nicolamarcia
Author

Too many distractions...the titles are misleading

seshganesh
Author

Top-Down can't be an excuse for such a mess!!!
Deviate from the subject every 2 sentences and all those twitt

shaharrefaelshoshany
Author

Hey Rachel,
First of all, I'm a huge fan of the fast.ai initiative. You guys are doing a great job!!!

But I still don't understand the purpose of this video. You are not deep diving into any of the topics, and not explaining the what, how, or why of any topic. How the hell would people know what a word vectorizer is? It's as if you had given a detailed lecture in a classroom and were just reviewing it here in this video.

Sincerely, I had a lot of hope for this channel, but I'm utterly disappointed.

I'm now skeptical whether fast.ai is for real or just another hype.

yerriswamym