Stemming and Lemmatization: NLP Tutorial For Beginners - S1 E10

Stemming and lemmatization are two popular techniques for reducing a word to its base form. Stemming uses a fixed set of rules to strip suffixes and prefixes, whereas lemmatization uses knowledge of the language to arrive at the correct base word. Stemming is demonstrated in NLTK (spaCy doesn't support stemming), whereas the lemmatization code is written in spaCy.
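
To make the contrast concrete, here is a minimal sketch in plain Python. It is a toy, not how NLTK or spaCy work internally: the rule list `SUFFIX_RULES`, the table `LEMMA_TABLE`, and both function names are hypothetical and exist only for this illustration.

```python
# Toy illustration only: a few hand-written suffix rules and a tiny lookup
# table. Real stemmers (e.g. Porter) and real lemmatizers are far richer.

SUFFIX_RULES = ["ing", "able", "s"]  # hypothetical rule list
LEMMA_TABLE = {  # hypothetical "language knowledge"
    "ate": "eat",
    "eats": "eat",
    "eating": "eat",
    "better": "good",
}


def toy_stem(word):
    """Strip the first matching suffix, with no knowledge of the language."""
    for suffix in SUFFIX_RULES:
        # Keep at least three characters so very short words are left alone.
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def toy_lemmatize(word):
    """Look the word up in the table; unknown words are returned unchanged."""
    return LEMMA_TABLE.get(word, word)


for w in ["eating", "eats", "ate", "adjustable", "better"]:
    print(w, "|", toy_stem(w), "|", toy_lemmatize(w))
```

The suffix rules cannot turn "ate" into "eat" because no suffix matches, while the lookup can. That gap is what the description means by "knowledge of the language".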

🔖Hashtags🔖

#nlp #nlptutorial #nlppython #spacytutorial #spacytutorialnlp #nlptutorialpython #naturallanguageprocessingstemming #nlpstemming #nlpstemmingtutorial #stemming #lemmatization

❗❗ DISCLAIMER: All opinions expressed in this video are my own and not those of my employers.
Comments
Author

I love the way you explain other NLP concepts, customizing the pipeline for example!

Breaking_Bold
Author

Stemming (stripping affixes) vs. lemmatization (mapping to the base word): 4:50
Note: spaCy does not support stemming.

Code: stemming

import nltk
import spacy  # used by the lemmatization code further down
from nltk.stem import PorterStemmer

stemmer = PorterStemmer()
words = ["eating", "eats", "eat", "ate", "adjustable", "rafting", "ability", "meeting"]
for word in words:
    # Print each word next to its rule-based stem
    print(word, "|", stemmer.stem(word))



Code: lemmatization

nlp = spacy.load("en_core_web_sm")
doc = nlp("eating eats eat ate adjustable rafting ability meeting better")
for token in doc:
    # lemma_ is the lemma as text; lemma is its integer hash ID
    print(token, "|", token.lemma_, "|", token.lemma)



Custom lemmatization

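Code:

# NOTE: the right-hand side of this assignment was lost in the original comment.
# Assumption: it fetched spaCy's attribute ruler, whose add(patterns, attrs) matches the call below.
ar = nlp.get_pipe("attribute_ruler")
ar.add([[{"TEXT": "Bro"}], [{"TEXT": "Brah"}]], {"LEMMA": "Brother"})
doc = nlp("Bro, you wanna go ? Brah, don't say no ! I am exhausted")
for token in doc:
    print(token.text, "|", token.lemma_)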

ayushgupta
Author

Very helpful! Looking forward to the rest of the series! Thank you!

amandaahringer
Author

There is a quiz now!! Thank you for your awesome work♥♥♥

pphantom
Author

Fantastic... you make complex NLP topics simple!

Breaking_Bold
Author

You are my teacher and I am proud of you.

belfloretkoriciza
Author

This is some quality content.
Thank you!

aintgonhappen
Author

8:36 I noticed that the prebuilt language pipelines return an unexpected lemma for "ate". I assumed that lg and trf pipelines would produce ate -> eat while the sm and md pipelines would produce ate -> ate, but that doesn't seem to be the case.

import spacy

def eat_lemma(lang_pipeline):
    # Load the named pipeline and lemmatize the single word "ate"
    nlp = spacy.load(lang_pipeline)
    doc = nlp("ate")
    print(lang_pipeline, '|', doc[0].lemma_)

lp = ["en_core_web_sm", "en_core_web_md", "en_core_web_lg", "en_core_web_trf"]
for lang_pipeline in lp:
    eat_lemma(lang_pipeline)

en_core_web_sm | ['eat']
en_core_web_md | ['ate']
en_core_web_lg | ['eat']
en_core_web_trf | ['ate']

Update: I see that when "ate" is used in the context of a sentence each pipeline produces a lemma of "eat".

doc = nlp("The person ate an apple.")
en_core_web_sm | ['the', 'person', 'eat', 'an', 'apple', '.']
en_core_web_md | ['the', 'person', 'eat', 'an', 'apple', '.']
en_core_web_lg | ['the', 'person', 'eat', 'an', 'apple', '.']
en_core_web_trf | ['the', 'person', 'eat', 'an', 'apple', '.']
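
One possible reading of the output above, assuming the lemma is chosen from the word together with its part-of-speech tag (as in spaCy's rule-based English lemmatizer): a lone word gives the tagger no context, so the tag, and with it the lemma, can differ between pipelines. The toy lookup below only illustrates that dependency. `LEMMAS_BY_POS` and `toy_lemma` are hypothetical names, and no spaCy call is involved.

```python
# Toy sketch: the lemma is looked up by (word, part of speech), so the same
# surface form can map to different lemmas. The table is hypothetical.
LEMMAS_BY_POS = {
    ("ate", "VERB"): "eat",
    ("meeting", "VERB"): "meet",
    ("meeting", "NOUN"): "meeting",
}


def toy_lemma(word, pos):
    """Return the lemma for this (word, pos) pair, or the word unchanged."""
    key = (word.lower(), pos)
    return LEMMAS_BY_POS.get(key, word.lower())


print(toy_lemma("ate", "VERB"))  # tagged as a verb, as inside a sentence
print(toy_lemma("ate", "NOUN"))  # a hypothetical mis-tag for an isolated word
```

In a full sentence such as "The person ate an apple." the surrounding words make the verb tag much more likely, which would be consistent with all four pipelines agreeing on "eat".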

amandaahringer
Author

If possible, please try to hold live sessions; it would be helpful.

aashishmalhotra
Author

Sir, it would be very helpful if you made an NLP project, such as a chatbot, at the end of the series. Thanks for making this series.

raphayzia
Author

Sir, will you please share the PPTs as well? That would help in clarifying the concepts.

rohanthite
Author

Hey guys, when we apply stemming or lemmatization before training, we simply replace the words in the data. Yet after training, the model can produce words that differ from the lemmatized forms. I mean, we teach the model `eat`, but the model also learns `ate`. How?

berkayates
Author

Hello sir, if I want to stem and lemmatize my string at the same time, how would I do that, given that spaCy doesn't allow stemming and NLTK doesn't allow lemmatization? Please answer ASAP.

zaytech
Author

Hey!
Firstly, this is a very good series. But in the last part of the exercise, which uses lemmatization, some of my words were converted as expected ("cooking" to "cook", "playing" to "play"), while "running" stayed as it is. Do you know what the issue could be?
Or do you have any explanation for this?

Thank you.

JayShah-mv
Author

Hi sir, a request for you to make some videos on Python.

anaschoudhari
Author

I am unable to install the Ai4bharat package on my PC.

Is there a solution for that error?

firdospathan
Author

Which one are you? Marc Spector or Steven Grant??

GAURAVRAUL
Author

Sir, about a year ago my PC was hacked by the .gujd ransomware. Please, how can I get my data back? 🙏 Please help me, some important data is there.

Telugu-Tech-suport
Author

Hey, aren't you the moon knight?

leoxu