Tokenization in Spacy: NLP Tutorial For Beginners - S1 E8

Word and sentence tokenization can be done easily using the spaCy library in Python. In this NLP tutorial, we will cover tokenization and a few related topics.
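
As a rough illustration of the word and sentence tokenization described above, here is a minimal sketch. It assumes spaCy v3 is installed and uses a blank English pipeline plus the rule-based sentencizer component; the sample sentence is made up for illustration and is not taken from the video's notebook.

```python
import spacy

# A blank English pipeline contains only the tokenizer,
# so no trained model needs to be downloaded.
nlp = spacy.blank("en")

# Illustrative text (an assumption, not from the video).
text = "Dr. Strange paid $2 for lunch. Then he left."

# Word tokenization: calling the pipeline returns a Doc of Token objects.
doc = nlp(text)
tokens = [token.text for token in doc]
print(tokens)

# Sentence segmentation: the rule-based "sentencizer" component
# marks sentence boundaries at sentence-final punctuation tokens.
nlp.add_pipe("sentencizer")
sentences = [sent.text for sent in nlp(text).sents]
print(sentences)
```

Note that "Dr." stays a single token while "$2" is split into "$" and "2", and the sentencizer does not break the sentence after "Dr." because that token is not a bare period.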

⭐️ Timestamps ⭐️
00:00 What is tokenization
02:35 Install spacy
02:49 Coding starts
03:23 Basic English word tokenization
14:15 Span object
15:00 Token attributes
18:40 Grab emails from the student information doc
23:58 Tokenization in Hindi
26:13 Customize tokenization rule
29:52 Sentence tokenization (or segmentation)
33:15 Exercise

Exercise: In the above code, go to the end and you will find exercises

🔖Hashtags🔖

#nlp #nlptutorial #nlppython #spacytutorial #spacytutorialnlp #wordtokenization #tokenizerspacy #tokenizationnlp #wordtokenizerspacy #tokenizationandspacy #spacynlp

❗❗ DISCLAIMER: All opinions expressed in this video are my own and not those of my employers.
Comments
Author

I want to write this every time I go through your YouTube videos (earlier Deep Learning and now NLP)...

You are an outstanding educator. Your practice of illustrating complex concepts with pertinent use cases adds an engaging dimension to the learning experience.

Your proficiency in simplifying intricate ideas with clarity is truly remarkable. Your sense of timing in presenting crucial details is impeccable, and your suggested reading resources are exceptionally valuable.

Thank you for putting in the effort to create such useful learning material.

ajaythapar
Author

Good intro to NLP concepts, Dhawal. Btw, as someone who has worked on large-scale NLP projects here in Toronto, I can vouch that FirstLanguage NLP APIs are right up there with one of the biggest cloud service providers' speech SDK, and at a fraction of the cost! And the co-founder is a PhD specializing in NLP herself.

saarthaksangamnerkar
Author

Pretty much loved it all. I'm on a watching spree: 8th lesson in 24 hours :)

gautamnayak
Author

spaCy makes NLP implementation easy, just the way CODEBASICS makes learning NLP easy.

aakuthotaharibabu
Author

00:02 Tokenization is the process of splitting text into meaningful segments.
02:21 Tokenization in spaCy
07:22 spaCy's tokenization splits currency and punctuation into separate tokens
09:39 Tokenization in spaCy involves splitting text into separate tokens based on prefixes, suffixes, and exceptions.
14:48 Tokenization in spaCy allows for the identification and classification of different attributes of tokens.
17:19 Tokenization in spaCy
21:52 Exploring the attributes of spaCy tokens.
24:17 Tokenization in spaCy allows you to type in different languages even with an English keyboard
29:02 Tokenization in spaCy allows splitting the text into segments
31:13 Tokenization is an essential part of the spaCy pipeline.
35:20 Tokenization in spaCy

SurajIntelligentBrains
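
The prefix, suffix, and exception splitting summarized in the comment above can be inspected directly. The following is a minimal sketch, assuming spaCy v3, that uses the tokenizer's explain method to show which kind of rule produced each token. The sample string is illustrative only.

```python
import spacy

# Blank English pipeline: tokenizer only, no trained model required.
nlp = spacy.blank("en")

# Illustrative input containing quotes, a contraction, and punctuation.
text = '"Let\'s go!"'

# Tokenizer.explain returns (rule_name, token_text) pairs, where the rule
# name indicates a prefix, suffix, special case (exception), or plain token.
explanation = nlp.tokenizer.explain(text)
for rule, token_text in explanation:
    print(f"{token_text}\t{rule}")

# The explained tokens should line up with the normal tokenizer output.
explained_tokens = [token_text for _, token_text in explanation]
actual_tokens = [token.text for token in nlp(text)]
print(explained_tokens == actual_tokens)
```

Here the opening quote is consumed as a prefix, "Let's" is handled by a tokenizer exception that yields "Let" and "'s", and the "!" and closing quote are stripped as suffixes.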
Author

I think that's because "Dr. Strange" has a space in between. When it's removed, "Dr.Strange" stays together in one sentence. Thanks for the videos!

vigneshpadmanabhan
Author

Kindly make some videos on how to vectorize source code for training a DL model.

santoshsaklani
Author

In my code, like_email is giving an empty list.

kirtipant
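
For anyone comparing against the empty-list result mentioned above, here is a minimal sketch of collecting email addresses with the like_email token attribute. It assumes spaCy v3; the sample text and addresses are hypothetical stand-ins for the student information document used in the video. The commenter's code is not shown, so this is only a reference to compare against, not a diagnosis.

```python
import spacy

# Blank English pipeline: like_email is a lexical attribute,
# so it works without a trained model.
nlp = spacy.blank("en")

# Hypothetical stand-in for the student information text.
text = (
    "Alice alice@example.com joined in May\n"
    "Bob bob@example.org joined in June"
)

# Pass the whole string to nlp() and iterate over the resulting Doc;
# the attribute lives on Token objects, not on plain strings.
doc = nlp(text)
emails = [token.text for token in doc if token.like_email]
print(emails)
```

One thing worth checking in your own code is that the loop runs over tokens of a Doc built from the full text, and that the addresses are separated from surrounding words by whitespace.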
Author

I made it here.

Let's see how far I can go.

jesuyanmifeegbewale
Author

My spaCy is tokenizing words like #hello into # and hello. I want to prevent that. Is there something I can do?

PrabinKumarDas
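
One possible way to keep hashtags such as #hello together is to drop "#" from the tokenizer's prefix rules. The sketch below assumes spaCy v3 and that "#" appears as its own entry in the default English prefix list. It is one option among several (adding a special case for a specific string is another), not necessarily the approach shown in the video.

```python
import spacy
from spacy.util import compile_prefix_regex

# Default behaviour: "#" is treated as a prefix and split off.
default_nlp = spacy.blank("en")
default_tokens = [t.text for t in default_nlp("I love #hello")]
print(default_tokens)

# Customized tokenizer: rebuild the prefix regex without the "#" entry.
custom_nlp = spacy.blank("en")
prefixes = [p for p in custom_nlp.Defaults.prefixes if p != "#"]
custom_nlp.tokenizer.prefix_search = compile_prefix_regex(prefixes).search

custom_tokens = [t.text for t in custom_nlp("I love #hello")]
print(custom_tokens)
```

This changes the rule for every word starting with "#", so it is worth checking that nothing else in your text relied on "#" being split off.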
Author

You always say 'this technique will be covered later', but you don't explain those techniques.

geetharajamanickam
Author

Exercise 1 solution:
for token in doc:
    if token.like_url:
        print(token)

ramandeepbains
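
The solution above relies on a doc object defined earlier in the exercise notebook. A self-contained version might look like the following sketch, which assumes spaCy v3 and uses a made-up sentence rather than the exercise's actual text.

```python
import spacy

# Blank English pipeline: like_url is a lexical attribute,
# so no trained model is required.
nlp = spacy.blank("en")

# Hypothetical sample text containing two URLs.
text = "Data is at http://www.data.gov and https://www.example.org/reports today"

doc = nlp(text)

# Keep only the tokens that look like URLs.
urls = [token.text for token in doc if token.like_url]
print(urls)
```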
Author

Really enjoying this playlist, and I've reached the 8th tutorial already just in 1 day. Thank you for making it interesting!

dikshyakasaju
Author

Hi sir, can you please share your views on data analyst jobs in government bodies in India, and their pros and cons?

datayogi_
Author

Do you want to learn technology from me? codebasics.io is my website for video courses. First course going live in the last week of May, 2022

codebasics
Author

Sir, is it possible to create voice recreation?

Please make a video on it ☺☺☺

shashankk
Author

Thanks! Your videos are really helpful. You are doing a great job of explaining complex topics. Thanks once again.

PrathmeshBodas
Author

You make wonderful videos! 👏 I have a quick question: 🤷‍♂️ I have a set of words 🤷‍♂️. (behave today finger ski upon boy assault summer exhaust beauty stereo over). How do I use this? 🤨

CendrillonSympson
Author

Thank you so much for this great explanation and the exercises. Great work!

pphantom
Author

Sir, how are you getting syntax recommendations while you type the function?

anirudhsom