NER With Transformers and spaCy (Python)

Named entity recognition (NER) consists of extracting 'entities' from text. For example, take the sentence:

"Apple reached an all-time high stock price of 143 dollars this January."

We might want to extract the key pieces of information - or 'entities' - and categorize each one, like so:

- Apple: Organization
- 143 dollars: Monetary Value
- this January: Date

For us humans, this is easy. But how can we teach a machine to distinguish between a Granny Smith apple and the Apple we trade on NASDAQ?

(No, we can't rely on the 'A' being capitalized…)

This is where NER comes in - using NER, we can extract keywords like 'Apple' and identify that it is, in fact, an organization - not a fruit.

The go-to library for NER is spaCy, which is incredible. But what if we added transformers to spaCy? Even better - we'll cover exactly that in this video.
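As a rough illustration of the idea, here is a minimal sketch of entity extraction with spaCy. It assumes spaCy and a trained English pipeline are installed; the model names `en_core_web_sm` and `en_core_web_trf` are spaCy's published pipelines, but the helper names `load_pipeline` and `extract_entities` are made up for this example and are not taken from the video.

```python
from typing import Callable, List, Tuple


def load_pipeline(model_name: str = "en_core_web_sm"):
    """Load a trained spaCy pipeline by name.

    Assumes spaCy and the named model are installed. For a
    transformer-based pipeline, "en_core_web_trf" can be passed
    instead (it also needs the spacy-transformers package).
    """
    import spacy  # imported lazily so extract_entities works without it

    return spacy.load(model_name)


def extract_entities(nlp: Callable, text: str) -> List[Tuple[str, str]]:
    """Run a pipeline over the text and return (entity text, label) pairs."""
    doc = nlp(text)
    # doc.ents holds the entity spans found by the pipeline's NER component
    return [(ent.text, ent.label_) for ent in doc.ents]


# Hypothetical usage (not run here, since it needs a downloaded model):
#   nlp = load_pipeline("en_core_web_sm")
#   sentence = ("Apple reached an all-time high stock price "
#               "of 143 dollars this January.")
#   for text, label in extract_entities(nlp, sentence):
#       print(text, label)
```

The labels come from whichever pipeline is loaded; spaCy's English pipelines use tags such as ORG, MONEY, and DATE rather than the longer names in the list above. Which spans a given model actually picks out can vary between the small and transformer pipelines.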

🤖 70% Discount on the NLP With Transformers in Python course:

Comments

New subscriber here! Really enjoying your videos, especially on NLP. Would love to see more NLP stuff with PyTorch. I really enjoyed your sentence similarity demo using BERT, and more use cases of BERT or other models would be great! Keep up your awesome work, thanks!

wokechildmob

Hello, could you suggest some patterns to use with POS tags on a textual dataset in the financial domain for my thesis work?

Genesisify

Great video! For some specific use cases, it'd be interesting to train custom NER models with spaCy / Transformers. Have you tried that?

AhmedBesbes

I have a large amount of data scraped from the internet for NER training. I used the spaCy NER pipeline with a BERT model for fine-tuning, but I am getting a warning like "Token indices sequence length is longer than the specified maximum sequence length for this model (639 > 512). Running this sequence through the model will result in indexing errors" ... I am also getting a lot of garbage data when I view the results in displaCy. I haven't removed newline characters and the like from the data. Could this be the reason for the garbage data?

aniiyain

spaCy can't identify "alabama" as an entity. Do you think a transformer can do better?

saw

Good Video! Could you please upload a video on NER with SpaCy and Transformers (HuggingFace) together?

urshema