Bag of Words

Analyzing and quantifying unstructured data, such as text, is at the core of natural language processing. In this short video, Max Margenot, director of data science, explains how to preprocess a text document using tokenization and stemming to create a bag of words for use in any kind of model, including sentiment models.
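
The steps named in the description (tokenize, stem, count) can be sketched in a few lines of Python. This is only an illustrative sketch, not the code shown in the video: the function names `tokenize`, `toy_stem`, `bag_of_words`, and `to_vector` are hypothetical, the sample sentences are made up, and `toy_stem` is a deliberately crude suffix stripper standing in for a real stemmer such as the Porter stemmer.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def toy_stem(token):
    """Strip one common English suffix.

    Assumption: this is a crude stand-in for a real stemming algorithm.
    It only removes "ing", "ed", or "s" when at least 3 letters remain.
    """
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def bag_of_words(text):
    """Map each stemmed token to the number of times it occurs in the text."""
    return Counter(toy_stem(token) for token in tokenize(text))


def to_vector(bag, vocabulary):
    """Turn a bag into a list of counts ordered by a shared vocabulary."""
    return [bag[term] for term in vocabulary]


# Made-up example documents (not taken from the video).
docs = [
    "The market rallied as traders cheered.",
    "Traders feared the market falling.",
]

bags = [bag_of_words(doc) for doc in docs]

# The shared vocabulary is the sorted set of all stems seen in any document.
vocabulary = sorted(set().union(*bags))

# One count vector per document; word order within a document is discarded.
vectors = [to_vector(bag, vocabulary) for bag in bags]

print(vocabulary)
for vector in vectors:
    print(vector)
```

Each document ends up as a vector of counts over the same vocabulary, which is the form a downstream model (a sentiment classifier, for example) would typically consume. Word order is lost along the way, which is where the name "bag" comes from.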

Disclaimer
Quantopian provides this presentation to help people write trading algorithms - it is not intended to provide investment advice.

More specifically, the material is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory or other services by Quantopian.

In addition, the content neither constitutes investment advice nor offers any opinion with respect to the suitability of any security or any specific investment. Quantopian makes no guarantees as to accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.
Comments

Finally, I got the idea of BOW. I watched many video tutorials, read papers, and dug through tons of docs. I thought I could never understand what it really is. Your video made my day. Thank you for the simplest explanation!

mokhiyakhonuzokova

This 3-minute video was so awesome, and even the animations were brilliant!
Hats off to you, man!

devanshmesson

Thanks for your video. I'm doing data labelling, so now I have an idea of how it works in machine learning.

irazira

If I understand it right, it counts how many times each word appears and represents the sentence as a vector of those counts?

竹子君冲呀