Text Mining Basics in Python

Welcome to another module of the Practical Data Science course. In this module, we cover the basics of text mining. After completing it, you should be comfortable with the core concepts of text mining. The module starts with the definition of text mining and then walks through the text mining process. Next, we look at applications of text mining, followed by its advantages and challenges. Finally, we implement the basic concepts of text mining in Python.

Text mining is the process of exploring and analyzing large amounts of unstructured text data with the help of software that can find concepts, patterns, topics, keywords, and other attributes in the data.
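
As a rough illustration of what "finding keywords" in unstructured text can mean, here is a minimal sketch that uses only the Python standard library. The sample text, the tiny stop-word set, and the helper name top_keywords are all made up for this example; real text mining tools are far more sophisticated.

```python
import re
from collections import Counter

# Hypothetical sample text and stop-word set, made up for this sketch
SAMPLE = (
    "The delivery was late. Late delivery again! "
    "Support was helpful, but the delivery is still late."
)
STOP_WORDS = {"the", "was", "is", "but", "again", "still"}


def top_keywords(text, stop_words, n=3):
    """Return the n most frequent non-stop-word tokens in text."""
    # Lowercase, then pull out runs of letters as crude word tokens
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in stop_words)
    return counts.most_common(n)


print(top_keywords(SAMPLE, STOP_WORDS))
```

Even this crude count surfaces the recurring complaint in the sample text, which is the kind of pattern text mining software looks for at scale.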

It is also called text analytics, though some consider the two terms distinct: in their view, text analytics is the application that sorts through data sets using text mining techniques. You will also hear "text data mining" or "document mining" used instead of text mining. Whichever name is used, it refers to the same thing: the process of exploring unstructured text data to discover useful information.

Text mining has become more useful for data scientists and other users since big data platforms and deep learning algorithms that can analyze large amounts of unstructured data have become available.

Mining and analyzing text helps businesses find potentially valuable insights in corporate documents, customer emails, call center logs, verbatim survey comments, social network posts, medical records, and other text-based data sources. Text mining is also increasingly used in AI chatbots and virtual agents that companies use to respond to customers automatically as part of their marketing, sales, and customer service operations.

Here is the code used in this tutorial:

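import nltk
from nltk.tokenize import sent_tokenize, word_tokenize
from nltk.probability import FreqDist
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer

# One-time download of the NLTK resources used below
# (recent NLTK releases may require "punkt_tab" instead of "punkt")
for resource in ("punkt", "stopwords", "wordnet"):
    nltk.download(resource)

text = '''Hello Mr. Jones, how are you doing today? The weather is great, and city is awesome.
The sky is bright-blue. You shouldn't call for meeting today'''

# Sentence tokenization
tokenized_text = sent_tokenize(text)
print(tokenized_text)

# Word tokenization
tokenized_word = word_tokenize(text)
print(tokenized_word)

# Frequency distribution of the word tokens
frequency = FreqDist(tokenized_word)
print(frequency)

# English stop words
stop_words = set(stopwords.words("english"))
print(stop_words)

# Filter the word tokens. Looping over tokenized_text (whole sentences)
# would remove nothing, because no sentence is itself a stop word.
filtered_sent = []
for w in tokenized_word:
    if w not in stop_words:
        filtered_sent.append(w)
print("Tokenized Sentence: ", tokenized_word)
print("Filtered Sentence: ", filtered_sent)

# Stemming
ps = PorterStemmer()
stemmed_words = []
for w in filtered_sent:
    stemmed_words.append(ps.stem(w))
print("Filtered Sentence:", filtered_sent)
print("Stemmed Sentence:", stemmed_words)

# Lemmatization compared with stemming
lem = WordNetLemmatizer()
stem = PorterStemmer()

word = "Working"
print("Lemmatized Word: ", lem.lemmatize(word, "v"))
print("Stemmed Word: ", stem.stem(word))

word = "Flying"
print("Lemmatized Word: ", lem.lemmatize(word, "v"))
print("Stemmed Word: ", stem.stem(word))

# Tokenize a new sentence
sentence = "Albert Einstein was born in Ulm, Germany in 1879"
tokens = word_tokenize(sentence)
print(tokens)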
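
A note on the stop-word step: it only has an effect when it runs over word tokens rather than whole sentences. The following is a minimal sketch of that difference in plain Python, with no NLTK required. The stop-word set, the two sample sentences, and the helper name drop_stop_words are hypothetical stand-ins, and str.split is used in place of a real tokenizer.

```python
# Hypothetical stop-word set and hand-written inputs, made up for this sketch
STOP_WORDS = {"how", "are", "you", "doing", "is", "the"}

sentences = ["how are you doing today", "the weather is great"]
# Crude whitespace split standing in for a real word tokenizer
words = [w for s in sentences for w in s.split()]


def drop_stop_words(items, stop_words):
    """Keep only the items that are not themselves stop words."""
    return [item for item in items if item not in stop_words]


# A whole sentence never equals a single stop word, so nothing is dropped
print(drop_stop_words(sentences, STOP_WORDS))
# Word tokens can match stop words, so the list shrinks
print(drop_stop_words(words, STOP_WORDS))
```

If a filtered list comes back identical to its input, checking whether the loop ran over sentences or over words is a reasonable first step.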

Comments

This is amazing, well structured and right to the point in the explanation, thanks. I am really interested in text mining and text analytics, and I would love to see more about it.

Gorzkun

Thank you for this video! I have a question: after setting the stopwords and looking at the filtered sentence (19:53), why is the filtered sentence equal to the tokenized sentence when the stopword list includes e.g. "doing"? Shouldn't it be deleted from the filtered sentence? An explanation would help me a lot. Thank you!

mentalresilience

I've been trying to work out why I get this type of error. Hope you can help.

TypeError Traceback (most recent call last)
in <cell line: 8>()
6
7 word = "Working"
----> 8 print("Lemmatized Word: ", lem.lemmatize(word, "v"))
9 print("Stemmed Word: ", stem.stem(word))
10

TypeError: 'tuple' object is not callable

cristopherespiritu
