Python - Text Summarizer with NLTK

How to develop a text summarizer with the natural language processing module NLTK
This part covers:
1. Data collection from the web through web scraping
2. Data cleanup (special characters, numeric values, stopwords, punctuation, etc.)
3. Tokenization: creation of tokens (word tokens and sentence tokens)
4. Calculating the frequency of each word, excluding stopwords
5. Calculating the weighted frequency of each word
6. Calculating sentence scores based on the words within each sentence
7. Creating a summary from the 10 highest-scoring sentences
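
Steps 3 to 7 of the list above can be sketched roughly as follows. This is only an illustration, not the code from the video: it swaps NLTK's tokenizers for plain regular expressions so that it runs with the standard library alone, and the `summarize` function, the `STOPWORDS` set and the sample text are all assumptions made for the example.

```python
import heapq
import re

# Tiny illustrative stopword list (an assumption; NLTK ships a much larger one).
STOPWORDS = {"a", "an", "and", "the", "is", "are", "of", "to", "in", "it", "for", "on", "by", "with"}


def summarize(text, top_n=10):
    # Step 3: sentence and word tokens (regex stand-ins for NLTK's tokenizers).
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = re.findall(r"[a-z]+", text.lower())

    # Step 4: count each word, skipping stopwords.
    word_frequencies = {}
    for word in words:
        if word not in STOPWORDS:
            word_frequencies[word] = word_frequencies.get(word, 0) + 1
    if not word_frequencies:
        return ""

    # Step 5: weighted frequency = count divided by the highest count.
    max_frequency = max(word_frequencies.values())
    weighted = {w: c / max_frequency for w, c in word_frequencies.items()}

    # Step 6: a sentence's score is the sum of the weights of its words.
    scores = {}
    for sentence in sentences:
        for word in re.findall(r"[a-z]+", sentence.lower()):
            if word in weighted:
                scores[sentence] = scores.get(sentence, 0.0) + weighted[word]

    # Step 7: keep the top_n highest-scoring sentences, best first.
    best = heapq.nlargest(top_n, scores, key=scores.get)
    return " ".join(best)


print(summarize("Cats chase mice. Cats like cats. Dogs bark.", top_n=1))
```

Note that this sketch returns the chosen sentences in score order; sorting them back into their original order is a separate design choice.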
Comments

It looks like you have not installed BeautifulSoup, a web-scraping library for extracting data from HTML pages. Judging by the path in the error, you are on a Windows machine, so please open the command line and run: pip3 install beautifulsoup4. If pip3 is not installed, run: easy_install beautifulsoup4. Please let me know if you face any other issues.

renujain

Hi, thanks, this is useful to me. I followed the steps you showed. When I run it, after the first print command I get the errors shown below:
RESTART:
Traceback (most recent call last):
  File "C:/Users/pc/AppData/Local/Programs/Python/Python36/textsummarizar.py", line 2, in <module>
    from bs4 import beautifulsoup as bs
ModuleNotFoundError: No module named 'bs4'.
It is looking for bs4.
How can I get it? Should I create it?
I am using Python 3.6.6.

dr.m.kumarasamy
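
A small check like the one below can confirm whether the bs4 package is visible to the interpreter that runs the script. It is a sketch only; the helper name `bs4_is_installed` is made up for illustration. It also uses the class name `BeautifulSoup` in its exact capitalisation, because the lowercase `beautifulsoup` in the traceback above would still fail with an ImportError once the package is installed.

```python
import importlib.util


def bs4_is_installed():
    # find_spec returns None when the top-level package cannot be found.
    return importlib.util.find_spec("bs4") is not None


if bs4_is_installed():
    # The class name is case-sensitive: BeautifulSoup, not beautifulsoup.
    from bs4 import BeautifulSoup as bs

    print(bs("<p>ok</p>", "html.parser").p.text)
else:
    print("bs4 is missing; install it with: pip install beautifulsoup4")
```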

Hi, thanks, this is useful to me. When I run it, I get an error on the line 'if word not in word_frequencies.keys():', as shown below:
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
AttributeError: 'tuple' object has no attribute 'keys'

revathianbarasan
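
The commenter's own code is not shown, so the cause can only be guessed. One common way to get this AttributeError is to create `word_frequencies` with parentheses, or with a stray trailing comma, which makes it a tuple instead of a dict. The snippet below illustrates that guess with made-up tokens.

```python
# Guess at the cause: parentheses (or a trailing comma) create a tuple.
broken = ()
print(type(broken).__name__)    # tuple
print(hasattr(broken, "keys"))  # False

# Curly braces create the dict that the counting loop expects.
word_frequencies = {}
for word in ["text", "summary", "text"]:  # made-up tokens
    if word not in word_frequencies:      # same test as "not in word_frequencies.keys()"
        word_frequencies[word] = 1
    else:
        word_frequencies[word] += 1

print(word_frequencies)  # {'text': 2, 'summary': 1}
```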

I don't know why, but when I do it, the ratio comes out greater than 1 for some words.

vasudhatapriya
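
A weighted frequency is a count divided by the highest count, so it cannot exceed 1 as long as the maximum is taken from the same set of counts that is being divided. One possible explanation for ratios above 1 (an assumption, since the commenter's code is not shown) is that the maximum came from a different or outdated collection, for example counts with stopwords removed while the divided counts still include them. A minimal sketch of the safe pattern, with made-up counts:

```python
word_frequencies = {"data": 4, "model": 2, "text": 1}  # made-up counts

# Take the maximum once, from the same dict that is being normalised.
max_frequency = max(word_frequencies.values())

# Build a new dict rather than overwriting the counts while looping.
weighted_frequencies = {
    word: count / max_frequency for word, count in word_frequencies.items()
}

print(weighted_frequencies)  # {'data': 1.0, 'model': 0.5, 'text': 0.25}
```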

I can't run this part successfully. Can you please help me? The error says word_tokens is not found, although I have created it.


word_frequencies = {}

for word in word_tokens:
    if word not in stopwords:
        if word not in word_frequencies.keys():
            word_frequencies[word] = 1
        else:
            word_frequencies[word] += 1

rajeshwarijedhe
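
A "not found" error for `word_tokens` usually means the name does not exist in the session that runs the loop, for instance because it was created in another interpreter session or spelled differently. The sketch below defines everything in one place so the loop can run. The sample text and the stand-in definitions of `word_tokens` and `stopwords` are assumptions made so the snippet works without NLTK; in the tutorial they would presumably come from NLTK's tokenizer and stopword list.

```python
text = "the text summary keeps the key sentences of the text"  # made-up input

# Stand-ins so the snippet runs without NLTK; both definitions are assumptions.
word_tokens = text.split()
stopwords = {"the", "of"}

word_frequencies = {}
for word in word_tokens:
    if word not in stopwords:
        if word not in word_frequencies.keys():
            word_frequencies[word] = 1
        else:
            word_frequencies[word] += 1

print(word_frequencies)
```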