How To Extract Text from Scanned PDF Using NoelOCR - Python

NoelOCR is a Python library, developed by Noel Moses Mwadende, for extracting text from scanned PDFs and full-text PDFs.

How does it work?

The processPDF module from NoelOCR takes a scanned PDF, processes it, and outputs searchable/plain text.
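
The description does not show what processPDF does internally. As a rough sketch of the general scanned-PDF-to-text technique (not NoelOCR's actual code), the example below rasterizes each page and runs OCR on it. It assumes the third-party packages pdf2image and pytesseract are installed, along with the Poppler and Tesseract binaries they rely on. The function names ocr_pages and ocr_scanned_pdf are hypothetical.

```python
from typing import Callable, Iterable, List


def ocr_pages(pages: Iterable, recognize: Callable[[object], str]) -> str:
    """Run `recognize` on every page image and join the results."""
    texts: List[str] = []
    for page in pages:
        texts.append(recognize(page).strip())
    # Separate pages with a blank line so page boundaries stay visible.
    return "\n\n".join(texts)


def ocr_scanned_pdf(pdf_path: str, lang: str = "eng", dpi: int = 300) -> str:
    """Rasterize a scanned PDF and OCR each page (assumed toolchain)."""
    # Imported lazily so ocr_pages can be used without these packages.
    from pdf2image import convert_from_path  # needs Poppler installed
    import pytesseract  # needs the Tesseract binary installed

    images = convert_from_path(pdf_path, dpi=dpi)
    return ocr_pages(images, lambda img: pytesseract.image_to_string(img, lang=lang))
```

Calling ocr_scanned_pdf("scan.pdf") would return the recognized text of every page as one string. The lang argument is passed straight to Tesseract, so other languages should work only if the matching Tesseract language data is installed.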

Who was it developed for?

It was developed for machine learning engineers who work with PDFs. The task might be classification of scanned PDFs, text extraction from scanned PDFs, or any other task that requires feature extraction from scanned PDFs. NoelOCR is also flexible: it can extract text from full-text PDFs as well. That means it works for both full-text and scanned PDFs, although it was purposely created for scanned PDFs.
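
One common way to handle both kinds of PDF is to try the embedded text layer first and fall back to OCR only when a page has little or no extractable text. The sketch below illustrates that idea. It is an assumption about the general approach, not NoelOCR's confirmed behavior. The names choose_text and MIN_CHARS are hypothetical, and the run_ocr callable stands in for whatever OCR engine is used.

```python
from typing import Callable

MIN_CHARS = 20  # hypothetical threshold below which a page is treated as scanned


def choose_text(embedded: str, run_ocr: Callable[[], str],
                min_chars: int = MIN_CHARS) -> str:
    """Return the embedded text if there is enough of it, else the OCR result."""
    cleaned = embedded.strip()
    if len(cleaned) >= min_chars:
        return cleaned  # full-text page: no OCR needed
    return run_ocr().strip()  # scanned page: fall back to OCR
```

Passing OCR as a callable means the slow recognition step runs only for pages that actually need it.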

How do you use it?

import NoelOCR as nm

print(text)

#Convert #scannedPDFtoFulltextPDF #ToTextPython #NoelOCR #Extract #PythonOCR #OCRmyPDF #LearnPython
Comments

I got an error stating "something went wrong" when I tried to read my file. Do you know what that could mean?

esosaosayamwen

Many thanks for your solution.
Does NoelOCR work with other languages (for example, Vietnamese)?

oanminhtruc

Hello, how are you? I am using Kali on VirtualBox and it gives me "Something is Wrong". Could anybody help?

mohamednada