How to use Tesseract OCR in a Python script (pytesseract)

In this video I demonstrate how to use Tesseract OCR to extract text from images from within a Python script.
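
As a rough illustration of what the description refers to, here is a minimal sketch using the pytesseract wrapper. It assumes the Tesseract engine itself is installed separately and that the pytesseract and Pillow packages are available; the function name extract_text and the file name sample.png are hypothetical and not taken from the video.

```python
from pathlib import Path


def extract_text(image_path, lang="eng", tesseract_cmd=None):
    """Return the text Tesseract recognises in the image at image_path."""
    path = Path(image_path)
    if not path.is_file():
        raise FileNotFoundError(f"No image found at {path}")

    # Imported here so a missing package only fails when OCR is actually run.
    import pytesseract
    from PIL import Image

    # Optional: tell pytesseract where the Tesseract binary lives
    # when it is not on the system PATH.
    if tesseract_cmd is not None:
        pytesseract.pytesseract.tesseract_cmd = tesseract_cmd

    # lang selects the trained language data, e.g. "eng" or "rus".
    with Image.open(path) as image:
        return pytesseract.image_to_string(image, lang=lang)
```

Calling print(extract_text("sample.png")) would then print whatever text Tesseract finds in that image. Recognising a language other than English requires the matching language data to be installed alongside Tesseract.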

Extract text from the browser:
Comments

Thank you. I was expecting a bad video because of the view count, but this got right to the point.

YorukaValorant

Nice video! Thanks

Is there a GUI that you recommend for use on Windows?

marceloortiz

How are you? Do you know how I can include the Tesseract OCR executable in my Python executable file? That way, when I distribute my executable, other users can use the OCR without installing the engine on their devices.

Mark_Morad
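
One possible approach to the bundling question above, sketched under assumptions: the executable is built with PyInstaller, and a copy of the Tesseract folder (binary plus tessdata) is added to the bundle, for example through PyInstaller's --add-data option. The folder name "tesseract" and both function names are hypothetical.

```python
import os
import sys
from pathlib import Path


def bundled_tesseract_path(folder_name="tesseract", exe_name="tesseract.exe"):
    """Return the expected location of a bundled Tesseract binary."""
    # PyInstaller exposes the unpacked bundle directory as sys._MEIPASS.
    # Outside a frozen app, fall back to the current working directory.
    base = Path(getattr(sys, "_MEIPASS", os.path.abspath(".")))
    return base / folder_name / exe_name


def configure_pytesseract():
    """Point pytesseract at the bundled binary (assumed layout)."""
    import pytesseract

    exe = bundled_tesseract_path()
    pytesseract.pytesseract.tesseract_cmd = str(exe)
    # Assumption: language data sits in a "tessdata" folder next to the
    # binary. Tesseract 4 and later expect TESSDATA_PREFIX to name that
    # folder itself; older versions may differ.
    os.environ["TESSDATA_PREFIX"] = str(exe.parent / "tessdata")
```

Calling configure_pytesseract() once at startup, before any OCR call, would be enough in this layout. Whether Tesseract's license terms and DLL dependencies allow redistribution in a given setup is something to check separately.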

00:14 Only if you use it for English, Russian, or Chinese text, everyone!

Ueberkombo

Hi, do you know how it would be possible to do live detection with my webcam?

AkhilNagori-vu

I have some 2,000 PDF files, which are invoices. I want the invoice number, date, and total amount from them... Many invoices are in different formats. What is the best way to do it?

YuvrajWithAGuitar
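
For the invoice question above, a common starting point is to run OCR first and then search the raw text with regular expressions. The sketch below covers only the second step; converting PDF pages to images and running Tesseract on them is not shown. The patterns and the function name extract_invoice_fields are illustrative assumptions, and they will miss many real-world layouts.

```python
import re

# Illustrative patterns only; real invoices will need more variants.
INVOICE_NO = re.compile(
    r"invoice\s*(?:no\.?|number|#)\s*[:\-]?\s*([A-Z0-9\-/]*\d[A-Z0-9\-/]*)",
    re.IGNORECASE,
)
DATE = re.compile(r"\b(\d{4}-\d{2}-\d{2}|\d{1,2}[/.\-]\d{1,2}[/.\-]\d{2,4})\b")
TOTAL = re.compile(
    r"\btotal\s*(?:amount|due)?\s*[:\-]?\s*[$€£]?\s*([\d.,]*\d)",
    re.IGNORECASE,
)


def extract_invoice_fields(ocr_text):
    """Return the first match for each field, or None when nothing matches."""

    def first(pattern):
        match = pattern.search(ocr_text)
        return match.group(1) if match else None

    return {
        "invoice_number": first(INVOICE_NO),
        "date": first(DATE),
        "total": first(TOTAL),
    }
```

Fields that come back as None can be routed to manual review, which helps when the formats vary as much as described.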

How can I edit this script to extract text from scanned documents? Thanks.

derekegenti

Hello, I would like to know how to improve the precision of Tesseract without labeling. I am currently working on an invoice OCR project, and my invoices vary enormously in format: I would say nearly 4,000 to 5,000 different formats. The problem I encounter with my OCR (I use Tesseract) is that it extracts the raw text without taking into account that the document is an invoice (the zones, etc.); it retrieves the information line by line. I cannot label the data given the number of invoice formats. What would you suggest for this? Can BERT or spaCy be useful in this case?

stevetedom

I do not understand! You went through the video very quickly. I can't follow it.

Mollory

You need to slow down when explaining and show the steps involved, please.

adejobiolajide