Optical Character Recognition with Python and Tesseract - Reading Text from an Image

In this video, we learn how to read text from an image into a Python application, using Tesseract to perform Optical Character Recognition (OCR).

We load an image into a Pillow Image object, and then use pytesseract to extract the text from that image.
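
As a rough illustration of that step, here is a minimal sketch. It assumes the Tesseract binary is installed and that the `Pillow` and `pytesseract` packages are available; the file name `sample.png`, the commented-out binary path, and the helper names `read_text_from_image` and `tidy` are placeholders for illustration, not code taken from the video.

```python
from pathlib import Path


def read_text_from_image(image_path, lang="eng"):
    """Return the text Tesseract recognises in the image at image_path."""
    # Imported inside the function so the tidy() helper below can still be
    # used on a machine where Pillow / pytesseract are not installed.
    from PIL import Image
    import pytesseract

    # If the tesseract binary is not on PATH, pytesseract can be pointed at it.
    # The path here is an assumption and will differ per system:
    # pytesseract.pytesseract.tesseract_cmd = r"/usr/bin/tesseract"

    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img, lang=lang)


def tidy(raw_text):
    """Strip surrounding whitespace from each line and drop empty lines."""
    lines = (line.strip() for line in raw_text.splitlines())
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    sample = Path("sample.png")  # hypothetical image file
    if sample.exists():
        print(tidy(read_text_from_image(sample)))
```

Raw OCR output often contains stray blank lines and padding, which is why the sketch passes it through a small clean-up helper before printing.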

OCR lets us automate the conversion of text in images or video frames into machine-readable form.

We can then perform text analysis, or store the text in a database or as a file. In this video, we quickly demonstrate one possibility: finding named entities using spaCy.
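
One way the named-entity step might look is sketched below. It assumes spaCy is installed along with the small English model `en_core_web_sm` (an assumption; the video may use a different model), and the helper names `find_entities` and `group_by_label` are hypothetical.

```python
from collections import defaultdict


def find_entities(text, model="en_core_web_sm"):
    """Return (entity text, entity label) pairs found by a spaCy pipeline."""
    # Imported inside the function so group_by_label() below can be used
    # without spaCy installed. The model name is an assumption.
    import spacy

    nlp = spacy.load(model)
    doc = nlp(text)
    return [(ent.text, ent.label_) for ent in doc.ents]


def group_by_label(entities):
    """Group entity texts by label, keeping first-seen order, no duplicates."""
    grouped = defaultdict(list)
    for ent_text, label in entities:
        if ent_text not in grouped[label]:
            grouped[label].append(ent_text)
    return dict(grouped)


if __name__ == "__main__":
    # Stand-in for text that would come from the OCR step.
    ocr_text = "Apple opened a new office in London in 2019."
    try:
        print(group_by_label(find_entities(ocr_text)))
    except (ImportError, OSError):
        # spaCy or the model is not installed in this environment.
        print("spaCy or the model 'en_core_web_sm' is not available.")
```

Grouping by label makes it easy to store, for example, all organisations or all dates from a scanned document in separate database columns.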

📌 𝗖𝗵𝗮𝗽𝘁𝗲𝗿𝘀:
00:00 - Intro
01:00 - Tesseract Setup
02:45 - Reading Image with Pillow
03:25 - Working with pytesseract
05:11 - Finding Named Entities with spaCy

#python #ocr #datascience