Tesseract OCR: Extract Text From Any Image


Have you ever needed to extract text from an image? Maybe you took a screenshot of something, or you need a transcript of a meme. Luckily for you, Tesseract OCR exists to do exactly that.

==========Credits==========
🎨 Channel Art:
All my art was created by Supercozman

#Tesseract #OpticalCharacterRecognition #TesseractOCR #Linux

DISCLOSURE: Wherever possible I use referral links, which means if you click one of the links in this video or description and make a purchase I may receive a small commission or other compensation.

Comments

I've been a subscriber to your channel for a couple of years and I like all of your videos, but this is my favorite kind of video -- about a *useful*, open source tool or utility, not specific to Arch Linux, not about gaming, not about drama in the industry. I'm not saying the other kinds are bad or that you shouldn't make them, but I like this kind the best.

code

I have used Tesseract a bit for digitizing recipes from recipe books. When it does not give good results on a first pass, I have found that altering the image can help a lot. Converting the image to black and white, adjusting the contrast, and even enlarging the image can all improve results.
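The three adjustments named in this comment can be sketched in a few lines. This is only an illustration of what each operation does, not the commenter's actual workflow: the image is modelled as rows of 0-255 grayscale integers, and the function names, the threshold of 128, and the scale factor of 2 are all assumptions. On real files an image library or editor would do the same work.

```python
# Sketch only: a grayscale image is modelled as rows of 0-255 integers.
# All names and default values here are illustrative assumptions.

def stretch_contrast(pixels):
    """Rescale values linearly so the darkest becomes 0 and the brightest 255."""
    lo = min(min(row) for row in pixels)
    hi = max(max(row) for row in pixels)
    if hi == lo:
        # Flat image: nothing to stretch, return a copy unchanged.
        return [row[:] for row in pixels]
    return [[(v - lo) * 255 // (hi - lo) for v in row] for row in pixels]


def to_black_and_white(pixels, threshold=128):
    """Map every pixel to pure black (0) or pure white (255)."""
    return [[255 if v >= threshold else 0 for v in row] for row in pixels]


def enlarge(pixels, factor=2):
    """Nearest-neighbour upscale: repeat each pixel `factor` times both ways."""
    out = []
    for row in pixels:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def preprocess(pixels, factor=2, threshold=128):
    """Contrast stretch, then binarize, then enlarge."""
    stretched = stretch_contrast(pixels)
    return enlarge(to_black_and_white(stretched, threshold), factor)


# A faint 2x2 image whose values all sit in a narrow band.
faint = [[100, 110], [120, 140]]
for row in preprocess(faint):
    print(row)
```

Stretching before thresholding matters in this sketch: without it, every value in `faint` falls on the same side of a threshold of 128 except one, and a slightly darker scan would turn entirely black.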

Mpickles

So if Google is maintaining this project, is Google Lens just a front-end for Tesseract?

Fooftilly

I started using Tesseract for a project to gather the text from the memes hosted on my personal szurubooru, so that I can search for specific text and actually find things among the thousands of images.

It has been very hit or miss. Sometimes it gets the text right down to the punctuation, other times it gets nothing. On low-res, bad-contrast images where I think it has no shot, it gets it; on clean images it gets nothing. Sometimes doing crazy image manipulation helps, sometimes unmodified is best.
What I can say is that handwritten Latin letters are impossible for it, so manga scanlation text is just blank for it, at least with the English language setting.

dergeneralfluff

A small script can be paired with a screenshot utility to send OCR output from screenshots directly to the clipboard.

t

You have a video about every topic. That's awesome.

muctebanesiri

I use a keybinding to invoke a script which takes a screenshot of an area, pipes it to Tesseract, and copies the resulting text to the clipboard.
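A script of this kind might look roughly like the following. Neither comment says which tools were used, so the choices here are assumptions: `maim -s` for the area screenshot and `xclip` for the clipboard (both X11 tools), with hypothetical function names. The one part that comes from Tesseract itself is the `stdin` / `stdout` pair, which tells it to read the image from standard input and print the text instead of writing a file.

```python
import subprocess

# Assumed helper tools (not named in the comments); swap in your own.
SCREENSHOT_CMD = ["maim", "-s"]                        # PNG of a selected area on stdout
CLIPBOARD_CMD = ["xclip", "-selection", "clipboard"]   # reads clipboard text from stdin


def ocr_command(lang=None):
    """Build the Tesseract call: image from stdin, recognized text to stdout."""
    cmd = ["tesseract", "stdin", "stdout"]
    if lang:
        cmd += ["-l", lang]
    return cmd


def screenshot_to_clipboard(lang=None):
    """Grab a screen region, OCR it, copy the text, and return it."""
    png = subprocess.run(SCREENSHOT_CMD, check=True, capture_output=True).stdout
    text = subprocess.run(
        ocr_command(lang), input=png, check=True, capture_output=True
    ).stdout
    # Output is deliberately not captured here: xclip keeps running in the
    # background to serve the selection, and a captured pipe would block.
    subprocess.run(CLIPBOARD_CMD, input=text, check=True)
    return text.decode("utf-8", errors="replace")
```

To use it from a keybinding, add a call to `screenshot_to_clipboard()` at the end of the file and bind the script in your window manager.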

atomixhawk

I use Tesseract in Python to read text and train the machine. Good job explaining this 👏

sazk

The sad thing about pytesseract is that it only works as long as the background of your image is of semi-color; otherwise it messes up everything.

khaibaromari

Thanks for the helpful video Brodie. Do you know if you can use Tesseract to convert a non-OCR'ed PDF into a PDF that contains OCR'ed text?
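One possible approach, sketched under assumptions: Tesseract reads images rather than PDFs, so the pages are rendered to PNG first, here with `pdftoppm` from poppler-utils (an assumed helper, not something mentioned on this page). Tesseract's `pdf` output config then writes a PDF that carries a searchable text layer, and a text file listing one image path per line can be passed as the input to get a multi-page result. The function names and the 300 DPI default are hypothetical.

```python
import subprocess
from pathlib import Path


def rasterize_command(pdf_path, prefix, dpi=300):
    """pdftoppm (assumed helper) writes one PNG per page as <prefix>-<n>.png."""
    return ["pdftoppm", "-r", str(dpi), "-png", str(pdf_path), str(prefix)]


def searchable_pdf_command(image_or_list, output_base, lang="eng"):
    """Tesseract's 'pdf' config writes <output_base>.pdf with a text layer."""
    return ["tesseract", str(image_or_list), str(output_base), "-l", lang, "pdf"]


def ocr_pdf(pdf_path, output_base, workdir, lang="eng"):
    """Render every page to PNG, then OCR all pages into one searchable PDF."""
    workdir = Path(workdir)
    subprocess.run(rasterize_command(pdf_path, workdir / "page"), check=True)
    # Sorting by name is assumed to give page order (zero-padded page numbers).
    pages = sorted(workdir.glob("page-*.png"))
    listing = workdir / "pages.txt"
    listing.write_text("\n".join(str(p) for p in pages) + "\n")
    subprocess.run(searchable_pdf_command(listing, output_base, lang), check=True)
    return Path(str(output_base) + ".pdf")
```

For example, `ocr_pdf("notes.pdf", "notes-ocr", "/tmp/pages")` would be expected to leave `notes-ocr.pdf` in the current directory, provided both tools are installed and `/tmp/pages` exists.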

EastEndKeith

Oh lol, something just hit me: Google Lens uses their own Tesseract OCR to extract text and sends it to your PC where you are logged in with your Google account.

damarh

This seems fairly nice for searching tango. Maybe you should also check whether it does well checking the words one by one, perhaps with some other ways of framing them. I am curious how it works on Middle Eastern languages like Arabic and Hindi though.

someonestolemyname

Does anyone know how I could add a second language in the same command line? I tried the following command and it doesn't work: tesseract filename.jpeg - -l ara[+spa] filename.txt
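For reference, Tesseract takes several languages as a single `-l` value joined with `+`. The square brackets in help text such as `-l LANG[+LANG]` only mark the second language as optional and are not typed literally. The command quoted above also has a stray `-` and puts the output name after the options, whereas the usual order is image, output base, then options, and Tesseract appends `.txt` to the output base itself. A minimal sketch, with a hypothetical helper name, and assuming the trained data for both languages is installed:

```python
def command_with_languages(image, output_base, languages):
    """Build a Tesseract call that recognizes several languages at once."""
    lang_arg = "+".join(languages)  # e.g. ["ara", "spa"] -> "ara+spa"
    return ["tesseract", image, output_base, "-l", lang_arg]


cmd = command_with_languages("filename.jpeg", "filename", ["ara", "spa"])
print(" ".join(cmd))
# prints: tesseract filename.jpeg filename -l ara+spa
```

Running the printed command would write the recognized text to `filename.txt`.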

ELHASSANEMOUMADARFAK

Are you able to input a URL instead of a local file on your PC? This would be very useful.

solidhyrax

Excellent. I made the mistake of writing a couple thousand small notes in the stock Samsung notepad on my phone, and it turns out the garbage developers only allow you to bulk export them as PDFs instead of plain text. This will come in handy.

davidr

Does anyone know how to set up XSane to use Tesseract?

patrickmclaughlin

Google can't just repeat that; Google Drive to Google Docs conversion beats Tesseract.

mattaku

See your odyssey tips and tricks for my comment

uksuperrascal

I'm a huge japan-fan, love the culture, the food and people... but anime/weebs? Cringe.

bologna

I found using --oem 1 helpful. It forces Tesseract to use the newer LSTM neural network engine, which helps in a lot of cases.
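For context, the engine mode is passed as a separate value after `--oem`, and `tesseract --help-oem` lists four modes. The sketch below records them and builds a command for one; the helper name is hypothetical, and which modes actually work depends on the trained data installed (the legacy engine needs models that include it).

```python
# OCR engine modes as listed by `tesseract --help-oem`.
OCR_ENGINE_MODES = {
    0: "legacy engine only",
    1: "neural nets LSTM engine only",
    2: "legacy + LSTM engines",
    3: "default, based on what is available",
}


def command_with_engine(image, output_base, oem=1):
    """Build a Tesseract call that pins the OCR engine mode."""
    if oem not in OCR_ENGINE_MODES:
        raise ValueError(f"unknown OCR engine mode: {oem}")
    return ["tesseract", image, output_base, "--oem", str(oem)]


print(" ".join(command_with_engine("scan.png", "scan")))
# prints: tesseract scan.png scan --oem 1
```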

leoliu
