How To Do OCR on Videos

How to extract video content - YouTube example
Full demonstration of how to use Python to download videos and then EXTRACT TEXT from them using pytesseract and crop the saved images using Pillow (PIL).

As an aside, I also show how you could get the text/transcript if that is what you need, although the transcript is also auto-generated and not 100% accurate either.
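
For the transcript route, here is a minimal sketch of how captions could be fetched without downloading the video. The option names are standard youtube_dl options, but the helper names (caption_options, fetch_captions, vtt_to_text) are made up for illustration, and the VTT clean-up is a simplified assumption rather than a full parser.

```python
import re


def caption_options(lang="en", out_template="%(id)s.%(ext)s"):
    """Build youtube_dl options that fetch captions only, not the video."""
    return {
        "skip_download": True,      # do not download the video itself
        "writesubtitles": True,     # uploader-provided captions, if any
        "writeautomaticsub": True,  # auto-generated captions
        "subtitleslangs": [lang],
        "subtitlesformat": "vtt",
        "outtmpl": out_template,
    }


def fetch_captions(url, lang="en"):
    """Download the caption file(s) for one video URL."""
    import youtube_dl  # imported lazily so the helpers work without it

    with youtube_dl.YoutubeDL(caption_options(lang)) as ydl:
        ydl.download([url])


def vtt_to_text(vtt):
    """Reduce WebVTT content to plain text (simplified: drops headers,
    cue timings, inline tags and immediately repeated lines)."""
    kept = []
    for raw in vtt.splitlines():
        line = re.sub(r"<[^>]+>", "", raw).strip()
        if not line or "-->" in line:
            continue
        if line.startswith(("WEBVTT", "Kind:", "Language:")):
            continue
        if kept and kept[-1] == line:
            continue  # auto-generated captions often repeat a line
        kept.append(line)
    return " ".join(kept)
```

Calling fetch_captions with a video URL should leave a .vtt file next to the script, which vtt_to_text can then flatten into plain text.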

How to scrape videos (content) - from YouTube*
* could be from any video/website

OCR and Image manipulation with Tesseract and Pillow
(Python-tesseract is an optical character recognition (OCR) tool for Python).

What this video covers:
⭕ How to download a video with Python - using youtube-dl
⭕ Extract TEXT from video
⭕ Extract images from video
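
The first and last steps above could be sketched roughly like this. The download uses youtube_dl; the frame grabbing uses OpenCV (cv2), which is an assumption on my part since the description does not say which tool pulls the images out of the video. The function names and the one-frame-per-second sampling are hypothetical choices.

```python
import os


def download_options(out_template="video.%(ext)s"):
    """youtube_dl options: prefer an mp4 stream, fall back to the best one."""
    return {"format": "best[ext=mp4]/best", "outtmpl": out_template}


def download_video(url, out_template="video.%(ext)s"):
    """Download one video with youtube_dl."""
    import youtube_dl  # imported lazily so the helpers work without it

    with youtube_dl.YoutubeDL(download_options(out_template)) as ydl:
        ydl.download([url])


def frame_indices(total_frames, fps, every_seconds=1.0):
    """Indices of the frames to keep when sampling every N seconds."""
    step = max(1, int(round(fps * every_seconds)))
    return list(range(0, total_frames, step))


def save_frames(video_path, out_dir, every_seconds=1.0):
    """Write sampled frames as PNG files and return their paths."""
    import cv2  # assumption: OpenCV is used for frame extraction

    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0  # fall back if fps is unknown
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    saved = []
    for idx in frame_indices(total, fps, every_seconds):
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)  # jump to the wanted frame
        ok, frame = cap.read()
        if not ok:
            break
        path = os.path.join(out_dir, "frame_%06d.png" % idx)
        cv2.imwrite(path, frame)
        saved.append(path)
    cap.release()
    return saved
```

Sampling one frame per second keeps the number of images manageable; a smaller every_seconds value catches text that is only on screen briefly, at the cost of many more files to OCR.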

Using the following:
- youtube_dl (pip install youtube_dl) to get the video
- Tesseract to get the TEXT
- Pillow to crop the images
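
Here is a minimal sketch of the crop-then-OCR step with Pillow and pytesseract, assuming the frames are already saved as image files and the Tesseract binary is installed. The names crop_box and ocr_frame, and the idea of giving the crop region as fractions of the frame size, are illustrative assumptions and not necessarily how the video does it.

```python
def crop_box(width, height, top=0.0, bottom=1.0, left=0.0, right=1.0):
    """Turn fractional edges into a Pillow (left, upper, right, lower) box."""
    return (
        int(width * left),
        int(height * top),
        int(width * right),
        int(height * bottom),
    )


def ocr_frame(path, cropped_path=None, **region):
    """Crop one saved frame and return the text Tesseract finds in it.

    region takes the same keyword arguments as crop_box, for example
    top=0.25, bottom=0.75 to keep only the middle half of the frame.
    """
    from PIL import Image  # imported lazily so crop_box works without them
    import pytesseract

    with Image.open(path) as img:
        trimmed = img.crop(crop_box(img.width, img.height, **region))
        if cropped_path:
            trimmed.save(cropped_path)  # keep the trimmed image for checking
        return pytesseract.image_to_string(trimmed).strip()
```

Cropping to the area where the text sits before running OCR usually gives Tesseract less clutter to misread, and saving the trimmed image makes it easy to check what was actually passed in.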

--- chapter timings ---
00:00 Intro
01:33 youtube dl
06:24 downloading a video
14:00 processing the video
22:17 looking at the trimmed image files
23:10 looking at the text output


Comments

Man, you finally did it! I'm amazed! So simple, so straightforward, so transparent!

monkey_see_monkey_do

Hello! Do you know how to capture the on-screen text from TV programs (I mean 360p to 720p quality, or maybe lower, I'm not sure) and translate it or put it in a .txt file? I have used several OCR tools: Tesseract OCR, PowerToys and some others, but they do not recognize the text correctly. Text recognition works on some screenshots of TV programs or channels, but not on others. Do you know what to do?

I don't know of any AI that can do this correctly for English and non-English languages...

I need to recognize the text in a video and save it to a .txt file.

АлександрКонюхов-эщ

Is there a Python notebook on your GitHub that would make it easier to copy and modify the code?

ankitranjan

Re: captions / transcripts. Got it. Will noodle around for some batch processing.

michaelmody