Using video2ocr / Tesseract-OCR to extract text from video

This video demonstrates how to use the Tsurugi Linux video2ocr script to extract text from video. video2ocr uses ffmpeg to create screenshots of a target video file, then converts the screenshots to greyscale and uses Tesseract-OCR to extract text from the resulting images. Digital investigations often benefit from optical character recognition (OCR) of images, PDFs and video. video2ocr is a useful tool as long as you understand its limitations.
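
The three stages described above (frame extraction, greyscale conversion, OCR) can be sketched in Python. This is a minimal sketch and not the video2ocr script itself: the function names, the `frame_%06d.png` naming pattern and the defaults are assumptions made for illustration, and it assumes `ffmpeg` and `tesseract` are on the PATH. Greyscale conversion is folded into the ffmpeg filter chain here, which may differ from how video2ocr does it.

```python
import subprocess
from pathlib import Path


def ffmpeg_frames_cmd(video, out_dir, fps=1):
    """Build an ffmpeg command that saves `fps` greyscale PNG frames per second."""
    # Hypothetical naming pattern; ffmpeg fills in the frame counter.
    pattern = str(Path(out_dir) / "frame_%06d.png")
    return ["ffmpeg", "-i", str(video), "-vf", f"fps={fps},format=gray", pattern]


def tesseract_cmd(image, lang="eng"):
    """Build a tesseract command; tesseract appends .txt to the output base."""
    image = Path(image)
    return ["tesseract", str(image), str(image.with_suffix("")), "-l", lang]


def ocr_video(video, out_dir, fps=1, lang="eng"):
    """Extract frames from `video`, then OCR each frame in order."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(ffmpeg_frames_cmd(video, out_dir, fps), check=True)
    for image in sorted(Path(out_dir).glob("frame_*.png")):
        subprocess.run(tesseract_cmd(image, lang), check=True)
```

Calling `ocr_video("talk.mp4", "frames", fps=1)` (both names are placeholders) would leave one `.txt` file next to each extracted frame. Building the commands in separate functions makes it easy to print and inspect them before running anything against evidence files.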

00:00 Introduction to Video2OCR
00:13 Video2OCR in Tsurugi Linux
00:29 Video2OCR and Tesseract OCR
00:43 Video2OCR help menu
01:34 Video2OCR target videos folder
01:59 Set up for video processing
02:31 Example video explanation
03:41 Running Video2OCR
03:51 Check which languages tesseract OCR supports
04:09 Specify frames per second
04:57 Monitor Video2OCR progress
05:25 The Video2OCR process
06:27 OCR analysis results
11:21 Lessons learned from Video2OCR
12:25 Conclusions
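
Two of the chapters above, checking which languages Tesseract supports and choosing a frames-per-second value, lend themselves to a short sketch. The helper names below are hypothetical. `tesseract --list-langs` is a real Tesseract option; the parser assumes its output is a header line starting with "List of" followed by one language code per line. The frame estimate is plain arithmetic (duration times sampling rate), useful for judging how many images a given rate will produce.

```python
import math
import subprocess


def parse_langs(listing):
    """Return language codes from `tesseract --list-langs` text, skipping the header."""
    lines = [line.strip() for line in listing.splitlines() if line.strip()]
    return [line for line in lines if not line.startswith("List of")]


def available_langs():
    """Ask the installed tesseract binary for its language codes."""
    result = subprocess.run(
        ["tesseract", "--list-langs"], capture_output=True, text=True, check=True
    )
    # Assumption: some builds may write the list to stderr, so both streams
    # are read. Any warnings printed there would also end up in the result.
    return parse_langs(result.stdout + "\n" + result.stderr)


def estimated_frames(duration_seconds, fps):
    """Rough number of frames produced when sampling at `fps` frames per second."""
    return math.ceil(duration_seconds * fps)
```

For example, a 90-second clip sampled at 0.5 frames per second gives `estimated_frames(90, 0.5)`, which is 45 images to OCR.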

We demonstrate how to use video2ocr, and show some of its strengths and weaknesses. A simple PowerPoint video is used as a case study.

Links:

010001000100011001010011011000110110100101100101011011100110001101100101
Help make DFIR tutorials

010100110111010101100010011100110110001101110010011010010110001001100101

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Please link back to the original video. If you want to use this video for commercial purposes, please contact us first. We would love to see what you are doing and will probably allow its use.
Comments

Sir, I am currently doing my internship in cyber crime and digital forensic investigation. I have learned a lot of things from you, thank you for that. My question: I recently got a MacBook, and my task is to find how many USB devices have been connected to it. Which tool can I use for that?

anuragjadhav

Hello, thanks for your video. I'm looking for something similar but easier, because I'm not good with computers :P. I use a thermometer, and I would like to film it and extract the temperature every 30 seconds from the video. Is there an easier way than the one in your video? Thanks ;)

nathaliemartin

Is there a simple way to extract all of the text included in a video (not the audio, nor the subtitles generated by YouTube) on a PC or smartphone? I can't find this information anywhere (yet it is easy to do with an image and OCR, and a video is never more than a succession of images).

ffmax

Thanks man :)

It's weird that a single tool doing all of that doesn't exist yet...
In the old days (more than 20 years ago)
we had SubRip to OCR entire DVD Vobs into SRTs...
It was way easier to use (the process was interesting as well).
I know the process was different here, and maybe easier in some ways, as it was a special file with transparent pictures (or was it video? I have to look into that) with only text on it, embedded in the Vob file itself, but still.

dupirechristophe

Awesome video, btw can I use this in a livestream video?

franznikkoisip

Nice video man, I did this with Python a few weeks ago using Tesseract 👍

python

If you're not using OCR in your 🔦investigations, you could be missing a lot of documents!

DFIRScience