How to extract text from multi-page PDF & save it as CSV - Amazon Textract tutorial p4


Welcome to part 4 of the tutorial series on Amazon Textract. In this video, I cover how to extract text from a multi-page PDF file and save the output as CSV.
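As a rough illustration of the PDF-to-CSV step, the sketch below turns Textract `LINE` blocks into CSV text. It is not the code from the video: the function name `lines_to_csv` and the column layout are assumptions, and it uses the standard-library `csv` module instead of pandas. It expects the `Blocks` list that the asynchronous `get_document_text_detection` call returns, collected across all `NextToken` pages.

```python
import csv
import io


def lines_to_csv(blocks):
    """Return CSV text with one row per LINE block.

    `blocks` is the combined "Blocks" list of a Textract text-detection
    result. The column layout (page, line, text, confidence) is an
    illustrative choice, not something Textract prescribes.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["page", "line", "text", "confidence"])

    line_counts = {}  # running line number per page
    for block in blocks:
        if block.get("BlockType") != "LINE":
            continue  # skip PAGE and WORD blocks
        page = block.get("Page", 1)
        line_counts[page] = line_counts.get(page, 0) + 1
        writer.writerow([
            page,
            line_counts[page],
            block.get("Text", ""),
            round(block.get("Confidence", 0.0), 2),
        ])
    return buffer.getvalue()
```

The returned string can be written to S3 or to a local file; because only the standard library is involved, no pandas layer is needed for this variant.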

Comments

Excellent walkthrough of the PDF-to-CSV solution. But I can't seem to import pandas as indicated. How do I get a little help?

dougchristensen

Hi there! Thank you so much for this! Helped me out massively :) Sorry if I missed something, but I was wondering, is there a reason you use SNS instead of an S3 event to trigger the "process textract response" Lambda? Would it be possible to skip SNS and get the same effect with just another S3 event trigger?

JackRobinson-jjcn
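On the SNS-versus-S3 question above, one way to see the difference is to compare what each trigger hands to the Lambda. The sketch below is an assumption-laden illustration, not the video's handler: `describe_trigger` is a hypothetical helper that reads either event shape. An SNS record from Textract carries the `JobId` and `Status` directly, while an S3 record only carries a bucket and an object key, so the handler would have to infer the job and its completion from whatever keys appear in the output location.

```python
import json
from urllib.parse import unquote_plus


def describe_trigger(event):
    """Summarise each record of a Lambda event that came from SNS or S3."""
    described = []
    for record in event.get("Records", []):
        if "Sns" in record:
            # Textract publishes a JSON string with the job outcome.
            message = json.loads(record["Sns"]["Message"])
            described.append({
                "source": "sns",
                "job_id": message["JobId"],
                "status": message["Status"],
            })
        elif "s3" in record:
            # S3 only reports which object was written; keys are URL-encoded.
            described.append({
                "source": "s3",
                "bucket": record["s3"]["bucket"]["name"],
                "key": unquote_plus(record["s3"]["object"]["key"]),
            })
    return described
```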

This was an amazing video, thanks so much! Question for you: how would I get the CSV outputs to match the input PDF file names instead of the Job ID string? So basically, I want input.pdf to return input.csv instead of that long string of numbers.csv

testeleven

Hi Chirag. Thanks for doing this video. By the way, I need to know something: can we save the CSV with the same name as the PDF? Hope to hear from you. Thanks again

georgevavolil
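Both naming questions above come down to recovering the original PDF key in the second Lambda. A minimal sketch, assuming the function is triggered by Textract's SNS completion message: that message includes a `DocumentLocation` object with `S3ObjectName` and `S3Bucket`, so the CSV name can be derived from it rather than from the job ID. The helper name `csv_key_from_notification` and the default `csv` prefix are hypothetical.

```python
import json
import posixpath


def csv_key_from_notification(sns_message, output_prefix="csv"):
    """Build an output key such as "csv/input.csv" from the SNS message body.

    `sns_message` is the JSON string found at
    event["Records"][0]["Sns"]["Message"] in the Lambda event.
    """
    message = json.loads(sns_message)
    pdf_key = message["DocumentLocation"]["S3ObjectName"]
    # Drop any folder prefix and the ".pdf" extension.
    stem = posixpath.splitext(posixpath.basename(pdf_key))[0]
    return posixpath.join(output_prefix, stem + ".csv")
```

Two PDFs with the same file name in different folders would map to the same CSV key under this scheme, so keeping part of the folder path or the job ID in the name may be worth considering.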

Hello and thank you for the video! I am having trouble when testing the first Lambda function, async_job_creation. I do not see any output in the CloudWatch logs when I save a PDF to the S3 bucket. I receive the message "log group does not exist." Any suggestions? I assume the Lambda function is not being triggered?

brittanyross

Thank you for the amazing video. I have liked and subscribed to your channel. I had a question about the workflow. Right now, the pipeline runs when a single file is uploaded to S3. Suppose I have some kind of UI where I let the user upload multiple files. Then, for each file uploaded, there will be two Lambda functions and one AWS Textract job running in parallel. How can this be made more efficient for multiple file uploads? Let's say the user uploads 3 files. Is there a way a single Lambda can process those 3 files and send them to AWS Textract, which writes 3 separate outputs to /textract-output, and then another single Lambda function could process those 3 Textract outputs and write them to 3 separate files in the /csv folder with the appropriate file names? Let me know if that makes sense. And again, thanks for your excellent video.

aleenaselegy
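For the multi-file question above, one option is to have the first Lambda loop over every record in the S3 event and start one asynchronous Textract job per PDF. The sketch below assumes this approach and is not taken from the video: `start_jobs` is a hypothetical helper, and the Textract client, SNS topic ARN, and role ARN are passed in by the caller. S3 typically delivers one object per event, so separate uploads usually still produce separate invocations; the loop simply copes with however many records arrive.

```python
from urllib.parse import unquote_plus


def start_jobs(textract_client, event, sns_topic_arn, role_arn):
    """Start one async text-detection job per PDF listed in an S3 event.

    `textract_client` is expected to behave like boto3.client("textract").
    Returns the list of job IDs that were started.
    """
    job_ids = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = unquote_plus(record["s3"]["object"]["key"])  # keys arrive URL-encoded
        if not key.lower().endswith(".pdf"):
            continue  # ignore anything that is not a PDF
        response = textract_client.start_document_text_detection(
            DocumentLocation={"S3Object": {"Bucket": bucket, "Name": key}},
            NotificationChannel={"SNSTopicArn": sns_topic_arn, "RoleArn": role_arn},
        )
        job_ids.append(response["JobId"])
    return job_ids
```

Because the call only starts the job and returns immediately, each invocation stays short even when several files are uploaded at once.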

Hello, can I save the output CSV files on my local machine instead of the bucket?

achrafbhiri
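A Lambda function cannot write to a viewer's own disk, so one hedged answer to the question above is to leave the CSVs in the bucket and pull them down afterwards with a small local script. The sketch below assumes an S3 client such as `boto3.client("s3")` is passed in; the helper name `download_csvs` and the example bucket and prefix are hypothetical.

```python
import os


def download_csvs(s3_client, bucket, prefix, dest_dir):
    """Download every .csv object under `prefix` into `dest_dir`.

    A single list_objects_v2 call returns at most 1000 keys, which is
    assumed to be enough here; larger prefixes would need pagination.
    """
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix)
    for obj in response.get("Contents", []):
        key = obj["Key"]
        if not key.endswith(".csv"):
            continue
        local_path = os.path.join(dest_dir, os.path.basename(key))
        s3_client.download_file(bucket, key, local_path)
        saved.append(local_path)
    return saved


# Example usage (requires boto3 and AWS credentials; names are placeholders):
#   import boto3
#   download_csvs(boto3.client("s3"), "my-textract-bucket", "csv/", "./csv")
```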

Anyone else here getting "JSONDecodeError: Expecting value: line 1 column 1"? I keep getting this and I cannot figure out why :(

music-ish

What modifications would you make to get the API to process form key-value pairs? I'm having trouble understanding that.

curiousl
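For the key-value question above, form extraction uses the analysis calls rather than the text-detection ones: the job is started with `start_document_analysis` and `FeatureTypes=["FORMS"]`, and the results come from `get_document_analysis`. The returned blocks then include `KEY_VALUE_SET` entries linked by `VALUE` and `CHILD` relationships. The sketch below shows one way to pair them up; `extract_key_values` and `_text_of` are hypothetical helpers, not part of the video's code.

```python
def _text_of(block, blocks_by_id):
    """Join the WORD children of a block into a single string."""
    words = []
    for relationship in block.get("Relationships", []):
        if relationship["Type"] != "CHILD":
            continue
        for child_id in relationship["Ids"]:
            child = blocks_by_id[child_id]
            if child["BlockType"] == "WORD":
                words.append(child["Text"])
            elif child["BlockType"] == "SELECTION_ELEMENT":
                # Checkboxes report SELECTED or NOT_SELECTED instead of text.
                words.append(child["SelectionStatus"])
    return " ".join(words)


def extract_key_values(blocks):
    """Return (key_text, value_text) tuples from Textract FORMS blocks."""
    blocks_by_id = {block["Id"]: block for block in blocks}
    pairs = []
    for block in blocks:
        is_key = (block["BlockType"] == "KEY_VALUE_SET"
                  and "KEY" in block.get("EntityTypes", []))
        if not is_key:
            continue
        value_parts = []
        for relationship in block.get("Relationships", []):
            if relationship["Type"] == "VALUE":
                for value_id in relationship["Ids"]:
                    value_parts.append(_text_of(blocks_by_id[value_id], blocks_by_id))
        pairs.append((_text_of(block, blocks_by_id), " ".join(value_parts)))
    return pairs
```

Each tuple can then be written as one CSV row, for example with a `key` and a `value` column.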

I cannot find the example pandas_39.zip file in the repo shown. Is there a way to download it to follow along? Thanks for your video

aleenaselegy

Can you please upload the pandas zip file? It's missing on GitHub.

harbindsingh

I can't find the pandas_39.zip file. Can you help me?

thanhba
