Getting Started With Azure Document AI Document Intelligence API In Python (Source Code In Desc)

Stop paying for expensive PDF-to-text software. To be honest, most of the free tools don't work that well either. Azure Document Intelligence is one of the Azure AI services for building document processing solutions that analyze and extract information from your documents automatically. In this video we are going to learn how to use Azure’s Document AI Document Intelligence API in Python to extract data from tax forms such as W-2 and 1099, as well as invoices, receipts, and even bank statements.

📙 PDF download source

💖 Show Support
☕ Venmo: @Jie-Jenn

00:00 - Intro & agenda
02:11 - Common use cases
02:52 - Free tier & pricing info
04:06 - Free tier limitations
05:41 - Available Document Intelligence models
07:10 - Install Azure Document Intelligence Python packages
08:47 - Create & setup Azure Document Intelligence resource
16:11 - Example 1: Tax Form (W2) data extraction
36:18 - Example 2: Extract data from invoices
45:02 - Example 3: Tables extraction
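
For the tables chapter, here is a minimal sketch of how the cells of one extracted table could be arranged into rows. It is not the code from the video. It assumes each cell exposes `row_index`, `column_index`, and `content`, which mirrors the table cells in the Azure SDK's analysis result; the `table_to_grid` helper and the sample cells are hypothetical stand-ins so the snippet runs without an Azure resource. Cells that span several rows or columns are not handled.

```python
from types import SimpleNamespace
from typing import Iterable, List


def table_to_grid(row_count: int, column_count: int, cells: Iterable) -> List[List[str]]:
    """Place each cell's text at [row_index][column_index] in a 2D list."""
    grid = [["" for _ in range(column_count)] for _ in range(row_count)]
    for cell in cells:
        grid[cell.row_index][cell.column_index] = cell.content
    return grid


# Stand-in cells (assumption): same attribute names as an extracted table cell.
sample_cells = [
    SimpleNamespace(row_index=0, column_index=0, content="Item"),
    SimpleNamespace(row_index=0, column_index=1, content="Amount"),
    SimpleNamespace(row_index=1, column_index=0, content="Widget"),
    SimpleNamespace(row_index=1, column_index=1, content="9.99"),
]

grid = table_to_grid(row_count=2, column_count=2, cells=sample_cells)
for row in grid:
    print(row)
```

With a real result, the same function would presumably be called once per table, passing that table's row count, column count, and cells.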

#azure #python #ai
Comments

What else do you want to see? Let me know in the comments below!

jiejenn

Great video! Thanks for sharing.
Can you please share your GitHub repo with us as well? I see that you are importing utility in the invoice extraction code, but I couldn't find it anywhere. I would really appreciate it.

ethanphan

You just saved me bro, thank you so much for this content

arturgomes

Hi Jie, would it be possible to show how to ingest an unformatted PDF as a raw document (e.g. clinical trials) for analysis, and how to store the results in Azure SQL or Synapse using Data Factory to feed an ML model used in Copilot or another bot? Thank you.

gfranco

Hey, great video! Which approach do you think is better for extracting specific document/image patterns (say, an ID card from a given country) in another language (maybe even handwritten) when a user uploads the file, and for returning the data to them: Tesseract, Google Cloud Vision OCR, Azure, or AWS Textract? How can I make it read the file the user uploaded and return the data so they can copy it, for example? Many thanks!

joaoarthurbandeira

Great tutorial. Could you tell me which color theme you are using in VS Code?

heshamelkouha

What about Azure Computer Vision? I don't know much about Azure, but I thought Azure CV was the tool used to extract information from PDFs or images. Is this Document AI some sort of evolution of it? Again, I'm new, so excuse my ignorance.

UiPath_ESP

Very useful and very good. Thanks.

NSLABTUTORIAIS

Hi, just a question. I have this project for my bachelor thesis. The PDF files are sent from the frontend (Angular) to the backend (C# .NET Framework). Now that I have a list of PDF files in my backend, how can I send them to Document Intelligence? I have already trained my models and I have blob storage, but I can't figure out the next step: how do I send the files to my custom model?

IBAAN

Where did you get the beginning template for the project?

tuikkumies

Hi, I am stuck on the bounding box part. How do I create the bounding box from the given polygons?
Can you tell me how I can convert it into x, y, w, h format?

Thanks

sumanpaudel
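
On the bounding-box question above, a minimal sketch: it assumes the polygon arrives as a flat list `[x1, y1, x2, y2, ...]`, which is how the Document Intelligence response represents it as far as I know (older Form Recognizer SDK versions return point objects instead, so adapt accordingly). The helper name `polygon_to_xywh` is made up for illustration. It returns the axis-aligned box that encloses the polygon, so a rotated region gets a slightly larger box than the text itself.

```python
from typing import Sequence, Tuple


def polygon_to_xywh(polygon: Sequence[float]) -> Tuple[float, float, float, float]:
    """Convert a flat [x1, y1, x2, y2, ...] polygon to (x, y, width, height)."""
    if len(polygon) < 2 or len(polygon) % 2 != 0:
        raise ValueError("polygon must be a flat, even-length list of coordinates")
    xs = polygon[0::2]  # every even index is an x coordinate
    ys = polygon[1::2]  # every odd index is a y coordinate
    x_min, y_min = min(xs), min(ys)
    return x_min, y_min, max(xs) - x_min, max(ys) - y_min


# Example polygon (made-up values): four corners of an upright rectangle.
print(polygon_to_xywh([1.0, 2.0, 4.0, 2.0, 4.0, 3.0, 1.0, 3.0]))
# (1.0, 2.0, 3.0, 1.0)
```

The result is in the same unit as the input polygon (pixels for images, inches for PDFs, according to the service documentation), so scale it before drawing on a rendered page.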

Can you extract from .doc files? Document analysis seems to work only for .docx.

surrendereverything

Can you please show how to use custom models?

yashub

Can you share the code so we can practice?

FF_Bechlor_Life

There is open-source Python OCR available. How is this different?

ohcrapitsmrG

Has anyone been getting the following error?
(404) Resource not found
Code: 404
Message: Resource not found

hello_world