Get Receipt Data with Hugging Face ML Model

This tutorial shows how to use a fine-tuned Hugging Face model to extract data from scanned receipt documents. We run inference by passing a receipt image, along with words and their coordinates, to the model, and get back predictions: class labels assigned to each input token. This makes it possible to classify document elements and extract the correct data. I also share a hint on how to match the input words with the classified labels. The input words and coordinates are expected to come from a separate OCR step; a minimal sketch of the whole flow is included below.
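The sketch below illustrates the flow described above, under stated assumptions: the fine-tuned checkpoint path, the OCR words, and the bounding boxes are placeholders, and the label names in the final comment are hypothetical. It uses the LayoutLMv2 processor with apply_ocr=False (since OCR is done separately) and maps token-level predictions back to words via word_ids(), which is one way to match input words with classified labels. Note that LayoutLMv2 requires detectron2 to be installed for its visual backbone.

```python
# Minimal inference sketch for a LayoutLMv2 token-classification model.
# Assumptions: "path/to/fine-tuned-receipt-model" is a placeholder for
# your fine-tuned checkpoint; words/boxes come from a separate OCR step,
# with boxes normalized to the 0-1000 range LayoutLMv2 expects.
from PIL import Image
import torch
from transformers import LayoutLMv2Processor, LayoutLMv2ForTokenClassification

# apply_ocr=False: we supply our own words and boxes instead of built-in OCR
processor = LayoutLMv2Processor.from_pretrained(
    "microsoft/layoutlmv2-base-uncased", apply_ocr=False
)
model = LayoutLMv2ForTokenClassification.from_pretrained(
    "path/to/fine-tuned-receipt-model"  # placeholder checkpoint
)

image = Image.open("receipt.png").convert("RGB")
words = ["Total", "7.50"]                              # from your OCR engine
boxes = [[100, 500, 180, 520], [200, 500, 260, 520]]   # normalized 0-1000

encoding = processor(image, words, boxes=boxes, return_tensors="pt",
                     truncation=True)
with torch.no_grad():
    outputs = model(**encoding)
predictions = outputs.logits.argmax(-1).squeeze().tolist()

# Match input words with predicted labels: the tokenizer splits words into
# sub-word tokens, so map each token back to its source word with word_ids()
# and keep the prediction for the first token of every word.
word_ids = encoding.word_ids(batch_index=0)
results, seen = {}, set()
for idx, word_id in enumerate(word_ids):
    if word_id is not None and word_id not in seen:
        seen.add(word_id)
        results[words[word_id]] = model.config.id2label[predictions[idx]]
print(results)  # e.g. {"Total": "B-TOTAL", "7.50": "B-TOTAL_VALUE"} (hypothetical labels)
```

The same word_ids() mapping works with many OCR words per page; special tokens ([CLS], [SEP]) get a word_id of None and are skipped automatically.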

Colab:

GitHub:

0:00 Introduction
1:50 Sparrow
2:40 Demo in Colab
3:15 Dependencies
3:45 Dataset
5:00 Data structure
6:50 LayoutLMv2 Processor
7:50 LayoutLMv2 Model
8:50 Inference results
12:25 Getting data
14:30 Summary

CONNECT:
- Subscribe to this YouTube channel

#HuggingFace #PyTorch #Python
Comments

How do I create my own dataset to use for LayoutLMv2 model building?
In which format should I keep the annotation dataset?
Please tell me.

sebabrataghosh

This is interesting.
Can the invoice be in other languages like Portuguese and Spanish?

venusdev

Hello,
Thank you for the great tutorials.
Can you please explain how to train on an invoice with multiple pages?
Many thanks

drissdoukkali

How do I get the list of goods in a structured format?

AriefWijayaisMRAW

Hey, can you please tell how to use the Hugging Face library after performing OCR on an invoice images dataset? I have only the raw text from the OCR.

SarikaLozy-yser