How to Fine-tune LayoutLMv3: Fine-tune LayoutLMv3 with Your Custom Data | Part 3: Fine-tuning

In this tutorial, we will learn how to fine-tune LayoutLMv3 on annotated documents, using PaddleOCR for text extraction. LayoutLMv3 is a multimodal document-understanding model that combines text, layout (bounding boxes), and image features; it does not read the page itself, so the words and their boxes come from an OCR engine. PaddleOCR is an open-source OCR system that supports a variety of languages and document types.

To fine-tune LayoutLMv3 with annotated documents, we will need:
1. PaddleOCR
2. Label Studio
3. Transformers (Hugging Face)

LayoutLMv3, Fine-tune, Annotated Documents, PaddleOCR, Text Recognition, Document Layout Analysis, Computer Vision, Natural Language Processing, Deep Learning
Comments

The best video I've ever seen for LayoutLM. Where is part 4? The audio is not clear at the end of the video. If possible, go over Python inference so that viewers can understand it clearly. Keep rocking 🎉🎉

richierosewall

Thanks Mani for the detailed video. It helps a lot. Please share the solution for inference as well along with Docker.

gibsosmart

Hi Mani, I launched the web server with auth and can access the images, and I uploaded the JSON. But in Label Studio, if I switch the ocr field from 'string' to 'img', it won't show the image (brokenData). Any idea?

truehighs

Thank you for the tutorial, it's quite helpful! When will the inference part be available?

dymefkz

Thank you for the tutorial! The bboxes coming from Label Studio seem to range from 0 to 100, but LayoutLMv3 still requires 0 to 1000. Should I multiply by 10?

williamliu
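
On the bounding-box question above: Label Studio reports rectangles as x, y, width, and height in percent of the image size (0 to 100), while LayoutLMv3 expects [x0, y0, x1, y1] corners on a 0 to 1000 scale. So 10 is the right scale factor, but the width and height also have to be turned into corner coordinates. A minimal sketch, assuming unrotated rectangles; the function name `label_studio_to_layoutlm` is made up for illustration and is not from the video's code:

```python
def label_studio_to_layoutlm(value):
    """Convert a Label Studio rectangle (percent units) to
    [x0, y0, x1, y1] on the 0-1000 scale used by LayoutLMv3."""
    x0 = value["x"] * 10
    y0 = value["y"] * 10
    x1 = (value["x"] + value["width"]) * 10
    y1 = (value["y"] + value["height"]) * 10
    # Round to integers and clamp to the valid range.
    return [min(1000, max(0, int(round(v)))) for v in (x0, y0, x1, y1)]


# Example rectangle in the shape of a Label Studio result "value" (values are made up).
box = label_studio_to_layoutlm({"x": 12.5, "y": 40.0, "width": 25.0, "height": 3.2})
print(box)  # [125, 400, 375, 432]
```

The clamp guards against annotations that extend slightly past the image edge, which would otherwise produce coordinates above 1000.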

When will you publish part 4 (how to use the trained model)? Great job!

darwink

Can we annotate multiple values for a single key? For example, if there are multiple entries in a PDF, or there is a table and I want to extract all the rows under a single table header?

koyelimajumder

Hi, thanks for the video. When I run main, I get a "RuntimeError: grad can be implicitly created only for scalar outputs". Can anyone help me solve this?

Lucifer

What version of transformers do you use? I'm getting this error when I run main.py: ImportError: cannot import name 'PreTokenizedEncodeInput' from 'transformers'

esrplzm

I'm getting the following error while training the model: "ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=True' to have batched tensors with the same length. Perhaps your features (`labels` in this case) have excessive nesting (inputs type `list` where type `int` is expected)."

narendrabhole
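
A possible cause of the error above is that the label lists in a batch have different lengths, so they cannot be stacked into one tensor. With the Hugging Face `LayoutLMv3Processor`, passing the labels as `word_labels` together with `padding="max_length"` and `truncation=True` usually takes care of this. The plain-Python sketch below only illustrates the idea of bringing every label list to the same length; `pad_labels` and `max_length=512` are assumptions, not the video's code. The value -100 is the target index that PyTorch's cross-entropy loss ignores by default.

```python
IGNORE_INDEX = -100  # default ignore_index of PyTorch's cross-entropy loss


def pad_labels(batch_labels, max_length=512):
    """Truncate or pad each label list so all have exactly max_length entries."""
    padded = []
    for labels in batch_labels:
        labels = list(labels)[:max_length]  # truncate overly long sequences
        padding = [IGNORE_INDEX] * (max_length - len(labels))
        padded.append(labels + padding)
    return padded


# Two documents with different numbers of labels (made-up label ids).
batch = pad_labels([[1, 2, 0], [3, 0, 0, 0, 4]], max_length=6)
print(batch)  # [[1, 2, 0, -100, -100, -100], [3, 0, 0, 0, 4, -100]]
```

After this step every row has the same length and contains only integers, which is the shape a tensor conversion needs.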

Thanks for the nice video! :) I have some questions. 1. What was your F1 score? 2. Is it possible to extract the data for a specific key, like the date or invoice number, in JSON format after model training? If so, how?

yeojinkim
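
On the second question: one way to get JSON out of a trained token-classification model is to group the predicted labels word by word. The sketch below assumes the model's predictions have already been mapped to one BIO-style label per word (for example `B-DATE`, `I-DATE`, `O`); the label names and the helper `group_entities` are hypothetical, not taken from the video. Each field maps to a list, so a key that occurs several times in a document keeps all of its values.

```python
import json
from collections import defaultdict


def group_entities(words, labels):
    """Group words by predicted field name using BIO-style labels."""
    fields = defaultdict(list)
    current = None  # field name of the entity being built, if any
    for word, label in zip(words, labels):
        if label == "O":
            current = None
            continue
        prefix, _, name = label.partition("-")
        if prefix == "B" or name != current:
            fields[name].append(word)        # start a new value for this field
            current = name
        else:
            fields[name][-1] += " " + word   # continue the current value
    return dict(fields)


# Made-up words and per-word predictions for illustration.
words = ["Invoice", "No", "INV-001", "Date", "12", "Jan", "2023"]
labels = ["O", "O", "B-INVOICE_NUMBER", "O", "B-DATE", "I-DATE", "I-DATE"]
print(json.dumps(group_entities(words, labels), indent=2))
```

With the sample input, the result holds `"INV-001"` under `INVOICE_NUMBER` and `"12 Jan 2023"` under `DATE`.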

I am getting the following error: "ValueError: Expected input batch_size (2048) to match target batch_size (1024)." Please help me resolve this issue.

koihoij

After running the main.py file, I am getting the error below. How can I resolve this?
ValueError: Expected input batch_size (1536) to match target batch_size (1024).

chandanha

Can we train the model using a GPU?
If yes, how do we edit the code to do so?
I am training it on Colab, but it isn't using the GPU to train.

kztqohc

At 4:10, Training_LayoulLMV3: can we use it for training a LayoutLM model?

mohammedmuzammilkhan

Hey, I am getting the error below when trying to run the "Main.py" file:

"The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
The tokenizer class you load from this checkpoint is 'LayoutLMTokenizer'.
The class this function is called from is 'LayoutLMv3Tokenizer'."

Can you please help?

parthmodi

Can you please share some resources on how to create a dataset for the Donut model?

avi_

I'm trying to execute this on Colab, but I'm getting the following error while executing the Main.py code block: "grad can be implicitly created only for scalar outputs".
How can we adapt this entire script for Google Colab?

narendrabhole

Hello @AIOdysseyhub, thanks for helping me understand the full flow of LayoutLMv3 across all 3 parts. I am waiting for the 4th part, so could you please give me an update?

shreyanshsahu

I'm getting this error when I run main.py : PreTokenizedEncodeInput must be Union[PreTokenizedInputSequence, Tuple[PreTokenizedInputSequence, PreTokenizedInputSequence]]

esrplzm