Document Querying with Qwen2-VL-7B and JSON Output

In this video, I demonstrate how to perform document queries using Qwen2-VL-7B. By simplifying field names, we streamline the prompts, making them more efficient and reusable across different documents. This approach is similar to running SQL queries on a database, but tailored for language models like Qwen2-VL-7B, with results returned in JSON format.
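The idea of querying a document by field name and getting JSON back can be sketched roughly as follows. This is a minimal illustration, not the code from the notebook or from Sparrow: the helper names `build_query` and `parse_json_answer`, the prompt wording, and the field names are all assumptions made for the example, and the model call itself is left out.

```python
import json
import re


def build_query(fields):
    """Build a prompt that asks for the given field names as JSON.

    Hypothetical helper: the prompt wording is an assumption, not the
    one used in the video.
    """
    # An empty-valued template shows the model the exact keys we expect.
    template = {name: "" for name in fields}
    return (
        "Retrieve the following fields from the document. "
        "Return the response as JSON only: " + json.dumps(template)
    )


def parse_json_answer(text):
    """Extract a single JSON object from the decoded model output."""
    # Take everything from the first "{" to the last "}", which drops
    # any surrounding chatter the model may add around the JSON.
    match = re.search(r"\{.*\}", text, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    return json.loads(match.group(0))


if __name__ == "__main__":
    # Short, reusable field names keep the prompt compact.
    prompt = build_query(["invoice_number", "invoice_date", "total"])
    print(prompt)

    # Stand-in for the model's decoded output; the values are made up.
    sample_output = 'Result: {"invoice_number": "X-1", "total": "10.00"}'
    print(parse_json_answer(sample_output))
```

Because the prompt is generated from a plain list of field names, the same query can be reused on another document by changing only the list, which is what makes the approach feel like running a SQL query.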

Colab:

Sparrow GitHub repo:

0:00 Intro
1:15 Sample doc
1:34 Colab notebook
4:38 Inference
6:44 Query 1
7:40 Query 2
8:38 Query 3
11:00 Summary

#qwen2 #vllm #ocr
Comments

Hi, thank you for your amazing video. Do you know how to fine-tune Qwen2 for this case using our own dataset? Thanks!

hadyanpratama

Which OCR do you recommend using along with this model for handwritten data extraction? I used Tesseract, but the results are not promising.

hsnavas

That's impressive accuracy, thanks for showing this. I wonder how it would do if I wanted to add fields that are use-case specific? I'll have to give it a try for sure. Thanks again.

kenchang

Hey, great video! My Colab always runs out of memory, even when I am running on an A100. I also tried your notebook, but it always fails at the same place:
# Inference: Generation of the output
generated_ids = model.generate(**inputs, max_new_tokens=1024)

Do you know of any solution?

cristiantironi

How would this handle a PDF consisting of images and diagrams, e.g. technical documentation?

kareemyoussef

Could you please share the invoice document?

harunulrasheedshaik