Question Answering on Tabular Data with HuggingFace Transformers Pipeline & TAPAS

In this video, I'll show you how to use the table-question-answering pipeline from HuggingFace Transformers to answer questions about a table.
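
Here is a rough sketch of what calling that pipeline can look like. The model checkpoint, the sample table, and the helper names `to_tapas_table` and `ask` are my own assumptions for illustration and are not taken from the video. The pipeline expects every cell to be a string, so the first helper converts the values. The transformers import sits inside `ask` so that the table helper can be tried without downloading a model.

```python
def to_tapas_table(columns):
    """Convert {column name: list of values} into the all-string table TAPAS expects."""
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError("all columns must have the same number of rows")
    return {name: [str(value) for value in values] for name, values in columns.items()}


def ask(table, query, model="google/tapas-base-finetuned-wtq"):
    """Answer one question about the table (downloads the model on first use)."""
    # Lazy import: needs transformers, torch and pandas to be installed.
    from transformers import pipeline

    tqa = pipeline("table-question-answering", model=model)
    # The result is a dict with "answer", "coordinates", "cells" and "aggregator" keys.
    return tqa(table=table, query=query)


# Hypothetical sample data, made up for illustration only.
table = to_tapas_table({
    "Player": ["A", "B", "C"],
    "Runs": [120, 95, 143],
})

# Example call (commented out because it downloads a model):
# print(ask(table, "Which player has the most runs?")["answer"])
```

Building the pipeline inside `ask` keeps the sketch short. If you ask many questions, you would probably want to create the pipeline once and reuse it.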

The TAPAS model was proposed in TAPAS: Weakly Supervised Table Parsing via Pre-training by Jonathan Herzig, Paweł Krzysztof Nowak, Thomas Müller, Francesco Piccinno and Julian Martin Eisenschlos. It’s a BERT-based model specifically designed (and pre-trained) for answering questions about tabular data. Compared to BERT, TAPAS uses relative position embeddings and has 7 token types that encode tabular structure. TAPAS is pre-trained on the masked language modeling (MLM) objective on a large dataset comprising millions of tables from English Wikipedia and corresponding texts. For question answering, TAPAS has 2 heads on top: a cell selection head and an aggregation head, for (optionally) performing aggregations (such as counting or summing) among selected cells. TAPAS has been fine-tuned on several datasets: SQA (Sequential Question Answering by Microsoft), WTQ (Wiki Table Questions by Stanford University) and WikiSQL (by Salesforce). It achieves state-of-the-art on both SQA and WTQ, while having comparable performance to SOTA on WikiSQL, with a much simpler architecture.
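
To make the two heads a bit more concrete: the pipeline reports the predicted aggregation operator and the selected cells (for example "SUM > 10, 20, 12") and does not compute the final number itself. The sketch below applies the operator to the cells in plain Python. The function name `apply_aggregation` is hypothetical, and the label set NONE, SUM, AVERAGE, COUNT is an assumption based on the WTQ fine-tuned checkpoints.

```python
def apply_aggregation(aggregator, cells):
    """Apply a TAPAS-style aggregation label to a list of selected cell strings."""
    if aggregator == "NONE":
        # No aggregation: the selected cells are the answer.
        return list(cells)
    if aggregator == "COUNT":
        return len(cells)
    if aggregator not in ("SUM", "AVERAGE"):
        raise ValueError(f"unknown aggregator: {aggregator!r}")
    # SUM and AVERAGE need numbers; strip thousands separators like "18,426".
    numbers = [float(cell.replace(",", "")) for cell in cells]
    if aggregator == "SUM":
        return sum(numbers)
    if not numbers:
        raise ValueError("cannot average an empty selection")
    return sum(numbers) / len(numbers)


# With the real pipeline you would pass result["aggregator"] and result["cells"].
print(apply_aggregation("SUM", ["10", "20", "12"]))
```

With a single selected cell, AVERAGE simply returns that cell's value, which is why an answer like "AVERAGE > 183" still means 183.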

If you have any questions about what we covered in this video, feel free to ask in the comment section below & I'll do my best to answer them.

If you enjoy these tutorials & would like to support them, the easiest way is to like the video & give it a thumbs up. It's also a huge help to share these videos with anyone who you think would find them useful.

Please consider clicking the SUBSCRIBE button to be notified of future videos & thank you all for watching.

#huggingface #NLP
Comments
Author

I'm working on a natural language parser for database queries as part of a placement project. I decided to use TAPAS from HuggingFace, and what are the odds that on the day I'm about to start working you upload this amazing video that makes my life so much easier. Keep up the great work!

randomdudewithnovidz
Author

Hi Bhavesh, I have a use case where I need to do the reverse of this. Like "Update score 189 for Virat Kohli. Change his team to Australia." Which model should I use here? Instead of score, one can also use runs.
Please help.

vbarai
Author

Is there any limit on the size of the CSV file? Say I have a 3 GB CSV?

SmartAzan
Author

Does it work on any database query, using a chat interface and a generic database? This one is CSV-based, not a full-fledged table.

ditchtech
Author

Is there any model that supports both German and English for table question answering?

chandrantwins
Author

Can we take the query as user input, so that a user can ask whatever specific question they want?

buu
Author

How do I work with tables larger than 512 tokens?

kunalkasodekar
Author

Hi, this is awesome.
How do I fine-tune it to perform other tasks?

al-aminibrahim
Author

Do we always need to pass the table for every query we ask? How does it scale to larger tables? Did you give it a try?

RavikiranBhonagiri
Author

This works for a very small data set of merely 30 rows. If I have a data set with, say, 1000 rows, it gives an Out of Range error.

adityakaran
Author

Thanks for the video. from transformers import pipeline doesn't work when I ... any idea what the issue is?

David-rmwn
Author

Can anyone tell me how to get the total sum of Runs?
This is the output I get when I run the query
"what is the sum of Runs?"
SUM > 18426, 14234, 13704, 13430, 12650, 11867, 11739, 11579, 11363, 10889

mohandas
Author

Brilliant!
Please make a video on hypothesis testing, the chi-squared test, p-values and the t-test.
I would like to hear it from you.

charmilam
Author

Amazing video as usual. My question might sound silly, but can you please let me know why you used 'q' while installing transformers and 'f' while installing torch-scatter?

rohitjagdale
Author

How many rows can it handle at a time?
And what is the time complexity?

krisskad
Author

SQL is far from gone. It will be around for at least the next decade and is the most important skill for becoming a data scientist. More important than NLP or deep learning.

sahil
Author

Why did it show the answer for Virat Kohli's highest score as "AVERAGE > 183" and not "183"?

akshaysarbhukan
Author

Take up AI, otherwise it will take your job.

shivamkumar-qpjm