Tutorial: Text Classification using GPT2 and Pytorch

Text classification is a very common problem when dealing with text data. We've all seen how to use encoder Transformer models like BERT and RoBERTa for text classification, but did you know you can use a decoder Transformer model like GPT-2 for text classification as well?
In this tutorial, I will walk you through how to use GPT-2 from HuggingFace for text classification. We will start with downloading a custom dataset, installing the required components, and selecting a pre-trained model, then train the model. Finally, we will evaluate the results and discuss how to optimize further.
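As a rough orientation, here is a minimal sketch of this kind of setup, assuming the Hugging Face transformers library and its GPT2ForSequenceClassification head; the model name, toy data, and hyperparameters are illustrative, not the exact notebook code from the session.

import torch
from transformers import GPT2Tokenizer, GPT2ForSequenceClassification

device = "cuda" if torch.cuda.is_available() else "cpu"

# GPT-2 has no padding token by default; reuse EOS and pad on the left so the
# last position (the one used for classification) is always a real token.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "left"

model = GPT2ForSequenceClassification.from_pretrained("gpt2", num_labels=2)
model.config.pad_token_id = tokenizer.pad_token_id
model.to(device)

texts = ["a wonderful, moving film", "a dull and pointless movie"]   # toy data
labels = torch.tensor([1, 0])                                        # 1 = positive, 0 = negative
enc = tokenizer(texts, padding=True, truncation=True, max_length=128, return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for epoch in range(3):                       # a real run would loop over a DataLoader
    optimizer.zero_grad()
    out = model(input_ids=enc["input_ids"].to(device),
                attention_mask=enc["attention_mask"].to(device),
                labels=labels.to(device))    # passing labels returns the classification loss
    out.loss.backward()
    optimizer.step()

model.eval()
with torch.no_grad():
    logits = model(input_ids=enc["input_ids"].to(device),
                   attention_mask=enc["attention_mask"].to(device)).logits
print(logits.argmax(dim=-1))                 # predicted class ids

A full run would add a proper train/validation split, DataLoaders, and evaluation metrics on top of this skeleton.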

Speaker: George Mihaila
Resources:
Comments

It is a really clear and good video. I really recommend it!!

cendradevayanaputra

Q&As:
00:37:50 Bone (USA): Can we use this GPT-2 for multi-label classification? Multi-label classification is nothing but having multiple labels for a single data point/record.
00:39:15 Sameer (USA): Are these deep architectures usually the result of extensive trial and error, or do scientists work on a hypothesis when constructing them?
00:40:59 Ernesto(Spain): Hi, I have a question about time series, do you think it makes sense to use it for this kind of data and do you know if this is being used? any information to be documented? thank you very much
00:41:33 Sonam (USA): Does it have any data limitations for the algorithm to work with?
00:42:13 MARIO (USA): Can you demonstrate how GPT-2 works with a real example?
00:46:23 Sameer(USA): Does this simulate a Markov process?
00:48:00 Ernesto(Spain): ok thank you very much. Any documentation to start?
00:50:20 emmanuel (France): Are there Bots using GPT-2 ?
00:50:36 Paula (Spain): how could we use GPT2 for text summarization?
00:51:40 Paula (Spain): any hints, or nice articles I could rely on?
00:52:02 Paula (Spain): im currently working on generating product names from these products' description
00:52:06 Sumedh (India): hey guys..this article is a good starter for NLP model selection for different downstream NLP tasks..
00:54:44 Bright (Canada): Q: Can you compare GPT-2 and ULMFiT in terms of text classification? How long does the text need to be for GPT-2 to have an advantage over ULMFiT? Would GPT-2 require more training data than ULMFiT?
00:54:54 Aldo (Czechia): Q: about performance. What kind of latency should we expect for a movie review? 10 ms, 100 ms, ...? Is it dependent on the text length?
00:56:21 Bright (Canada): Q: Can you compare GPT-2 and ULMFiT in terms of text classification? How long does the text need to be for GPT-2 to have an advantage over ULMFiT? Would GPT-2 require more training data than ULMFiT? Can you use GPT-2 for 2x transfer learning like ULMFiT? Would it be better than 1x transfer learning?
00:56:58 che (USA): Could you provide some details on the layer of the embedding sequences, why is the last embedding only used for classification?
01:00:34 Vincent (Canada): Any idea for the maximum length of the sentence?
01:00:57 rudra (USA): Q: what will be the output(positive/negative) of following text -> nice nice, good good, happy happy, great great, positive positive, awesome. This is really bad movie.
01:02:16 Shamsu (Nigeria): Do we need to have access to GPT-2?
01:02:29 Bruno (France): Could we tune batch size in the NB ?
01:03:55 Sh (USA): sorry, I joined late! can one type briefly what is GPT2!?
01:10:17 Stoney (USA): Is it productive to add a third label of 'neutral'?
01:15:40 Viratkumar (India): will this work on Kaggle?
01:17:00 Cyril (France): ...nice to see a so well commented code...congrats!
01:17:11 Rafa (Spain): Can you do the same you are explaining here with GPT-3 instead of GPT-2?
01:18:16 Bright (Canada): Q: What do you prefer among PyTorch frameworks? Raw PyTorch vs Ignite vs Lightning?
01:23:01 Alfonso (Spain): Is GPT2 available on TensorFlow too?
01:23:26 Bruno (France): yes with huggingface lib
01:26:58 AICamp US: Two questions: 1. How does GPT-2 compare against GPT-3 in terms of efficiency?

2. Does it have any data limitations for the algorithm to work with?
01:27:28 Alvaro (USA): Why padding the left as opposed to the right? In translation I use right padding.
01:28:26 james (Canada): Q: will the last embedding always predict the same word then
01:28:44 Stoney (USA): Have you tried three classes {positive, neutral, negative} along with a confidence score [0.0, 1.0] from a softmax?
01:32:21 che (USA): Could you highlight the important parameters used in gpt-2 model?
01:33:13 Bruno (France): Thanks It runs well on colab !
01:44:17 Youssef (France): What is the tensorboard equivalent in pytorch ?
01:46:13 Bright (Canada): Is training accuracy doing it with dropout enabled?
01:50:30 Michael (UK): BERT is an 'attention' model - is GPT2 similar or does it differ in principle - thanks
01:52:42 Sumedh (India): @bright wang...u can try out weights and biases for pytorch
01:52:50 Sameer (USA): Can you comment on the difference between GPT-2 and GPT-3?
01:52:52 Stoney (USA): Do you also output a confidence score using a softmax for each class? Have you tried outputting a third class of 'neutral', or only positive and negative?
01:53:32 Sumedh (India): Thanks George for this session!
01:53:42 Stoney (USA): Thank you George
01:53:42 Ernesto (Spain): Thank you very much, I enjoyed so much this session :D
01:53:44 George (USA): Thank you for this detailed and thorough session George
01:53:49 Shamsu (Nigeria): Thanks Great session
01:53:52 Paula (Spain): Thank you very much! It really helped!
01:53:54 Musa (USA): This is a great code walk-through. Thanks @George
01:54:03 Enrique (Spain): Thanks
01:54:08 Jonathan (Germany): great presentation. thank you very much
01:54:09 Mathew (USA): thank you for the walk-through!
01:54:09 Aditya (India): thank you George, great session.
01:54:10 Eduardo (NA) (Spain): Thank you very much, nice tutorial!!!
01:54:15 AB: Thanks George, Great event!
01:54:18 Claudiu (Romania): Thank you!
01:54:19 Marina PdeM (USA): Excellent tutorial!
01:54:26 Bruno (): Thanks George
01:54:31 Javier (Spain): A very interesting presentation. Could you comment on any other transformer?
01:54:40 Ted (USA): Q: Often times training metrics will be worse than validation metrics if dropout or lossy augmentations are used. I missed a portion of your presentation, so I wasn't sure if that was the case. Do you think that might be what happened to your model?
01:54:51 Claudiu (Romania): Will we get slides and/or recording?
01:55:00 Haneet (Canada): Thank you!
01:55:16 jm (Spain): thanks a lot!!
01:56:50 enzo (France): Thank you very much for this presentation!
01:58:58 AICamp US: Thank you all for the feedback. Feel free to post more questions, or speak up to ask questions/discuss (you can unmute yourself to speak).
01:59:26 Alfonso (Mexico): Excellent presentation. Thank you so much
02:01:35 Marina (USA): Anyone got an error training the model? ImportError: cannot import name '_png' from 'matplotlib'
02:02:34 Alvaro (USA): Thanks so much for an excellent tutorial.
02:02:48 Mihai (USA): @Marina, it works fine for me
02:02:57 Diana (Peru): Great presentation!. Thanks
02:02:58 che (USA): To improve classification performance, can I add deep learning networks upon gpt-2?
02:03:54 che (USA): It is a great presentation. Thanks a lot for sharing the insights and answering my questions.
02:03:55 Paula (Spain): How could we tune the pretrained model for other purposes? Like what parameters or layers? And anything other than classification?
02:04:54 Paula (Spain): thank you!
02:05:19 Jose (UK): Can you use these transformers for images? Do you have any reference?
02:05:21 Martin (UK): GTG great presentation - lots of stuff for me to think about!
02:06:27 Ted (USA): Really great presentation! Thank you so much.
02:07:07 Youssef (France): Thank you for the presentation. Have a nice day !
02:07:12 Irune Sanchez (Switzerland): Thank you George!
02:07:17 Jo (USA): Thank you
02:07:19 hamid (USA): Very nice George
02:07:24 Bright (Canada): Thanks!!!
02:07:24 Chris (Germany): Thank you George!!
02:07:40 Mihai (USA): When will I receive the slides and video presentation?
02:07:46 Kosh: Thanks George for a wonderful presentation.
02:07:48 che (USA): Thank you
02:08:18 Mihai (USA): @George! Thank you!

AICamp
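Several questions above (around 00:56:58, 01:27:28, and 01:28:26) ask why only the last embedding is used for classification and why the input is padded on the left. Here is a minimal sketch of the idea, assuming GPT2Model from the Hugging Face transformers library; the example texts are illustrative. Because GPT-2 is causal, only the final position has attended to the entire input, so its hidden state serves as the sequence representation, and left padding keeps that position a real token rather than a pad.

import torch
from transformers import GPT2Tokenizer, GPT2Model

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "left"              # pads go on the left...
model = GPT2Model.from_pretrained("gpt2")

enc = tokenizer(["short text", "a somewhat longer movie review"],
                padding=True, return_tensors="pt")
with torch.no_grad():
    hidden = model(**enc).last_hidden_state  # shape: (batch, seq_len, 768)

sequence_repr = hidden[:, -1, :]             # ...so position -1 is always a real token
# A linear layer on top of sequence_repr produces the classification logits.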

How can I manually test my own data after this code?

Engnr
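A hedged sketch of one way to test your own data after training, assuming the fine-tuned model and tokenizer were saved with save_pretrained; the ./gpt2-classifier path and the label names are hypothetical, not from the notebook.

import torch
from transformers import GPT2Tokenizer, GPT2ForSequenceClassification

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = GPT2Tokenizer.from_pretrained("./gpt2-classifier")       # hypothetical save directory
model = GPT2ForSequenceClassification.from_pretrained("./gpt2-classifier").to(device)
model.eval()

id2label = {0: "negative", 1: "positive"}    # assumed label mapping

texts = ["I loved every minute of it.", "Two hours I will never get back."]
enc = tokenizer(texts, padding=True, truncation=True, max_length=128,
                return_tensors="pt").to(device)
with torch.no_grad():
    preds = model(**enc).logits.argmax(dim=-1).tolist()
print([id2label[p] for p in preds])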

Shouldn't we add bos and eos tokens on the dataset samples?

georgekokkinakis
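One hedged way to do what the question above suggests, assuming the GPT-2 tokenizer from Hugging Face transformers (where the BOS and EOS tokens are both <|endoftext|>); whether this helps depends on the setup, and it is not necessarily what the tutorial notebook does.

from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "left"

texts = ["a wonderful, moving film", "a dull and pointless movie"]
# Wrap each sample so every sequence starts and ends on the same special token.
texts_with_eos = [tokenizer.bos_token + t + tokenizer.eos_token for t in texts]

enc = tokenizer(texts_with_eos, padding=True, truncation=True, max_length=128,
                return_tensors="pt")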

The slides are not available. I wanted to learn from them. Please make them available. Thanks

namratashivagunde