Pytorch Seq2Seq Tutorial for Machine Translation

In this tutorial we build a Sequence to Sequence (Seq2Seq) model from scratch and apply it to machine translation on a dataset with German to English sentences, specifically the Multi30k dataset. There were a lot of things to go through and explain, so the video is a bit longer than my normal videos, but I really felt I wanted to share my thoughts, explanations and the details of the implementation!

Resources I used and read to learn about Seq2Seq:

Comment on resources:
I think bentrevett on GitHub is awesome, and this video was heavily inspired by his Seq2Seq tutorials. I really recommend checking him out; he puts out a lot of great tutorials on his GitHub.

❤️ Support the channel ❤️

Paid Courses I recommend for learning (affiliate links, no extra cost for you):

✨ Free Resources that are great:

💻 My Deep Learning Setup and Recording Setup:

GitHub Repository:

✅ One-Time Donations:

▶️ You Can Connect with me on:

OUTLINE:
0:00 - Introduction
1:27 - Imports
2:05 - Data processing using Torchtext
5:55 - Implementation of Encoder
11:02 - Implementation of Decoder
19:43 - Putting it together into Seq2Seq
27:57 - Setting up training of the network
41:03 - Fixing Errors
42:18 - Evaluation of the model
49:32 - Ending and BLEU score result
Comments

Was struggling to make an implementation of this. So so so happy I found your tutorial. Thanks a lot for making this. Keep up the great work!

kaushilkundalia

This video has been very helpful for me in implementing a seq2seq model for a (slightly different) time series forecasting task!! Thanks so much!!

SAnalyticsModelling

This is great stuff, explained like a pro. Could you please create videos along similar lines with slight modifications, like: 1. How to use a custom dataset 2. How to use a basic RNN and/or GRU (I tried but ran into multiple issues). These branch-offs would be very helpful for overall understanding of how to modify the code to address custom problems. Thanks in advance :)

amitgk

Will the first output of the model be the <SOS> token or not? In the intro you showed that there is no <SOS> in the output sequence, but @39:52 on line 174 you do output[1:] with the intention of skipping the <SOS> token, which is contradictory. Shouldn't the loss compare the entire output sequence, i.e. output[:], with target[1:]?

MenTaLLyMenTaL
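
A note for readers with the same question: as far as I can tell there is no contradiction. In the training loop the outputs tensor is pre-allocated with zeros and the decoding loop only fills time steps 1 onwards, so output[0] is never a prediction; it is a placeholder sitting in the same row as the <sos> token of the target. Slicing both with [1:] keeps predictions and labels aligned. A minimal sketch of the loss step (the shapes are stand-ins, not the exact numbers from the video):

    import torch
    import torch.nn as nn

    trg_len, N, vocab_size = 10, 32, 5000
    output = torch.randn(trg_len, N, vocab_size)         # decoder scores per time step
    target = torch.randint(0, vocab_size, (trg_len, N))  # token indices; row 0 is <sos>

    criterion = nn.CrossEntropyLoss()
    # Drop time step 0 from both tensors (target[0] is the <sos> the decoder is fed,
    # never asked to predict), then flatten so CrossEntropyLoss compares
    # (num_tokens, vocab_size) scores against (num_tokens,) labels.
    loss = criterion(output[1:].reshape(-1, vocab_size), target[1:].reshape(-1))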

When you call the build_vocab method for German and English, how can PyTorch know which language you are building the vocab for? You just pass train_data both times. Can someone explain? Thank you

MuhammadFhadli
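
For the vocabulary question above: torchtext can tell the languages apart because each Field is bound to one side of the dataset when Multi30k.splits is called, and build_vocab only counts tokens from the attribute tied to that Field. A sketch of the idea (on torchtext 0.9-0.11 the old API lives under torchtext.legacy; on earlier versions drop the .legacy part):

    from torchtext.legacy.data import Field
    from torchtext.legacy.datasets import Multi30k

    german = Field(tokenize="spacy", tokenizer_language="de_core_news_sm",
                   lower=True, init_token="<sos>", eos_token="<eos>")
    english = Field(tokenize="spacy", tokenizer_language="en_core_web_sm",
                    lower=True, init_token="<sos>", eos_token="<eos>")

    # fields=(german, english) binds german to .src and english to .trg.
    train_data, valid_data, test_data = Multi30k.splits(
        exts=(".de", ".en"), fields=(german, english))

    # Both calls receive all of train_data, but each Field only reads the
    # attribute it was registered under, so the two vocabs stay separate.
    german.build_vocab(train_data, max_size=10000, min_freq=2)
    english.build_vocab(train_data, max_size=10000, min_freq=2)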

Loved your tutorial. I have a question though: when implementing the encoder, you said the shape of x is (seq_len, N). Shouldn't it be (input_size, seq_len, N), where input_size is the vocabulary size? Because we one-hot encode each word in the first place.

dogkansarac
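
On the shape question above: x holds integer token indices, not one-hot vectors, so (seq_len, N) is correct. The one-hot step is implicit in nn.Embedding, which looks up one row of its weight matrix per index; that is mathematically the same as multiplying a one-hot vector by the weight matrix, without ever materializing the vocabulary axis. A small self-contained check:

    import torch
    import torch.nn as nn

    vocab_size, embed_dim, seq_len, N = 50, 8, 7, 2
    x = torch.randint(0, vocab_size, (seq_len, N))   # indices, shape (seq_len, N)

    embedding = nn.Embedding(vocab_size, embed_dim)
    emb = embedding(x)                               # (seq_len, N, embed_dim)

    # Equivalent one-hot formulation, which would need the extra vocab axis:
    one_hot = nn.functional.one_hot(x, vocab_size).float()
    assert torch.allclose(emb, one_hot @ embedding.weight, atol=1e-6)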

Hi, I am from Germany and hell yeah, I love this video!

alexkonopatski

These tutorials are very helpful. Keep up the good work mate.

vansadiakartik

When I run the code I get this error:

Traceback (most recent call last):

train_data, valid_data, test_data = Multi30k.splits(
AttributeError: 'function' object has no attribute 'splits'

Process finished with exit code 1

I have searched for a solution on the internet, but nothing works. Could you please take a look at my error? I really appreciate your time.

duongocyen
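
That AttributeError is most likely a torchtext version mismatch (my assumption from the error message, not something covered in the video): from torchtext 0.9 on, torchtext.datasets.Multi30k is a plain function with no .splits method, and the old class-based API moved to a legacy namespace. Two fixes that are known to work:

    # Option 1: on torchtext 0.9-0.11, import the old API from the legacy namespace.
    from torchtext.legacy.data import Field, BucketIterator
    from torchtext.legacy.datasets import Multi30k

    # Option 2: pin a release where the imports used in the video still work:
    #   pip install torchtext==0.6.0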

I got the following errors while implementing:

1. ImportError: cannot import name translate_sentence, save_checkpoint, load_checkpoint from 'utils'
2. AttributeError: module 'torchtext.nn' has no attribute 'Module'

Has anyone else been running into these errors? Can anyone suggest how to resolve them?

AshokSharma-ecso
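
Both errors above look like environment issues rather than model issues. translate_sentence, save_checkpoint and load_checkpoint come from the utils.py file in the video's GitHub repository, so that file has to sit next to your script; and nn.Module lives in torch, not torchtext (import torch.nn as nn). If you only need the checkpoint helpers, minimal stand-ins (assuming the checkpoint dict layout shown in the comments) could look like:

    import torch

    def save_checkpoint(state, filename="my_checkpoint.pth.tar"):
        # state is a plain dict, e.g. {"state_dict": model.state_dict(),
        #                              "optimizer": optimizer.state_dict()}
        torch.save(state, filename)

    def load_checkpoint(checkpoint, model, optimizer):
        model.load_state_dict(checkpoint["state_dict"])
        optimizer.load_state_dict(checkpoint["optimizer"])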

Awesome video man, thanks for explaining everything well. Just a quick question: you made the forward function of the Seq2Seq model use target values. For training that's fine, but while predicting we won't have them, right? I understand we can basically use a while loop and stop when x == <eos>, but I'm curious how you implemented that: did you write the model again for testing, or do something like "if model.eval(), do this"? I was also wondering if there is a way to write the code so that the target doesn't need to be passed to the decoder's forward function. If possible, please make a video on the testing part of the model.
Once again, great video man, one of the best explanations I have seen. It helps me understand not only the concept but also how to implement things. You are doing great work.

yuvrajkhanna
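
On the inference question above: one common approach is a greedy loop that feeds <sos> first and then feeds each prediction back in until <eos> or a length cap, so no target is ever needed. This is only a sketch under assumptions; the encoder/decoder call signatures and the vocab object below mirror the tutorial's general structure but are not the video's exact translate_sentence:

    import torch

    def greedy_translate(model, src, trg_vocab, device, max_len=50):
        # src: (src_len, 1) tensor of source token indices for one sentence
        model.eval()
        tokens = []
        with torch.no_grad():
            hidden, cell = model.encoder(src)
            x = torch.tensor([trg_vocab.stoi["<sos>"]], device=device)
            for _ in range(max_len):
                output, hidden, cell = model.decoder(x, hidden, cell)
                best = output.argmax(1)          # greedy: most likely next token
                if best.item() == trg_vocab.stoi["<eos>"]:
                    break
                tokens.append(trg_vocab.itos[best.item()])
                x = best                         # feed the prediction back in
        return tokens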

ty, this helped me in my hw assignment implementing Seq2Seq!

willieman

Thanks, your tutorial helped me a lot! But I got
ImportError: cannot import name 'translate_sentence' from 'utils'
Do you have any idea how I can solve this problem?

youmeifan

I can't understand the intuition behind making batch_size index 1 of the shape,
(sequence_len, batch_size, word_size).
The PyTorch docs say the LSTM uses this shape unless batch_first=True is set,

but it seems confusing to me.
(batch_size, sequence_len, word_size) seems more intuitive.

Can anyone explain the first shape (when batch_first=False) to me?

ankanbasu
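
As far as I know, the time-major default is mostly a performance and history convention (it matches the layout cuDNN's RNN kernels expect): with (seq_len, batch, features), the slice for one time step, out[t], is a contiguous (batch, features) block, which is exactly what the recurrence consumes step by step. If the batch-first layout reads better to you, batch_first=True gives the same model:

    import torch
    import torch.nn as nn

    seq_len, batch, feat, hidden = 5, 3, 10, 20
    x = torch.randn(seq_len, batch, feat)           # time-major (the default layout)

    lstm = nn.LSTM(feat, hidden)                    # batch_first=False
    out, _ = lstm(x)                                # out: (seq_len, batch, hidden)
    step_t = out[2]                                 # one time step: (batch, hidden)

    lstm_bf = nn.LSTM(feat, hidden, batch_first=True)
    out_bf, _ = lstm_bf(x.transpose(0, 1))          # in/out: (batch, seq_len, hidden)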

Thanks for this tutorial. I'm working on a similar project combining a GNN with an encoder-decoder architecture, and this video helps me a lot.

qiguosun

Hi, I am not able to load the German tokenizer:
OSError: [E050] Can't find model 'de'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory.

parthchokhra
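
Recent spaCy versions removed the 'de' shortcut (my reading of that E050 message), so the German pipeline has to be downloaded and loaded under its full name:

    # In a shell first:
    #   python -m spacy download de_core_news_sm
    import spacy

    spacy_ger = spacy.load("de_core_news_sm")   # instead of spacy.load("de")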

I am still stuck on the shapes of the target and output tensors.
Don't both have the same shape, so that we don't need to reshape?
If target has shape (N, T, voc_size), the output should have the same shape. Correct me if I'm wrong.

riyajatar
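
The premise in the comment above is slightly off: the target coming out of the iterator is a tensor of token indices with shape (trg_len, N) and no vocabulary axis, while the model's output has shape (trg_len, N, vocab_size). On top of that, nn.CrossEntropyLoss wants 2D (num_tokens, vocab_size) scores against 1D (num_tokens,) labels, which is why both get reshaped. A quick shape check (the sizes are made up):

    import torch
    import torch.nn as nn

    trg_len, N, vocab_size = 8, 4, 100
    output = torch.randn(trg_len, N, vocab_size)          # scores: one row per token
    target = torch.randint(0, vocab_size, (trg_len, N))   # indices: no vocab axis

    criterion = nn.CrossEntropyLoss()
    loss = criterion(output.reshape(-1, vocab_size),      # (trg_len*N, vocab_size)
                     target.reshape(-1))                  # (trg_len*N,)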

I am getting an error here:

from utils import translate_sentence, bleu, save_checkpoint, load_checkpoint

ModuleNotFoundError: No module named 'utils'

sanju

Thanks for sharing, I found the answer in your video on how to get a translation result for a single sentence.

dockertutorial

It is amazing!! I'm learning NLP and AI and your videos just perfectly solve my problems.

tianyiwang