Pytorch Seq2Seq Tutorial for Machine Translation

In this tutorial we build a Sequence to Sequence (Seq2Seq) model from scratch and apply it to machine translation on a dataset with German to English sentences, specifically the Multi30k dataset. There were a lot of things to go through and explain, so the video is a bit longer than my normal videos, but I really felt I wanted to share my thoughts, explanations and the details of the implementation!

Resources I used and read to learn about Seq2Seq:

Comment on resources:
I think bentrevett on GitHub is awesome, and this video was heavily inspired by his Seq2Seq tutorials. I really recommend checking him out; he puts out a lot of great tutorials on his GitHub.

❤️ Support the channel ❤️

Paid Courses I recommend for learning (affiliate links, no extra cost for you):

✨ Free Resources that are great:

💻 My Deep Learning Setup and Recording Setup:

GitHub Repository:

✅ One-Time Donations:

▶️ You Can Connect with me on:

OUTLINE:
0:00 - Introduction
1:27 - Imports
2:05 - Data processing using Torchtext
5:55 - Implementation of Encoder
11:02 - Implementation of Decoder
19:43 - Putting it together into Seq2Seq
27:57 - Setting up training of the network
41:03 - Fixing Errors
42:18 - Evaluation of the model
49:32 - Ending and BLEU score result
Comments

Was struggling to make an implementation of this. So so so happy I found your tutorial. Thanks a lot for making this. Keep up the great work!

kaushilkundalia

This video has been very helpful for me in implementing a seq2seq model for a (slightly different) time series forecasting task!! Thanks so much!!

SAnalyticsModelling

This is great stuff, explained like a pro. Could you please create videos along similar lines with slight modifications, like: 1. How to use a custom dataset 2. How to use a basic RNN and/or GRU (I tried but ran into multiple issues). These branch-offs would be very helpful for overall understanding of how to modify the code to address custom problems. Thanks in advance :)

amitgk

Will the first output of the model be the <SOS> token or not? In the intro you showed that there is no <SOS> in the output sequence, but @39:52 on line 174 you do output[1:] with the intention of skipping the <SOS> token, which is contradictory. Shouldn't the loss compare the entire output sequence, i.e. output[:], with target[1:]?

MenTaLLyMenTaL
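
A note for readers with the same question: as far as I can tell there is no contradiction. In the training loop the outputs tensor is pre-allocated with zeros and the decoding loop only fills time steps 1 onwards, so output[0] is never a prediction; it is a placeholder sitting in the same row as the <sos> token of the target. Slicing both with [1:] keeps predictions and labels aligned. A minimal sketch of the loss step (the shapes are stand-ins, not the exact numbers from the video):

    import torch
    import torch.nn as nn

    trg_len, N, vocab_size = 10, 32, 5000
    output = torch.randn(trg_len, N, vocab_size)         # decoder scores per time step
    target = torch.randint(0, vocab_size, (trg_len, N))  # token indices; row 0 is <sos>

    criterion = nn.CrossEntropyLoss()
    # Drop time step 0 from both tensors (target[0] is the <sos> the decoder is fed,
    # never asked to predict), then flatten so CrossEntropyLoss compares
    # (num_tokens, vocab_size) scores against (num_tokens,) labels.
    loss = criterion(output[1:].reshape(-1, vocab_size), target[1:].reshape(-1))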

When you call the build_vocab method for German and English, how can PyTorch know which language you are building the vocab for? You just pass train_data both times. Can someone explain? Thank you

MuhammadFhadli
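
For the vocabulary question above: torchtext can tell the languages apart because each Field is bound to one side of the dataset when Multi30k.splits is called, and build_vocab only counts tokens from the attribute tied to that Field. A sketch of the idea (on torchtext 0.9-0.11 the old API lives under torchtext.legacy; on earlier versions drop the .legacy part):

    from torchtext.legacy.data import Field
    from torchtext.legacy.datasets import Multi30k

    german = Field(tokenize="spacy", tokenizer_language="de_core_news_sm",
                   lower=True, init_token="<sos>", eos_token="<eos>")
    english = Field(tokenize="spacy", tokenizer_language="en_core_web_sm",
                    lower=True, init_token="<sos>", eos_token="<eos>")

    # fields=(german, english) binds german to .src and english to .trg.
    train_data, valid_data, test_data = Multi30k.splits(
        exts=(".de", ".en"), fields=(german, english))

    # Both calls receive all of train_data, but each Field only reads the
    # attribute it was registered under, so the two vocabs stay separate.
    german.build_vocab(train_data, max_size=10000, min_freq=2)
    english.build_vocab(train_data, max_size=10000, min_freq=2)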

Loved your tutorial. I have a question though: when implementing the encoder, you said the shape of x is (seq_len, N). Shouldn't it be (input_size, seq_len, N), where input_size is the vocabulary size? Because we one-hot encode each word in the first place.

dogkansarac
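
On the shape question above: x holds integer token indices, not one-hot vectors, so (seq_len, N) is correct. The one-hot step is implicit in nn.Embedding, which looks up one row of its weight matrix per index; that is mathematically the same as multiplying a one-hot vector by the weight matrix, without ever materializing the vocabulary axis. A small self-contained check:

    import torch
    import torch.nn as nn

    vocab_size, embed_dim, seq_len, N = 50, 8, 7, 2
    x = torch.randint(0, vocab_size, (seq_len, N))   # indices, shape (seq_len, N)

    embedding = nn.Embedding(vocab_size, embed_dim)
    emb = embedding(x)                               # (seq_len, N, embed_dim)

    # Equivalent one-hot formulation, which would need the extra vocab axis:
    one_hot = nn.functional.one_hot(x, vocab_size).float()
    assert torch.allclose(emb, one_hot @ embedding.weight, atol=1e-6)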

Hi, I am from Germany and hell yeah, I love this video!

alexkonopatski

These tutorials are very helpful. Keep up the good work mate.

vansadiakartik

When I run the code I get this error:

Traceback (most recent call last):

train_data, valid_data, test_data = Multi30k.splits(
AttributeError: 'function' object has no attribute 'splits'

Process finished with exit code 1

I have searched for a solution on the internet, but nothing works. Could you please take a look at my error? I really appreciate your time.

duongocyen
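
That AttributeError is most likely a torchtext version mismatch (my assumption from the error message, not something covered in the video): from torchtext 0.9 on, torchtext.datasets.Multi30k is a plain function with no .splits method, and the old class-based API moved to a legacy namespace. Two fixes that are known to work:

    # Option 1: on torchtext 0.9-0.11, import the old API from the legacy namespace.
    from torchtext.legacy.data import Field, BucketIterator
    from torchtext.legacy.datasets import Multi30k

    # Option 2: pin a release where the imports used in the video still work:
    #   pip install torchtext==0.6.0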

I got the following errors while implementing:

1. ImportError: cannot import name translate_sentence, save_checkpoint, load_checkpoint from 'utils'
2. AttributeError: module 'torchtext.nn' has no attribute 'Module'

Has anyone else been running into these errors? Can anyone suggest how to resolve them?

AshokSharma-ecso
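
Both errors above look like environment issues rather than model issues. translate_sentence, save_checkpoint and load_checkpoint come from the utils.py file in the video's GitHub repository, so that file has to sit next to your script; and nn.Module lives in torch, not torchtext (import torch.nn as nn). If you only need the checkpoint helpers, minimal stand-ins (assuming the checkpoint dict layout shown in the comments) could look like:

    import torch

    def save_checkpoint(state, filename="my_checkpoint.pth.tar"):
        # state is a plain dict, e.g. {"state_dict": model.state_dict(),
        #                              "optimizer": optimizer.state_dict()}
        torch.save(state, filename)

    def load_checkpoint(checkpoint, model, optimizer):
        model.load_state_dict(checkpoint["state_dict"])
        optimizer.load_state_dict(checkpoint["optimizer"])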

Awesome video man, thanks for explaining everything well. Just a quick question: you made the forward function of the Seq2Seq model use target values. For training that's fine, but while predicting we won't have them, right? I understand we can basically use a while loop and stop when x == <eos>, but I'm curious how you implemented that: did you write the model again for testing, or do something like "if model.eval(), do this"? I was also wondering if there is a way to write the code so that the target doesn't need to be passed to the decoder's forward function. If possible, please make a video on the testing part of the model.
Once again, great video man, one of the best explanations I have seen. It helps me understand not only the concept but also how to implement things. You are doing great work.

yuvrajkhanna
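
On the inference question above: one common approach is a greedy loop that feeds <sos> first and then feeds each prediction back in until <eos> or a length cap, so no target is ever needed. This is only a sketch under assumptions; the encoder/decoder call signatures and the vocab object below mirror the tutorial's general structure but are not the video's exact translate_sentence:

    import torch

    def greedy_translate(model, src, trg_vocab, device, max_len=50):
        # src: (src_len, 1) tensor of source token indices for one sentence
        model.eval()
        tokens = []
        with torch.no_grad():
            hidden, cell = model.encoder(src)
            x = torch.tensor([trg_vocab.stoi["<sos>"]], device=device)
            for _ in range(max_len):
                output, hidden, cell = model.decoder(x, hidden, cell)
                best = output.argmax(1)          # greedy: most likely next token
                if best.item() == trg_vocab.stoi["<eos>"]:
                    break
                tokens.append(trg_vocab.itos[best.item()])
                x = best                         # feed the prediction back in
        return tokens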

ty, this helped me in my hw assignment implementing Seq2Seq!

willieman

Thanks, your tutorial helped me a lot! But I got
ImportError: cannot import name 'translate_sentence' from 'utils'
Do you have any idea how I can solve this problem?

youmeifan

I can't understand the intuition behind making batch_size index 1 of the shape,
(sequence_len, batch_size, word_size).
The PyTorch docs say the LSTM uses this shape unless batch_first=True is set,

but it seems confusing to me.
(batch_size, sequence_len, word_size) seems more intuitive.

Can anyone explain the first shape (when batch_first=False) to me?

ankanbasu
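
As far as I know, the time-major default is mostly a performance and history convention (it matches the layout cuDNN's RNN kernels expect): with (seq_len, batch, features), the slice for one time step, out[t], is a contiguous (batch, features) block, which is exactly what the recurrence consumes step by step. If the batch-first layout reads better to you, batch_first=True gives the same model:

    import torch
    import torch.nn as nn

    seq_len, batch, feat, hidden = 5, 3, 10, 20
    x = torch.randn(seq_len, batch, feat)           # time-major (the default layout)

    lstm = nn.LSTM(feat, hidden)                    # batch_first=False
    out, _ = lstm(x)                                # out: (seq_len, batch, hidden)
    step_t = out[2]                                 # one time step: (batch, hidden)

    lstm_bf = nn.LSTM(feat, hidden, batch_first=True)
    out_bf, _ = lstm_bf(x.transpose(0, 1))          # in/out: (batch, seq_len, hidden)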

Thanks for this tutorial. I'm working on a similar project combining a GNN with an encoder-decoder architecture, and this video helps me a lot.

qiguosun

Hi, I am not able to load the German tokenizer:
OSError: [E050] Can't find model 'de'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory.

parthchokhra
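
Recent spaCy versions removed the 'de' shortcut (my reading of that E050 message), so the German pipeline has to be downloaded and loaded under its full name:

    # In a shell first:
    #   python -m spacy download de_core_news_sm
    import spacy

    spacy_ger = spacy.load("de_core_news_sm")   # instead of spacy.load("de")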

I am still stuck on the shapes of the target and output tensors.
Don't both have the same shape, so that we don't need to reshape?
If target has shape (N, T, voc_size), the output should have the same shape. Correct me if I'm wrong.

riyajatar
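
The premise in the comment above is slightly off: the target coming out of the iterator is a tensor of token indices with shape (trg_len, N) and no vocabulary axis, while the model's output has shape (trg_len, N, vocab_size). On top of that, nn.CrossEntropyLoss wants 2D (num_tokens, vocab_size) scores against 1D (num_tokens,) labels, which is why both get reshaped. A quick shape check (the sizes are made up):

    import torch
    import torch.nn as nn

    trg_len, N, vocab_size = 8, 4, 100
    output = torch.randn(trg_len, N, vocab_size)          # scores: one row per token
    target = torch.randint(0, vocab_size, (trg_len, N))   # indices: no vocab axis

    criterion = nn.CrossEntropyLoss()
    loss = criterion(output.reshape(-1, vocab_size),      # (trg_len*N, vocab_size)
                     target.reshape(-1))                  # (trg_len*N,)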

I am getting an error here:

from utils import translate_sentence, bleu, save_checkpoint, load_checkpoint

ModuleNotFoundError: No module named 'utils'

sanju

Thanks for sharing, I found the answer in your video on how to get a translation result for a single sentence.

dockertutorial

It is amazing!! I'm learning NLP and AI and your videos just perfectly solve my problems.

tianyiwang