PyTorch RNN example (Recurrent Neural Network)

In this video we go through how to code a simple RNN, GRU, and LSTM example. The focus is on the architecture itself rather than the data, and we use the simple MNIST dataset for this example.
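
As a rough illustration of what the video covers, here is a minimal sketch of such a model (the class name, hyperparameters, and variable names below are my own assumptions, not taken verbatim from the video): each 28x28 MNIST image is read as a 28-step sequence of 28-feature rows, and the outputs of all time steps are flattened into a final linear classifier.

```python
import torch
import torch.nn as nn

class SimpleRNN(nn.Module):
    """RNN classifier that reads each MNIST image row by row."""
    def __init__(self, input_size=28, hidden_size=256, num_layers=2,
                 sequence_length=28, num_classes=10):
        super().__init__()
        self.num_layers = num_layers
        self.hidden_size = hidden_size
        self.rnn = nn.RNN(input_size, hidden_size, num_layers, batch_first=True)
        # Every time step is fed to the classifier, hence hidden_size * sequence_length
        self.fc = nn.Linear(hidden_size * sequence_length, num_classes)

    def forward(self, x):
        # x: (batch, sequence_length, input_size)
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size)  # initial hidden state
        out, _ = self.rnn(x, h0)            # (batch, sequence_length, hidden_size)
        out = out.reshape(out.size(0), -1)  # flatten all time steps
        return self.fc(out)

model = SimpleRNN()
fake_batch = torch.randn(64, 28, 28)        # 64 MNIST-like images
print(model(fake_batch).shape)              # torch.Size([64, 10])
```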

Comments

I have been struggling with my master's degree. Your tutorials really help me a lot. What distinguishes your tutorials from others is that they are very practical and hands-on. I have learned the basic theory of deep learning, but implementing it is the key! Thanks for your hard work. God bless you!

ChizkiyahuOhayon

Really enjoy how you leave the theory for other videos and get right to the hands-on part, thank you!

vaisuliafu

Never thought of doing image-related processing with RNNs xD
Nice tutorial, thanks. I like this playlist for its clear explanations of the code, and yeah, the intro is my favourite <3

arsiveparkour

May I ask how you would define your input size and sequence length if you had word embeddings of num_instances by num_features?

bestest
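
One way to read the question above in code terms (the names and sizes below are purely illustrative): for word embeddings of shape num_instances by num_features, the embedding dimension plays the role of input_size and the number of tokens plays the role of the sequence length.

```python
import torch
import torch.nn as nn

num_instances, num_features = 12, 300        # 12 tokens, 300-d embeddings (made-up numbers)
embeddings = torch.randn(num_instances, num_features)

rnn = nn.RNN(input_size=num_features, hidden_size=128, batch_first=True)
out, _ = rnn(embeddings.unsqueeze(0))        # batch of 1 -> (1, num_instances, num_features)
print(out.shape)                             # (1, 12, 128): sequence_length = num_instances
```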

Thanks for the tutorial. What I think regarding the LSTM having better performance when only taking the last time step's output is that the LSTM then has the chance to develop and accumulate a good decision, since this is a classification problem (i.e. many-to-one). That is because the last output is conditioned on ALL the previous states. When the intermediate states are also fed into the FC layer, the accumulated learning is somehow "partially" phased out by the immature decisions represented in the earlier hidden states, if I may say so :)

awadelrahman
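
For readers wondering what "only taking the last time step" looks like next to the flatten-everything variant, here is a small sketch (the sizes and variable names are illustrative, not the video's exact code):

```python
import torch
import torch.nn as nn

batch, seq_len, input_size, hidden_size, num_classes = 64, 28, 28, 256, 10
lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
x = torch.randn(batch, seq_len, input_size)
out, _ = lstm(x)                                # out: (batch, seq_len, hidden_size)

# Variant 1: feed every time step to the classifier
fc_all = nn.Linear(hidden_size * seq_len, num_classes)
logits_all = fc_all(out.reshape(batch, -1))

# Variant 2: use only the last time step, which has already "seen" the whole sequence
fc_last = nn.Linear(hidden_size, num_classes)
logits_last = fc_last(out[:, -1, :])

print(logits_all.shape, logits_last.shape)      # both torch.Size([64, 10])
```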

Hello, very nice tutorial. I have a question: I know that RNNs can take variable-length sequences, but when it comes to mini-batches we have to pad them to the same length. Why? Why can't we have variable-length sequences within a mini-batch?

nikhilkumar
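
On the padding question: a mini-batch is a single rectangular tensor, so sequences of different lengths cannot sit in it directly; the usual workaround is to pad to a common length and, optionally, tell the RNN the true lengths via pack_padded_sequence so the padded positions are skipped. A minimal sketch with made-up sequences:

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence

# Three made-up sequences of different lengths, 5 features per time step
seqs = [torch.randn(length, 5) for length in (7, 4, 2)]
lengths = torch.tensor([7, 4, 2])

padded = pad_sequence(seqs, batch_first=True)              # (3, 7, 5), zero-padded
packed = pack_padded_sequence(padded, lengths, batch_first=True)

rnn = nn.RNN(input_size=5, hidden_size=8, batch_first=True)
out, h_n = rnn(packed)   # out is a PackedSequence; padded positions are ignored internally
```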

Thanks! Why don't you need to specify sequence_length in the architecture itself? Is there a specific form the input to the model has to take so that it detects the sequence length on its own?

anas.k
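
Regarding sequence_length not appearing in the constructor: nn.RNN simply loops over however many time steps the input tensor contains, so the length is taken from the input itself; only a classifier head that flattens all time steps (as discussed in other comments here) ties the model to one fixed length. A quick illustration:

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=28, hidden_size=64, batch_first=True)

for seq_len in (10, 28, 100):
    x = torch.randn(4, seq_len, 28)      # (batch, seq_len, input_size)
    out, _ = rnn(x)
    print(out.shape)                     # (4, seq_len, 64) -- the length comes from the input
```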

Nice tutorial! However, I have a question. Do you know of any references where all time steps are combined for the classification at the end? I've not seen that before, and I'm wondering what the point is. Shouldn't the last time step's output be the best predictor anyway?

patloeber

Does it perform the same if you put a sequence of rows or a sequence of columns as the input?

zrmsraggot
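
For anyone who wants to test the rows-vs-columns question themselves, switching between the two interpretations is just a transpose of the image tensor (illustrative shapes below):

```python
import torch

images = torch.randn(64, 28, 28)           # (batch, height, width), MNIST-like

rows_as_sequence = images                   # 28 time steps, each a row of 28 pixels
cols_as_sequence = images.transpose(1, 2)   # 28 time steps, each a column of 28 pixels
print(rows_as_sequence.shape, cols_as_sequence.shape)  # both torch.Size([64, 28, 28])
```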

Thanks for the explanation. I have tried a BiLSTM on the SALAMI dataset for detecting boundaries, but the F1 score decreases after 20 epochs. Could you please elaborate on how I might fix this overfitting issue while using the same model?

aneekaazmat

Hi Aladdin. The video is to the point and awesome through the implementation part. I think you could have added a quick hacky intro to RNN/GRU/LSTM as well; otherwise I really liked this one.

ashishjohnsonburself

Why do you take the product of hidden_size and sequence_length as the input size for nn.Linear() at 6:00?

orjihvy

Thanks Aladdin. Great tutorial.
By the way, I was trying to test my model on individual samples. I realized that it does not matter whether the shape of my individual image is (1, 28, 28) or (28, 28); my model accepts it and gives me correct results. Why would that be? Shouldn't the model reject an image of shape (28, 28), since it expects the shape (batch, seq_len, features)?

somyekathait

I really like the paper walkthrough tutorials. I am a loyal supporter. I expect you'll deliver more cool stuff.

donkkey

A slight heads-up for people trying this out themselves (eagle-eyed observers, never mind):
With a learning rate of 0.005, the vanilla RNN does not learn at all and you will end up not converging (abysmal accuracy). Use a smaller learning rate for RNNs (0.001); you can keep the default 0.005 for the GRU and LSTM implementations to replicate the results. Great video nevertheless!

praladprasad
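
A small sketch of where that learning rate would go, assuming an Adam optimizer and cross-entropy loss (the learning-rate values are the ones from the comment above, not independently verified):

```python
import torch.nn as nn
import torch.optim as optim

model = nn.RNN(input_size=28, hidden_size=256, num_layers=2, batch_first=True)  # stand-in model
criterion = nn.CrossEntropyLoss()
# Per the comment above: 0.001 for the vanilla RNN; 0.005 reportedly works for GRU/LSTM
optimizer = optim.Adam(model.parameters(), lr=0.001)
```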

Hi Aladdin,

Thank you so much for your amazing tutorial videos.
I was wondering about only using the last hidden state in the LSTM: should the code be `self.fc = nn.Linear(sequence_length, num_classes)` rather than `self.fc = nn.Linear(hidden_size, num_classes)`?

Best,
Yu

yuqi

Can I ask why there is a torch.zeros in the forward method (i.e. the reason for it)? If you have any resource to share, that would be great.

joxa
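
On the torch.zeros question: it builds the initial hidden state h0, i.e. the "memory" the RNN starts from at time step 0, with one zero vector per layer per sample. In PyTorch it can also be omitted, in which case the hidden state defaults to zeros anyway. A sketch (sizes are illustrative):

```python
import torch
import torch.nn as nn

num_layers, hidden_size = 2, 64
rnn = nn.RNN(input_size=28, hidden_size=hidden_size, num_layers=num_layers, batch_first=True)

x = torch.randn(16, 28, 28)                            # (batch, seq_len, input_size)
h0 = torch.zeros(num_layers, x.size(0), hidden_size)   # explicit zero start state
out_a, _ = rnn(x, h0)
out_b, _ = rnn(x)                                      # h0 defaults to zeros anyway
print(torch.allclose(out_a, out_b))                    # True
```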

What I'm always missing is a few inference examples with the final model and the code to do so.

holthuizenoemoet
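
Since inference with the finished model comes up here, a hedged sketch of what it could look like; the tiny stand-in classifier below only mimics the interface (input of shape (batch, 28, 28), logits of shape (batch, 10)) and should be replaced by the actual trained model:

```python
import torch
import torch.nn as nn

class TinyRNNClassifier(nn.Module):
    """Stand-in for the trained model; swap in your own trained instance."""
    def __init__(self):
        super().__init__()
        self.rnn = nn.RNN(28, 64, batch_first=True)
        self.fc = nn.Linear(64, 10)

    def forward(self, x):
        out, _ = self.rnn(x)
        return self.fc(out[:, -1, :])      # classify from the last time step

model = TinyRNNClassifier()
model.eval()                               # inference mode (affects dropout/batchnorm)
image = torch.randn(28, 28)                # one MNIST-like image (28 rows x 28 columns)
with torch.no_grad():
    logits = model(image.unsqueeze(0))     # add the batch dimension -> (1, 28, 28)
    predicted_digit = logits.argmax(dim=1).item()
print(predicted_digit)
```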

As per the PyTorch documentation, the shape of the output of the nn.RNN cell is (seq_length, batch_size, hidden_size), so the reshaping operation should be out.reshape(out.shape[1], -1).

tapaskumarroy
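
A note on the comment above: the (seq_length, batch_size, hidden_size) layout is the default only when batch_first=False; if the module is built with batch_first=True (which the (batch, seq_len, features) inputs discussed elsewhere in these comments suggest), the output comes back as (batch_size, seq_length, hidden_size) and out.reshape(out.shape[0], -1) is the right flatten. A quick check:

```python
import torch
import torch.nn as nn

x = torch.randn(32, 28, 10)  # intended as (batch=32, seq_len=28, features=10)

out_default, _ = nn.RNN(10, 16)(x.transpose(0, 1))        # default layout: (seq_len, batch, hidden)
out_batch_first, _ = nn.RNN(10, 16, batch_first=True)(x)  # batch_first layout: (batch, seq_len, hidden)

print(out_default.shape)      # torch.Size([28, 32, 16])
print(out_batch_first.shape)  # torch.Size([32, 28, 16])
```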

Hey, I have been following your tutorial series and I have a doubt! Why are we getting such overfitted results after training for just 2 epochs, even though we're using an RNN, which isn't well suited to image data?

harjyotbagga