Integer embeddings in PyTorch

In this video, we implement the paper "Learning Mathematical Properties of Integers". We train custom integer embeddings with an LSTM network on the On-Line Encyclopedia of Integer Sequences (OEIS). We also extract integer embeddings from two pretrained models, BERT and GloVe. We then compare how well each set of embeddings encodes mathematical properties of integers, such as divisibility by 2 and primality.
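As a rough illustration of the idea, here is a minimal sketch of an LSTM that predicts the next integer of a sequence, so that its embedding table can be reused as integer embeddings afterwards. It is not the code from the video: the class name `NextIntegerLSTM`, the vocabulary size, the layer sizes, and the toy batch are all assumptions made for this example.

```python
import torch
from torch import nn


class NextIntegerLSTM(nn.Module):
    """Hypothetical model: embed integers, run an LSTM, predict the next integer."""

    def __init__(self, vocab_size=1000, embedding_dim=32, hidden_dim=64):
        super().__init__()
        # One learnable vector per integer in [0, vocab_size)
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, x):
        # x: (batch, seq_len) tensor of integers in [0, vocab_size)
        emb = self.embedding(x)        # (batch, seq_len, embedding_dim)
        out, _ = self.lstm(emb)        # (batch, seq_len, hidden_dim)
        return self.head(out[:, -1])   # logits over the next integer


model = NextIntegerLSTM()

# Toy batch (assumed data): two short sequences and their next elements
batch = torch.tensor([[2, 4, 6, 8], [1, 3, 5, 7]])
targets = torch.tensor([10, 9])

logits = model(batch)
loss = nn.functional.cross_entropy(logits, targets)
loss.backward()

# After training, the embedding matrix is the object of interest
integer_embeddings = model.embedding.weight.detach()
```

Treating next-integer prediction as classification over a bounded vocabulary is one possible design choice; it requires capping the largest integer the model can see.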

00:00 Intro
00:41 Ideas and high level explanation
02:56 Data - On-Line Encyclopedia of Integer Sequences (OEIS)
03:58 Data - raw download exploration
05:43 CustomDataset - implementation
09:15 CustomDataset - testing it out
11:36 Network - implementation
15:54 Network - testing it out
16:58 Evaluation utilities
19:05 GloVe embeddings parsing
22:09 BERT embeddings parsing
24:32 LSTM training script
30:58 Experiments to be run
31:39 Results: LSTM guess next
33:53 Results: Metrics (TensorBoard)
36:33 Results: Embeddings projections (TensorBoard)
40:05 Outro
Comments
Author

Hi, I love that your video shows all the coding parts. I think your format is great! As an improvement, I would look into highlighting the code and the software architecture even more. For example, you could hand-draw a simple graph that shows the architecture. You would have a little box for the dataset, another box for the torch model, and so on. Then you can "fill the boxes" with the coding parts. Great video!

f-werto
Author

Great videos. I like your style of writing the code and going through it like this. I also enjoy the documentation that you write.

Regarding the reason the model is suboptimal in the div5 and div10 cases, my guess is that it has to do with class imbalance. When you added the sequence length and max value restrictions, many of the examples in the data were eliminated, and the 5 and 10 sequences probably have many of their members fall outside of your max value.
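One simple way to start checking the class imbalance guess is to measure how often each property is true among the integers in the vocabulary. The sketch below only looks at label balance over the integers themselves, not at which sequences survive the filtering. `MAX_VALUE` and the helper names are assumptions for illustration, not values from the video.

```python
def is_prime(n):
    """Trial-division primality test, sufficient for small n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def positive_rate(predicate, max_value):
    """Fraction of integers in [0, max_value) for which predicate is true."""
    positives = sum(1 for n in range(max_value) if predicate(n))
    return positives / max_value


MAX_VALUE = 1000  # assumed vocabulary cap, not the one used in the video

rates = {
    "div2": positive_rate(lambda n: n % 2 == 0, MAX_VALUE),
    "div5": positive_rate(lambda n: n % 5 == 0, MAX_VALUE),
    "div10": positive_rate(lambda n: n % 10 == 0, MAX_VALUE),
    "prime": positive_rate(is_prime, MAX_VALUE),
}

for name, rate in rates.items():
    print(f"{name}: {rate:.3f}")
```

With these assumptions, div2 is balanced while div5 and div10 have only 20% and 10% positives, so a metric such as plain accuracy would need to be read against those baselines.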

mohammedelmahgiubi
Author

Please do a Jupyter notebook with the NumPy tricks that you did at the beginning of this video (when analyzing the data). It was so informative.

Author

You could check whether you can predict linear congruential PRNGs (with overflow) and shift-operator-based PRNGs.
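For anyone who wants to try this, a linear congruential generator is easy to write down: each value is `(a * state + c) % m`, where the modulo plays the role of the overflow. The sketch below generates such sequences as candidate data for a next-integer model. The function name and the small parameters `a=21, c=7, m=1000` are hypothetical, picked only so that the values stay inside a small vocabulary.

```python
def lcg_sequence(seed, length, a=21, c=7, m=1000):
    """Return `length` values of the recurrence x -> (a * x + c) % m.

    The default parameters are illustrative only; they are not taken
    from any real PRNG and make no claim about statistical quality.
    """
    values = []
    state = seed
    for _ in range(length):
        state = (a * state + c) % m
        values.append(state)
    return values


# Each seed gives one sequence; all values lie in [0, m)
sequences = [lcg_sequence(seed, length=8) for seed in range(1, 4)]
for seq in sequences:
    print(seq)
```

Because the recurrence is deterministic, any pair of consecutive values is an input and target example for the model.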

ulf
Author

It could be more informative if you used Jupyter for the video. That would let you plot various things, like histograms.

MariuszWoloszyn