HuggingFace Fundamentals with LLMs such as TinyLlama and Mistral 7B

Chris looks under the hood of HuggingFace models such as TinyLlama and Mistral 7B. In the video, Chris presents a high-level reference model of large language models and uses it to show how tokenization and the AutoTokenizer module from the HuggingFace transformers library work, linking them back to the HuggingFace repository. He also walks through the tokenizer config and shows how Mistral and Llama-2 use the same tokenizer and embeddings architecture (albeit with different vocabularies). Finally, Chris shows you how to inspect the model configuration and model architecture of HuggingFace models.
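A minimal sketch of that workflow in code, assuming the transformers library is installed. The checkpoint IDs below (TinyLlama/TinyLlama-1.1B-Chat-v1.0 and mistralai/Mistral-7B-v0.1) are the public Hub names and may differ from the exact revisions used in the video; the Mistral repo may also require accepting its terms on the Hub first:

from transformers import AutoConfig, AutoTokenizer

# Assumed checkpoint IDs; substitute whichever revisions the video uses.
tinyllama_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
mistral_id = "mistralai/Mistral-7B-v0.1"

# Tokenization: text -> sub-word pieces -> integer IDs for the embedding layer.
tok = AutoTokenizer.from_pretrained(tinyllama_id)
text = "Ada Lovelace wrote the first program."
print(tok.tokenize(text))        # sub-word pieces, e.g. ['▁Ada', '▁Love', 'lace', ...]
print(tok(text)["input_ids"])    # the IDs the model's embedding layer looks up

# Mistral and Llama-2/TinyLlama use the same tokenizer class (and hence the
# same embedding lookup architecture), just with different vocabularies.
mistral_tok = AutoTokenizer.from_pretrained(mistral_id)
print(type(tok).__name__, type(mistral_tok).__name__)
print(tok.vocab_size, mistral_tok.vocab_size)

# The model configuration (hidden size, layer count, attention heads, ...)
# can be inspected without downloading any weights.
print(AutoConfig.from_pretrained(mistral_id))

Printing a loaded model, for example AutoModelForCausalLM.from_pretrained(tinyllama_id), dumps the full layer-by-layer architecture, though that does download the weights first.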

As we start to build towards our own large language model, understanding these fundamentals is critical, whether you are a builder or a consumer of AI.

Google Colab:
Comments

The best video I've watched on YouTube about LLMs so far. You explain complex topics in accessible language, clearly and understandably. You are doing a very good job. I'm eagerly waiting for the next videos :)

ukaszrozewicz

Amazing way to get people comfortable with the model architecture. Thank you so much for sharing your knowledge.

Jaypatel

Great video. Just the right amount of detail. Thanks.

tekperson

Thanks for another great video, Chris. I've been through some LLM courses on Udemy, but your channel is helping me clear up many doubts I have about the whole thing. I'm glad I found your channel. It's really the best on this subject. Congratulations. Marcelo.

msssouza

Thanks for being so detailed. That was really a refresher for me. Glad someone like you is doing such good work.

atifsaeedkhan

Excellent explanation. Although I don't have a use case to fine-tune a model currently, I presume I will eventually, so it'll be great to have what you've shared in my back pocket. Thanks a bunch.

kenchang

Great video! Looking forward to your next videos…

janstrunk

Really looking forward to the next video.

BipinRimal

For the very first time, I finally get it, thanks to you. Thank you for your service to the community.

wadejohnson

Thank you! This video shines a light into the black box of LLM magic :)

ilyanemihin

Excellent tutorial to get started with LLMs.

BiranchiNarayanNayak

How does the tokenizer decode sub-word tokens back into text? Specifically, how does it determine which pieces get concatenated into one word versus standing on their own? As shown, the answer would be decoded with spaces between the pieces, which wouldn't join "Lovelace" back into a single word. (See the sketch below.)

john
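A hedged sketch addressing the question above: Llama-style tokenizers use SentencePiece, which marks the start of a word with a '▁' character; pieces without the marker are glued onto the previous piece during decoding, so no spurious spaces appear. Assuming the TinyLlama/TinyLlama-1.1B-Chat-v1.0 checkpoint (any Llama-family tokenizer behaves the same way):

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")

ids = tok("Ada Lovelace")["input_ids"]
print(tok.convert_ids_to_tokens(ids))
# e.g. ['<s>', '▁Ada', '▁Love', 'lace']: '▁' marks a word start, and
# 'lace' carries no marker, so decode() appends it to the previous piece.
print(tok.decode(ids, skip_special_tokens=True))  # -> 'Ada Lovelace'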

Great job, sir. One video request: how to build Llama APIs? I've trained my own model and now I want to use it on my website.

tec-earning

Bro, just turn on the Thanks feature so I can donate to you.

huiwencheng