Running a Hugging Face LLM on your laptop

In this video, we'll learn how to run a Large Language Model (LLM) from Hugging Face on our own machine.

Other videos showing how to run LLMs on your own machine

Comments

You explained it completely and perfectly without wasting the audience's time! Well done.

elmino

Amazing and outstanding. This video and presentation are awesome.

ravirajasekharuni

Absolutely wonderful video! To the point and well explained! Way to go! Thanks a lot!

MitulGarg

Thank you very much for this great explanation.

youssefabbas

This was an extremely informative video. Really appreciate it.

shivamroy

Thank you very much, you helped me a lot

flaviocorreia

An API key is not needed if the model is downloaded and run locally.

duhfher
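
To illustrate the point above: once the weights are downloaded and cached, inference needs no Hugging Face token at all. A minimal sketch using the transformers pipeline (gpt2 is just a small illustrative model, not necessarily the one from the video):

```python
# Fully local inference -- no API key involved once the model files are cached.
# gpt2 is an illustrative small model; swap in the model used in the video.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator("Running an LLM on a laptop is", max_new_tokens=30)
print(result[0]["generated_text"])
```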

I am getting an error/info log from transformers (twice) stating "Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained." The model then generates only a bunch of whitespace, no matter the input. I have followed your steps and made sure the files were downloaded to the expected location. The behavior occurs both with and without setting legacy=False.

radoslavkoynov

If you're getting a wacky error trying to run `AutoTokenizer.from_pretrained(model_id, legacy=False)`, run `pip install protobuf==3.20.1` and restart the Jupyter kernel.

Cynadyde
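
For reference, the workaround above might look like this end to end (the model id is a placeholder, and the protobuf pin comes straight from the comment rather than from testing against every transformers version):

```python
# First, in a terminal or a notebook cell:  pip install protobuf==3.20.1
# Then restart the Jupyter kernel before running the cell below.
from transformers import AutoTokenizer

model_id = "your-model-id"  # placeholder -- use the model from the video
tokenizer = AutoTokenizer.from_pretrained(model_id, legacy=False)
```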

Awesome content, love your channel! The video is very informative and concise, thanks. As a friendly suggestion, you might want to leave a couple of seconds at the end of the video for slow people like me to hit that well-deserved like button :)

viniciustsugi

Thank you! I finally downloaded a big llama model.. lol 😹

lxthgrz

We're going to start by opening??? Start by opening what exactly?

trealwilliams

Thanks, sir. However, I want to know:
1. How can one integrate a specific set of pre-trained models into RStudio, so that one can simply run examples on data (proprietary, in my case) locally within R?
2. Is there a way to ask the Inference API for tasks other than the typical sentiment classification of text, for example "multi-entity tagging", "modalities", etc.?

Your input is highly appreciated.

mikiallen
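
On the second question above: the hosted Inference API is not limited to sentiment classification; the task depends on the model you point it at. A hedged sketch calling a public token-classification (NER) model over HTTP (the model choice and response shape are assumptions; check the model card):

```python
# Sketch: hosted Inference API for token classification (NER) instead of sentiment analysis.
# dslim/bert-base-NER is an example model; replace the token placeholder with your own.
import requests

API_URL = "https://api-inference.huggingface.co/models/dslim/bert-base-NER"
headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}

response = requests.post(
    API_URL,
    headers=headers,
    json={"inputs": "Hugging Face is based in New York City."},
)
print(response.json())  # typically a list of tagged entities with scores
```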

Hi, thanks for the video. May I ask what the meaning of legacy=False is when loading the pretrained model?

imaginarybuddy

Thanks Mark, very nice video, super clearly put!
Could you please suggest what the reason might be if, when trying to turn the wifi off, those lines of code produce "ModuleNotFoundError: No module named utils"?

phishic
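
That error usually means the notebook imports a local helper file (e.g. utils.py) that isn't sitting in the same folder as the notebook. A hypothetical stand-in for such a helper is sketched below; it toggles Wi-Fi by shelling out to the operating system, which is not necessarily how the video's helper does it:

```python
# Hypothetical utils.py -- a stand-in for the missing local helper module.
# Adjust the interface name ("en0") for your machine; the Linux branch requires nmcli.
import platform
import subprocess

def set_wifi(enabled: bool) -> None:
    """Turn Wi-Fi on or off (macOS via networksetup, Linux via nmcli)."""
    state = "on" if enabled else "off"
    if platform.system() == "Darwin":
        subprocess.run(["networksetup", "-setairportpower", "en0", state], check=True)
    else:
        subprocess.run(["nmcli", "radio", "wifi", state], check=True)
```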

I deeply appreciate your video! I do have a question, though: does this still work when the model file is a .safetensors or .pth file, not a .bin file? Thank you!

dgl
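
For what it's worth, from_pretrained resolves whichever weight format a repository ships, so .safetensors checkpoints load with the same call as .bin ones; a bare .pth state dict saved outside the transformers format is the case that needs extra work. A minimal sketch (the model id is a placeholder):

```python
# from_pretrained picks up .safetensors or .bin weights automatically from the repo/cache.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("your-model-id")  # placeholder id
```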

I have done as you say, but running the model pipeline is taking forever. It still has not finished; what can I do?

mbikangruth
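
Slow loading and generation on a laptop is usually a model-size issue rather than a bug. A hedged sketch of the common mitigations, half precision and automatic device placement (device_map="auto" needs the accelerate package, and the model id is a placeholder):

```python
# Common speed/memory mitigations for laptop inference (a sketch, not a guaranteed fix).
# device_map="auto" requires `pip install accelerate`; the model id is a placeholder.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="your-model-id",
    torch_dtype=torch.float16,  # halves memory versus float32
    device_map="auto",          # place layers on a GPU if one is available
)
print(generator("Hello", max_new_tokens=20)[0]["generated_text"])
```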

So many steps missing in this video...

paulohss

1:03 - Thanks for this clarification. I'd done quite a bit of Google searching and scouring the Hugging Face website for this information and found nothing of value. I'm a computer enthusiast / gamer, not a professional machine learning engineer. Since embarking on running an LLM locally on my previous daily-use desktop, I've noticed it's near impossible to find a model's resource needs. GPT-4 says a 7B-parameter model would consume about 48 GB of memory. I asked it what size model would fit in my 12 GB Nvidia 3060, and it said about 3.2 billion. My question for you is: why does no one in this space who offers a model (or talks about them) ever include something like a system-requirements descriptor? Is it one of those situations where, if you need to ask, you probably don't have enough resources? Thanks for any insight you can give on this phenomenon.

darylallen
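
A rough rule of thumb for the missing "system requirements": weight memory is roughly parameters × bytes per parameter, and the bytes per parameter depend on the precision you load in, which is one reason a single number is rarely quoted. A back-of-the-envelope sketch (estimates only, ignoring activations and KV-cache overhead):

```python
# Back-of-the-envelope weight memory: parameters * bytes per parameter.
# Real usage is higher (activations, KV cache, framework overhead).
def weight_gb(params_billions: float, bytes_per_param: float) -> float:
    return params_billions * 1e9 * bytes_per_param / 1024**3

for label, bytes_per_param in [("float32", 4), ("float16", 2), ("4-bit", 0.5)]:
    print(f"7B model in {label}: ~{weight_gb(7, bytes_per_param):.1f} GB")
# ~26 GB in float32, ~13 GB in float16, ~3.3 GB at 4-bit -- so a 12 GB GPU holds a
# 7B model only at reduced precision.
```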

I personally found disabling your wifi from a Jupyter notebook to be badass.

diln