Running a Hugging Face LLM on your laptop

In this video, we'll learn how to run a Large Language Model (LLM) from Hugging Face on our own machine.

Other videos showing how to run LLMs on your own machine

Comments

You explained it completely and perfectly without wasting the audience's time! Well done.

elmino

Amazing and outstanding. This video and presentation are awesome.

ravirajasekharuni

Absolutely wonderful video! To the point and well explained! Way to go! Thanks a lot!

MitulGarg

Thank you very much for this great explanation.

youssefabbas

This was an extremely informative video. Really appreciate it.

shivamroy

Thank you very much, you helped me a lot

flaviocorreia

An API key is not needed if the model is downloaded and run locally.

duhfher
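
To illustrate the point above: once the weights are downloaded and cached, inference needs no Hugging Face token at all. A minimal sketch using the transformers pipeline (gpt2 is just a small illustrative model, not necessarily the one from the video):

```python
# Fully local inference -- no API key involved once the model files are cached.
# gpt2 is an illustrative small model; swap in the model used in the video.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator("Running an LLM on a laptop is", max_new_tokens=30)
print(result[0]["generated_text"])
```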

I am getting an error/info log from transformers (twice) stating "Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained." The model then generates only a bunch of whitespace, no matter the input. I have followed your steps and made sure the files were downloaded to the expected location. The behavior occurs both with and without setting legacy=False.

radoslavkoynov

If you're getting a wacky error trying to run `AutoTokenizer.from_pretrained(model_id, legacy=False)`, run `pip install protobuf==3.20.1` and restart the Jupyter kernel.

Cynadyde
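
For reference, the workaround above might look like this end to end (the model id is a placeholder, and the protobuf pin comes straight from the comment rather than from testing against every transformers version):

```python
# First, in a terminal or a notebook cell:  pip install protobuf==3.20.1
# Then restart the Jupyter kernel before running the cell below.
from transformers import AutoTokenizer

model_id = "your-model-id"  # placeholder -- use the model from the video
tokenizer = AutoTokenizer.from_pretrained(model_id, legacy=False)
```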

Awesome content, love your channel! The video is very informative and concise, thanks. As a friendly suggestion, you might want to leave a couple of seconds at the end of the video for slow people like me to hit that well-deserved like button :)

viniciustsugi

Thank you! I finally downloaded a big llama model.. lol 😹

lxthgrz

We're going to start by opening??? Start by opening what exactly?

trealwilliams

Thanks, sir. However, I want to know:
1. How can one integrate a specific set of pre-trained models into RStudio, so that one can simply run examples on data (proprietary, in my case) locally within R?
2. Is there a way to ask the Inference API for tasks other than the typical sentiment classification of text, for example "multi-entity tagging", "modalities", etc.?

Your input is highly appreciated.

mikiallen
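
On the second question above: the hosted Inference API is not limited to sentiment classification; the task depends on the model you point it at. A hedged sketch calling a public token-classification (NER) model over HTTP (the model choice and response shape are assumptions; check the model card):

```python
# Sketch: hosted Inference API for token classification (NER) instead of sentiment analysis.
# dslim/bert-base-NER is an example model; replace the token placeholder with your own.
import requests

API_URL = "https://api-inference.huggingface.co/models/dslim/bert-base-NER"
headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}

response = requests.post(
    API_URL,
    headers=headers,
    json={"inputs": "Hugging Face is based in New York City."},
)
print(response.json())  # typically a list of tagged entities with scores
```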

Hi, thanks for the video. May I ask what the meaning of legacy=False is when loading the pretrained model?

imaginarybuddy

Thanks Mark, very nice video, super clearly put!
Could you please suggest what the reason might be if, when trying to turn the wifi off, those lines of code produce "ModuleNotFoundError: No module named utils"?

phishic
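
That error usually means the notebook imports a local helper file (e.g. utils.py) that isn't sitting in the same folder as the notebook. A hypothetical stand-in for such a helper is sketched below; it toggles Wi-Fi by shelling out to the operating system, which is not necessarily how the video's helper does it:

```python
# Hypothetical utils.py -- a stand-in for the missing local helper module.
# Adjust the interface name ("en0") for your machine; the Linux branch requires nmcli.
import platform
import subprocess

def set_wifi(enabled: bool) -> None:
    """Turn Wi-Fi on or off (macOS via networksetup, Linux via nmcli)."""
    state = "on" if enabled else "off"
    if platform.system() == "Darwin":
        subprocess.run(["networksetup", "-setairportpower", "en0", state], check=True)
    else:
        subprocess.run(["nmcli", "radio", "wifi", state], check=True)
```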

I deeply appreciate your video! I do have a question, though: does this still work when the model file is a .safetensors or .pth file, not a .bin file? Thank you!

dgl
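
For what it's worth, from_pretrained resolves whichever weight format a repository ships, so .safetensors checkpoints load with the same call as .bin ones; a bare .pth state dict saved outside the transformers format is the case that needs extra work. A minimal sketch (the model id is a placeholder):

```python
# from_pretrained picks up .safetensors or .bin weights automatically from the repo/cache.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("your-model-id")  # placeholder id
```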

I have done as you say, but running the model pipeline is taking forever. It still has not finished; what can I do?

mbikangruth
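
Slow loading and generation on a laptop is usually a model-size issue rather than a bug. A hedged sketch of the common mitigations, half precision and automatic device placement (device_map="auto" needs the accelerate package, and the model id is a placeholder):

```python
# Common speed/memory mitigations for laptop inference (a sketch, not a guaranteed fix).
# device_map="auto" requires `pip install accelerate`; the model id is a placeholder.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="your-model-id",
    torch_dtype=torch.float16,  # halves memory versus float32
    device_map="auto",          # place layers on a GPU if one is available
)
print(generator("Hello", max_new_tokens=20)[0]["generated_text"])
```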

So many steps missing in this video...

paulohss

1:03 - Thanks for this clarification. I'd done quite a bit of Google searching and scouring the Hugging Face website for this information and found nothing of value. I'm a computer enthusiast / gamer, not a professional machine learning engineer. Since embarking on running an LLM locally on my previous daily-use desktop, I've noticed it's near impossible to find a model's resource needs. GPT-4 says a 7B-parameter model would consume about 48 GB of memory. I asked it what size model would fit in my 12 GB Nvidia 3060, and it said about 3.2 billion. My question for you is: why does no one in this space who offers a model (or talks about them) ever include something like a system-requirements descriptor? Is it one of those situations where, if you need to ask, you probably don't have enough resources? Thanks for any insight you can give on this phenomenon.

darylallen
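
A rough rule of thumb for the missing "system requirements": weight memory is roughly parameters × bytes per parameter, and the bytes per parameter depend on the precision you load in, which is one reason a single number is rarely quoted. A back-of-the-envelope sketch (estimates only, ignoring activations and KV-cache overhead):

```python
# Back-of-the-envelope weight memory: parameters * bytes per parameter.
# Real usage is higher (activations, KV cache, framework overhead).
def weight_gb(params_billions: float, bytes_per_param: float) -> float:
    return params_billions * 1e9 * bytes_per_param / 1024**3

for label, bytes_per_param in [("float32", 4), ("float16", 2), ("4-bit", 0.5)]:
    print(f"7B model in {label}: ~{weight_gb(7, bytes_per_param):.1f} GB")
# ~26 GB in float32, ~13 GB in float16, ~3.3 GB at 4-bit -- so a 12 GB GPU holds a
# 7B model only at reduced precision.
```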

I personally found disabling your wifi from a Jupyter notebook to be badass.

diln