Fine-Tuning Meta's Llama 3 8B for IMPRESSIVE Deployment on Edge Devices - OUTSTANDING Results!

This video demonstrates an innovative workflow that combines Meta's open-weight Llama 3 8B model with efficient fine-tuning techniques (LoRA and PEFT) to deploy highly capable AI on resource-constrained devices.
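The core LoRA idea behind this workflow can be sketched in a few lines: instead of updating a full weight matrix during fine-tuning, training only touches two small low-rank factors, and the effective weight is the frozen base plus a scaled low-rank product. The shapes and values below are illustrative toy numbers, not taken from the video or from any particular library.

```python
# Toy sketch of a LoRA update: W_eff = W + (alpha / r) * B @ A.
# W is frozen; only the small factors A (r x d_in) and B (d_out x r)
# would be trained. All numbers here are made up for illustration.

def matmul(X, Y):
    """Plain-Python matrix multiply for small illustrative matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

d_out, d_in, r, alpha = 4, 4, 1, 2

# Frozen base weight (identity, just to keep the arithmetic readable).
W = [[1.0 if i == j else 0.0 for j in range(d_in)] for i in range(d_out)]

# Trainable low-rank factors (in practice B starts at zero so the
# adapter is a no-op before training; here B is nonzero to show effect).
A = [[0.5, -0.5, 0.5, -0.5]]        # r x d_in
B = [[0.0], [2.0], [0.0], [0.0]]    # d_out x r

scale = alpha / r
delta = matmul(B, A)                # d_out x d_in low-rank update
W_eff = [[W[i][j] + scale * delta[i][j] for j in range(d_in)]
         for i in range(d_out)]

full_params = d_out * d_in          # 16 parameters if W were trained directly
lora_params = r * d_in + d_out * r  # 8 trainable parameters at r=1
print(lora_params, full_params)
```

At realistic dimensions (e.g. 4096 × 4096 attention projections with r = 16) the trainable-parameter saving is what makes fine-tuning an 8B model feasible on a single GPU.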

We start with a 4-bit quantized version of the Llama 3 8B model and fine-tune it on a custom dataset. The fine-tuned model is then exported in the GGUF format, optimized for efficient deployment and inference on edge devices via the GGML/llama.cpp library.
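A toy version of the 4-bit quantization step looks like this: map each float weight to a signed 4-bit integer with a shared scale, then dequantize and check the round-trip error. Real GGUF quantization types (Q4_0, Q4_K_M, etc.) use per-block scales and more elaborate schemes; this sketch assumes a single symmetric per-tensor scale purely to show the core idea.

```python
# Toy symmetric 4-bit quantization: codes live in [-8, 7] with one
# shared scale. Illustrative values only, not a real GGUF scheme.

weights = [0.12, -0.5, 0.33, 0.9, -0.77, 0.05]

scale = max(abs(w) for w in weights) / 7   # 7 = largest positive 4-bit code

def quantize(w):
    q = round(w / scale)
    return max(-8, min(7, q))              # clamp to the signed 4-bit range

codes = [quantize(w) for w in weights]     # ints, 4 bits of storage each
restored = [q * scale for q in codes]      # dequantized approximation

max_err = max(abs(w - r) for w, r in zip(weights, restored))
assert all(-8 <= q <= 7 for q in codes)
print(codes, round(max_err, 4))
```

The worst-case error stays within half a quantization step, which is why a 4-bit model can shrink memory roughly 4× versus fp16 while keeping generation quality close to the original.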

Impressively, the fine-tuned Llama 3 8B model accurately recalls and generates responses based on our custom dataset when run locally on a MacBook. This demo highlights the effectiveness of combining quantization, efficient fine-tuning, and optimized inference formats to deploy advanced language AI on everyday devices.

Join us as we explore the potential of fine-tuning and efficiently deploying the Llama 3 8B model on edge devices, making AI more accessible and opening up new possibilities for natural language processing applications.

Be sure to subscribe to stay up-to-date on the latest advances in AI.

Comments

So I never post comments, but the way you explained this was by far the best I have seen online. I wish I had found your channel 8 months ago :) Please keep posting videos; your explanations are very well thought out and put together.

israelcohen

Absolutely fantastic! Really appreciate the detailed, clear breakdown of concrete steps that let us drive value, rather than the clickbait hype train that everyone else is on.

ratsock

Big thanks for the detailed walkthrough—really learned a lot from your video!

williammcguire

Thank you! You have a talent for explaining and planning a workshop! Thank you for your work!

petroff_ss

I like the thumbnails, the topics, the way methods are explained, and the presenter.
Nice channel, very valuable info ❤

talatala

You are amazing! This is the best explanation of this topic. I liked it and just subscribed. Thank you very much!!!

gustavomarquez

Thank you so much for sharing that fantastic clip! It was really informative. I'm currently looking into fine-tuning a model with my ERP system, which handles some pretty complex data. Right now, I'm creating dataframes and using panda-ai for analytics. Could you guide me on how to train and make inferences with this row/column data? I really appreciate your time and help!

RameshBaburbabu

Amazing video, thanks for the best explanation I've ever seen on YouTube. Could you also please make a video on how to fine-tune the Phi-3 model? 🙏

SilentEcho-dq

Nice video. I have a question: at 8:10, is there any reason why you set add_special_tokens=False in the .encode_plus method? I thought special tokens are added during training, so wouldn't it make more sense to set add_special_tokens=True if we want to know how large the biggest training example will be?

hellohey

Did you play Chef Slowik in the movie "The Menu"?

Hotboy-qn

I think calling the .for_inference method before training will interfere with training, so it seems like a bad idea. The training in the notebook converges without a problem for me using a T4 GPU if just skipping that step.

hellohey

Is the output from Ollama on your MacBook in real time, or did you speed it up in the video? On my 2014 iMac it is significantly slower. It's about time for a new one. What are the technical specifications of your Mac?

SilentEcho-dq

I'm able to get as far as inference; once the model is trained I get an error: name 'FastLanguageModel' is not defined.

But thank you for the tutorial!

lorenzoplaatjies

What are the parameters I need to update if I'm using 1000 question-and-answer pairs in a CSV, and what values should those parameters take?

ganeshkumara

Hi sir. First I started with limited data and it worked fine. After I added another 20 rows to the CSV, it stopped answering correctly and gives some other answer. Why?

ganeshkumara

Do you have any experience with fine-tuning this model on non-English data? Any suggestions for good multilingual open-source models? 🙏

andrew.derevo

Rather than using Google Colab + compute for training, what are your thoughts on using a local machine + GPU?

madhudson

Hi, I want to fine-tune Llama 3 for English-to-Urdu machine translation. Can you guide me on this? The dataset is OPUS-100.

azkarathore

"The A100 works well" You don't say lol -- bruh this is a $50K GPU which costs $2-$3K/month to run.

jonassteinberg

Thank you for this wonderful video, very educative. I have a question: if I have a dataset with questions and answers, but the answers are not written in proper English grammar, what is the best way to make a model return an answer that is grammatically correct?

ronaldmatovu