LLAMA-2 🦙: EASIEST WAY TO FINE-TUNE ON YOUR DATA 🙌

In this video, I will show you the easiest way to fine-tune the Llama-2 model on your own data using the autotrain-advanced package from Hugging Face.

Steps to follow:
--- Installation of packages:
!pip install autotrain-advanced
!pip install huggingface_hub

!autotrain setup --update-torch (optional; needed on Google Colab)

--- Hugging Face credentials:
from huggingface_hub import notebook_login
notebook_login()

--- Single-line command!
!autotrain llm --train --project_name your_project_name --model TinyPixel/Llama-2-7B-bf16-sharded --data_path your_data_set --use_peft --use_int4 --learning_rate 2e-4 --train_batch_size 2 --num_train_epochs 3 --trainer sft --model_max_length 2048 --push_to_hub --repo_id your_repo_id
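
--- Preparing the data (optional sketch):
For --data_path, here is a minimal sketch of preparing a local dataset. It assumes the trainer reads a train.csv with a "text" column from the folder you pass, and the OpenAssistant-style ### Human / ### Assistant prompt format is only an illustration; adapt both to your own data.

import os
import pandas as pd

# Illustrative examples only; replace with your own instruction/response pairs.
examples = [
    {"instruction": "What does the product do?", "response": "It tracks inventory in real time."},
    {"instruction": "Who is it for?", "response": "Small retail businesses."},
]

# Assumption: a single "text" column with OpenAssistant-style prompt formatting.
rows = [{"text": f"### Human: {ex['instruction']}### Assistant: {ex['response']}"} for ex in examples]

os.makedirs("my_dataset", exist_ok=True)
pd.DataFrame(rows).to_csv("my_dataset/train.csv", index=False)
# Then point the command above at it: --data_path my_dataset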

⏱️ Timestamps
Intro: [00:00]
Auto-train & installation: [00:17]
Fine-tuning - One Liner: [02:00]
Data Set Format: [05:30]
Training settings: [08:26]

#llama #finetune #llama2 #artificialintelligence #tutorial #stepbystep #llm #largelanguagemodels #largelanguagemodel
Comments

Superb tutorial for its clarity, simplicity, and to-the-point style... big thank you! NOTE bugfix: replace the underscores in the flag names with the corresponding dashes to make the autotrain command run on Colab.

christianmboula

I was in the hospital because my lung collapsed, and I've been having a seriously rough go of it lately (lifelong issues with fam, etc.), so I really appreciate this video. Thanks for all your hard work. Researching these topics and understanding them is no small feat. Keep it up.

teleprint-me

Very disappointed you didn't show this actually doing anything. How do we verify or test that it's working? I can run a script and have it do nothing... How do we see that it actually worked, or test it?

LainRacing

Thanks SO MUCH, brother! You are a true hero! Fine-tuning is the most important part of open-source LLMs; that's where the value/wealth is hidden. I cannot wait for your next fine-tuning video. 🙏🙏

samcavalera

Hi, thanks for the video. Could you explain in detail how to load the model and create an inference API on a local machine? That would be really helpful. Thanks in advance.

karthigeyan
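
For local inference on what --push_to_hub uploads, here is a minimal sketch with peft and transformers. The repo id is a placeholder, and it assumes the training command pushed a PEFT/LoRA adapter (as in the one-liner above) along with its tokenizer; if the tokenizer wasn't pushed, load it from the base model instead.

import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

repo_id = "your_username/your_project_name"  # placeholder for the pushed repo

# Load the LoRA adapter together with its base model and run a quick sanity check.
model = AutoPeftModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.float16, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(repo_id)

prompt = "### Human: What does the product do?### Assistant:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))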

Would be great to have a Colab notebook for this that included inference on the finished, pushed model.

photojeremy

Thank you very much, champion! We are getting to the true spirit of open source, allowing science to be truly scalable for the public and the public interest.

jersainpasaran

I was initially skeptical but this was an excellent short tutorial. Thanks!

garyhuntress

One of the best videos I have come across. I will definitely share this channel with my colleagues and friends who want to learn more about this topic.

arjunv

Please make a video on creating your own dataset and actually using the model.

anjakuzev

Hi, the way you are explaining is very positive!!!! One thing I am not getting: if I want to train on my custom data in regional languages, how do I proceed? Can you share your knowledge on this? Which model is best for this, and if we pass the prompt in English, will it get converted to the regional language and generate the output?

dr.aravindacvnmamit

How can I incorporate my own data into the 'assistant' fine-tune? For example, a 100-page document about a company product. Do I format it into something similar to what's in the OpenAssistant dataset and add it to that dataset? Or will fine-tuning on my own data be a separate fine-tuning step, i.e. after fine-tuning on the OpenAssistant dataset, do I need to run another fine-tune on my own data? Cheers, and thanks for all your hard work sharing your knowledge with us!

bagamanocnon

Thank you very much!
Looking forward to the dataset preparation video :)

bardaiart

Some of my friends who followed this tutorial mentioned they see an argument issue. I think it is because the command is broken across multiple lines; running it that way requires a '\' at the end of every line. The final command should look like this:

!autotrain llm --train --project_name '<project_name>' \
--model <model_name> \
--data_path <data_path> \
--text_column text \
--use_peft \
--use_int4 \
--learning_rate 2e-4 \
--train_batch_size 2 \
--num_train_epochs 3 \
--trainer sft \
--model_max_length 2048 \
--push_to_hub \
--repo_id <repo_id>/<project_name> \
--block_size 2048 > training.log &

arjunv
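
A note on that last line: because the command redirects output to training.log and backgrounds the job with '&', progress can be checked from another cell with !tail training.log (a Colab/Jupyter assumption); drop the trailing '> training.log &' to watch the logs inline instead.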

Wow, just what I needed. I just put together a FLAN/Orca-style dataset; I can't wait to try it in Colab! Thank you for your hard work.

PickleYard

How do we use this trained model?
Can you please make a video on this?

Yash-mktc

Great video, thank you! I have a question: I have a prompt, an output from a model, and a desired output. How can I format this data, please?

SarahAmelia-lt

How do we save the fine-tuned model to local disk instead of pushing it to the Hub? Could you show us the model pushed to the Hub? Showing these in the video would make it clearer. Great.

caiyu

The major work looks to be in preparing your dataset properly, which is pretty common. Do you have, or are you planning, another video on training models simply by handing them a lot of files of, say, web content, or better still the raw URLs and perhaps something like tags? In other words, how to do unsupervised learning from a corpus.

serenditymuse

Thank you for the video. I am looking forward to a video about how to prepare our own dataset without using a Hugging Face dataset!!

krishnareddy