Mistral: Easiest Way to Fine-Tune on Custom Data


In this video, we will learn how to fine-tune Mistral-7B on your own data. We will look at data preparation and preprocessing to get the best results.


TIMESTAMPS:
[00:00] Introduction
[01:06] Understanding and Formatting the Dataset
[03:24] Filtering and Structuring the Dataset
[05:45] Transforming the Dataset to a Single Column
[10:00] Preparing for Model Training - Mistral-7B Fine-Tuning
[14:46] Understanding and Applying LoRA for Training
[18:36] Running the Training Process
[20:11] Storing and Testing the Trained Model
[21:12] Recap and Conclusion


Comments

🎯 Key Takeaways for quick navigation:

00:00 🚀 *Overview of Fine-Tuning Mistral 7B on Custom Data*
- Introduction to fine-tuning large language models on task-specific datasets.
- Mention of Mistral 7B as a suitable option for fine-tuning.
- Overview of the video, including data formatting and an alternative for fine-tuning without powerful GPUs.
01:33 📦 *Understanding Instruct V3 Dataset Structure*
- Explanation of the structure of the Instruct V3 dataset.
- Details on the columns in the dataset: prompt, model response, and source.
- Identification of training and test data splits and composition of the dataset.
03:12 🧹 *Filtering and Selecting Data for Fine-Tuning*
- Filtering the dataset to focus on the Dolly Harmful and Harmless data.
- Insight into the Lambda function for dataset filtering.
- Selection of a subset of examples for both training and test sets.
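The filtering step boils down to a predicate over the `source` column. Below is a dependency-free sketch of the same idea; the notebook itself presumably calls `dataset.filter(...)` from the Hugging Face `datasets` library, and the column names and the `dolly_hhrlhf` source value follow the video's description, so treat them as assumptions:

```python
# Toy rows mimicking the Instruct V3 columns: prompt, response, source.
rows = [
    {"prompt": "What is 2+2?", "response": "4", "source": "dolly_hhrlhf"},
    {"prompt": "Write a haiku.", "response": "Leaves fall...", "source": "other_source"},
    {"prompt": "Translate 'hello' to French.", "response": "bonjour", "source": "dolly_hhrlhf"},
]

# The same predicate you would hand to dataset.filter(lambda example: ...).
keep_dolly = lambda example: example["source"] == "dolly_hhrlhf"

filtered = [row for row in rows if keep_dolly(row)]
subset = filtered[:2]  # analogous to dataset.select(range(2)) for a fixed-size subset
print(len(filtered))   # 2
```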
04:57 🔄 *Formatting Data for Training*
- Explanation of the desired prompt template for Mistral 7B fine-tuning.
- Introduction of a Python function to transform the dataset according to the prompt template.
- Application of the prompt template to create a structured dataset.
09:57 🛠️ *Applying Prompt Template to Dataset*
- Application of the prompt template using the create_prompt function.
- Demonstration of how the function transforms the system prompt in a training example.
- Discussion on applying the prompt template to the entire dataset using the Python map function.
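A minimal sketch of what such a `create_prompt` function could look like, assuming Mistral's `[INST] ... [/INST]` instruct template and the Instruct V3 column names (the video's exact template may differ):

```python
def create_prompt(example):
    """Collapse the prompt and response columns into a single training
    column, using an instruct-style template (an assumed format)."""
    full_prompt = (
        "<s>[INST] " + example["prompt"].strip() + " [/INST] "
        + example["response"].strip() + "</s>"
    )
    # Returning a dict lets datasets' .map() write the column back.
    return {"prompt": full_prompt}

sample = {"prompt": "What is 2+2?", "response": "4"}
print(create_prompt(sample)["prompt"])
# <s>[INST] What is 2+2? [/INST] 4</s>
```

Applying it across the whole dataset is then a single call, e.g. `instruct_tune_dataset.map(create_prompt)`.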
11:32 ⚙️ *Introduction to Gradient for Large Language Model Fine-Tuning*
- Introduction to Gradient as a platform for fine-tuning and serving large language models.
- Highlights of Gradient's features, including API serving and infrastructure management.
- Encouragement to explore Gradient for custom fine-tuning needs.
13:47 🔍 *Loading Model for Fine-Tuning with Low Precision*
- Loading the Mistral 7B model for fine-tuning.
- Explanation of using 4-bit precision for model loading to reduce VRAM usage.
- Introduction of the tokenizer for Mistral 7B.
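The 4-bit load is typically expressed as a `BitsAndBytesConfig`. The sketch below assumes the standard `transformers` + `bitsandbytes` route and the `mistralai/Mistral-7B-v0.1` checkpoint mentioned in the comments; the exact settings used in the video may differ:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize weights to 4-bit on load
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 for stability
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    quantization_config=nf4_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
tokenizer.pad_token = tokenizer.eos_token  # Mistral ships without a pad token
```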
14:57 🤖 *Examining Base Model Response to New Prompt Template*
- Testing Mistral 7B's response to the new prompt template.
- Observation of model behavior and its limitations with the provided instruction.
- Setting the stage for fine-tuning to improve model performance.
19:42 🚂 *Supervised Fine-Tuning with Low-Rank Adaptation (LoRA)*
- Explanation of the low-rank adaptation (LoRA) concept for reducing the number of trainable weights.
- Configuration of LoRA parameters and application to the Mistral 7B model.
- Detailed discussion on hyperparameters for the fine-tuning process.
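A typical LoRA setup with the `peft` library looks roughly like the following; the rank, alpha, dropout, and target modules here are illustrative assumptions, not the video's exact hyperparameters:

```python
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=8,                     # rank of the low-rank update matrices
    lora_alpha=16,           # scaling factor applied to the updates
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    # Attention projections are the usual targets for Mistral-style models.
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# Wrap the quantized base model so only the small adapter weights train.
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

The wrapped model is then handed to `trl`'s `SFTTrainer` together with the formatted dataset and the training hyperparameters discussed in this section.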
20:52 🧪 *Testing Fine-Tuned Model*
- Creation of a function to test the fine-tuned Mistral 7B model.
- Example input and model response demonstrating successful fine-tuning.
- Considerations for pushing the fine-tuned model to the Hugging Face Hub for future use.
21:47 🎓 *Recap and Future Content*
- Recap of the key steps in fine-tuning Mistral 7B on a custom dataset.
- Encouragement for feedback and suggestions for future content.
- Mention of related videos and resources for further exploration.

Made with HARPA AI

ilianos

Can you show the before and after performance of the LLM? In other words, for a given prompt, show what the response was before fine tuning and then show the response to the same prompt after fine tuning? Need to see if it makes a difference.

jrfcs

One of the best videos I've seen on fine-tuning. Just love it. Thank you.

mouhameddiallo

Best video on fine-tuning. Thank you so much.

AITbox

Hi, nice video. I have a question for you:

why are you not using the same prompt template as Mistral Instruct, in order to not confuse the model?

tommyitachi

TL;DR warning: This post is about this video's value as an instructional tool for job-seeking self-learners. I hope others find use in my comments.

At the end of your video you say, "if you want me to make content like this and go into a lot more detail of the training process, I would love to do that." I would love it too! Here's why:

Trying to keep up with AI as a self-study learner whose goal is to land a meaningful, well-paying job in the field is next to impossible without a well-defined course of action, content, and constraints, much like training a model. I am a paid subscriber/member of your YT channel and have watched many of your videos, all well done, but this one stood out for me as the perfect template for accelerated learning in today's fast-moving, ever-changing environment.

I am an instructional designer by formal training, specializing in accessible and accelerated learning, and I think this video's content (amount and type of information), the visible instruction (what and how you choose to say, do, and show on screen) and supplemental material (the notes and links) are exceptional, highly effective and efficient. It was this video that compelled me to buy you a coffee each month. You're going to need it.

CoreyAlejandro

1:45 Dataset structure
2:52 Download the dataset
5:48 Combine the prompt and model response
6:39 Create prompt function

Key points

薇季芬

Why have you chosen to generate the question alone as opposed to answering it? If I fine-tune to generate answers, what would I be missing?

vaibhavsaxena

Hi, regarding the video, may I ask one question:

when I executed this: `instruct_tune_dataset =
there is an error: TypeError: Provided `function` which is applied to all elements of table returns a variable of type <class 'str'>. Make sure provided `function` returns a variable of type `dict` (or a pyarrow table) to update the dataset or `None` if you are only interested in side effects.

so I adjusted the function create_prompt() to return `{'prompt': full_prompt}` instead, then re-ran the cell `instruct_tune_dataset = instruct_tune_dataset.map(create_prompt)`, and it works.

but why is there no problem when create_prompt() only does `return full_prompt` for the SFTTrainer? The training process succeeds.
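For readers hitting the same TypeError: `datasets`' `.map()` merges a returned dict back into the table, while `trl`'s `SFTTrainer` accepts a `formatting_func` that returns the formatted text directly, which is why the bare-string version still trains fine. A dependency-free sketch of the two return shapes (function names are illustrative, and the template is a simplified assumption):

```python
def create_prompt_for_map(example):
    # .map() requires a dict; the returned keys become (or overwrite) columns.
    return {"prompt": f"[INST] {example['prompt']} [/INST] {example['response']}"}

def create_prompt_for_sft(example):
    # SFTTrainer(formatting_func=...) consumes the string itself.
    return f"[INST] {example['prompt']} [/INST] {example['response']}"

sample = {"prompt": "What is 2+2?", "response": "4"}
print(create_prompt_for_map(sample))  # {'prompt': '[INST] What is 2+2? [/INST] 4'}
print(create_prompt_for_sft(sample))  # [INST] What is 2+2? [/INST] 4
```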

beananonymous

Thank you for the helpful content. I have a question: If we want to fine-tune for a chat model instead of instruct, should we change the training prompt to answer the question rather than generate the instruction?

unclecode

Can you go through how to fine-tune Mistral on a dataset for classification problems? Thanks.

wy

I have a question about the tokenizer used in the tutorial. Why is "mistralai/Mistral-7B-v0.1" used instead of By the way, the model itself uses Thanks.

jiehuali

Can you please explain why your structure is better than the default one, or any other structure, for that matter? Thx.

malikrumi

yes please go into a lot more detail of training/fine-tuning

samsonthomas

@engineerprompt What's the cost to train using the small sample from the video?

marcusm

Very useful! You might want to train it for at least a full epoch; otherwise you will not pass through all the training data.

jmirodg

Awesome video! Thanks for sharing!
May I ask if there will be more videos on using the Mistral model for RAG in the future?

MikewasG

Excellent video. What I am missing: after fine-tuning, how can you upload Mistral plus the fine-tuned files to your personal folder on Hugging Face? Also, how can you interact with the model plus the fine-tuned data using HF text inference?

myrulezzz

Please create a video on fine-tuning an MoE LLM using LoRA adapters, such as the Mixtral 8x7B MoE LLM.

suleimanshehu

Any reason why you did not fine-tune the base Mistral (not Instruct) model?
