Fine-Tune a Model with MLX for Ollama

Unlock the secrets of AI model fine-tuning in this easy-to-follow guide! Learn how to:

• Customize AI responses without complex coding
• Create your own dataset for personalized results
• Fine-tune Mistral using MLX on Apple Silicon
• Implement your fine-tuned model with Ollama

Discover why fine-tuning isn't as daunting as it seems, and how you can tweak AI models to match your unique style. Perfect for beginners and those intimidated by traditional Python notebook tutorials.

Don't miss this opportunity to level up your AI skills and create models that truly understand you!

#AITutorial #MachineLearning #FineTuning #MistralAI #AppleSilicon

(They have a pretty URL because they pay at least $100 per month for Discord. You can help get more viewers to this channel so I can afford that too.)

Join this channel to get access to perks:

00:00 - AI is Amazing
00:26 - Two approaches to tweaking models
00:36 - What is fine tuning
00:57 - Why is it hard to get started
01:12 - The biggest problem
01:39 - Just 3 steps
01:52 - The hardest part
02:09 - Start with step 1
02:27 - How to figure out what to do
03:51 - My first fine tune
04:44 - What to put where
04:58 - Move on to the next step
05:31 - Huggingface login
05:48 - The mlx command
06:43 - The results
06:56 - Define the new model
07:32 - A couple of gotchas
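The chapters above compress to three steps: prepare the data, run the MLX trainer, then point Ollama at the result. For the "Define the new model" step, here is a minimal Ollama Modelfile sketch; the `./fused_model` path is a hypothetical directory produced after merging the trained adapters back into the base weights, and the system prompt is invented for illustration:

```
# Hypothetical path: a fused model directory produced after training
FROM ./fused_model
SYSTEM You are an assistant that answers in my personal style.
```

Something like `ollama create my-tuned-model -f Modelfile` then registers it, and `ollama run my-tuned-model` runs it.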
Comments

I have been doing tech for 20 years. You, Sir, are an excellent teacher, pointing to the documentation and providing us pointers to nice tools to stitch it all together in a comprehensive way.

GeertTheys

I love the way Matt explains things in a way that is both detailed and yet really easy to understand. Thank you man.

umutcelenli

Nicely done! It's worth noting that what Matt demonstrated is fine-tuning with LoRA, not full fine-tuning. Low-rank adaptation (LoRA) makes customising a model more accessible than full fine-tuning by "freezing" the original weights and training a small subset of parameters.

Full fine-tuning: adjusts all parameters; requires significant resources, but yields high-quality results.

Low-rank adaptation (LoRA): trains fewer parameters using low-rank matrices, reducing memory and compute needs while maintaining quality.

davidteren
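The split described in the comment above has a compact form: LoRA freezes the pretrained weight matrix $W$ and learns only a low-rank update, so for a $d \times k$ layer with rank $r$:

```latex
W' = W + \frac{\alpha}{r} BA, \qquad B \in \mathbb{R}^{d \times r},\ A \in \mathbb{R}^{r \times k},\ r \ll \min(d, k)
```

Only $B$ and $A$ are trained, i.e. $r(d+k)$ parameters instead of $dk$, which is where the memory and compute savings come from.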

I'm glad that Ollama has come so far and is creating standards for open-source LLMs, like its Dockerfile-like specification files and so on.

eck

Thanks, great breakdown of the process!

A note about JSONL not being an array: it can be processed by old-school Unix tools like awk, grep, and sed, and used in streaming with Unix pipes, where lines are the delimiters. Those tools don't do well with JSON array syntax on large datasets.

cwvhogue
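The line-per-record property the comment above mentions is easy to demonstrate: each line of a JSONL file is a complete JSON document, so a stream processor never needs the whole file in memory. A minimal sketch (file name and record fields are hypothetical, not the video's actual dataset):

```python
import json

records = [
    {"prompt": "Who is Matt?", "completion": "A teacher who explains fine-tuning."},
    {"prompt": "What is MLX?", "completion": "Apple's machine-learning framework."},
]

# Write JSONL: one JSON object per line, no enclosing array or commas.
with open("train.jsonl", "w") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")

# Read it back one line at a time -- like grep/awk/sed, we only ever
# look at a single line, which a JSON array would not allow.
with open("train.jsonl") as f:
    parsed = [json.loads(line) for line in f]

print(len(parsed))  # prints 2
```

Because every record ends at a newline, `head -n 100 train.jsonl` or `grep Matt train.jsonl` stay valid operations, while slicing a JSON array that way would produce broken syntax.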

Sorry if unrelated, but am I the only one who thinks Matt's voice has that soothing, gentle-teacher quality? Like I can hear him narrating a NatGeo documentary.

hugogreg-hfzl

Matt, your energy is so calm. I did fine tune with MLX, but I freaked myself out with all the steps and feel like it’s hard to do again.

When you explain it so nicely, my fear goes away and I’m ready again.

You’re spot on that data prep is the “dark arts”. So true!!

tsomerville

Love it, thanks for sharing. It's great to see LLM fine-tuning become increasingly accessible to more people.

counterfeit

Thanks a lot for this tutorial Matt. It is by far the most straightforward fine tuning tutorial I have ever seen.

fabriai

I so wish you could explain what LoRAs are and how to do one. Thank you for this amazing video, I already feel much better.

y.m.o

I totally agree with you, sir; yours is the easiest way for me to learn about MLX. For the past 2 months I've been searching YouTube for all this information. Thank you so much for the video.

Ec-cbcg

Thanks. I tried MLX fine-tuning a few months ago; I think this mlx-lm approach might be more straightforward.

JunYamog

Matt, this is utterly awesome and I can't thank you enough. I'd seen the compute resources people were using and the code and gone "that's just too time and money intensive to investigate further".

Now, I just need the script from Terminator, a code interpreter and, oooh, 5 minutes?

Don't worry, I'll keep control of it...

tsarse

This is a great vid! Especially if you're at least a hobbyist.

The best complete-layperson on-ramp I've seen for fine-tuning? Cohere. And it's free. After that, dip into more of the dark arts. But to whet folks' whistle I usually point them there. It's their whole biz model. It gets old quickly, but you can whip up a trained, fine-tuned bot in half a day, depending on the dataset.

interspacer

Thanks Matt, your explanations are effective and entertaining.

If you could in a future video, would you dive into more detail about fine-tuning? E.g., why would you want to, how to choose your data, etc. Thank you!

mbottambotta

Amazing content, I will test it soon 🙏thanks!

VictorCarvalhoTavernari

Thanks, Matt—super spot-on video as usual. You raised a doubt in my mind: You mentioned that fine-tuning is not suitable for adding new information to the original LLM (perhaps I misunderstood). This leaves me a bit perplexed, and I know it’s a debated issue within the community. I agree with you that the best use of fine-tuning is to personalize the style and tone, rather than being used in the "traditional" way to train older (pre-GPT) models like BERT. However, many people argue that fine-tuning could be an alternative to RAG for injecting specific domain knowledge into the LLM. Personally, I’ve never tried fine-tuning a model due to the costs, especially with cloud-based LLMs. In any case, I think it would be valuable to explore these topics further.

My hope is that fine-tuning could become a native feature in Ollama in the future.

Lastly, it would have been useful to see the fine-tuning JSONL data (at least an example). I have my own answer to your question: why JSONL? It might be because of its line-by-line simplicity in Unix pipe scripting.

solyarisoftware
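On the point above about seeing example data: mlx-lm's LoRA training reads a data folder containing train.jsonl (and valid.jsonl), where each line is a standalone JSON object. To my understanding it accepts several record shapes, including prompt/completion pairs; the two records below are invented purely for illustration:

```jsonl
{"prompt": "Who writes this channel's scripts?", "completion": "Matt writes and narrates them."}
{"prompt": "What hardware do I need?", "completion": "A Mac with Apple Silicon, since MLX targets it."}
```

And the line-by-line guess about "why JSONL" holds up: each record is independently parseable, so the trainer (or any Unix pipe) can stream the file without loading an enclosing array.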

Thanks for the great video! Could you use an LLM to generate question-answer pairs for the dataset out of basic text or documents? I'd be interested in such a video!

mojitoism

Thanks a lot for your fantastic videos. I'm actually using Unsloth to fine-tune Llama 3 for a text classification task. I'd be happy if you uploaded a video for such purposes.

bigbena

dark arts, lol. Love your vids, man.

myronkoch