Fine-Tuning Mistral 7B using QLoRA and PEFT on Unstructured Scraped Text Data | Making it Evil?

#llm #generativeai #machinelearning
Can you train new or forbidden knowledge into an LLM? Let's find out as I throw 1 gigabyte of scraped, cleaned, plaintext KiwiFarms posts at Mistral 7B. I go over my experience fine-tuning Mistral 7B on a few large datasets of scraped text, including English-language song lyrics and a huge KiwiFarms post dataset.
Training script and video resources are linked below; rough sketches of the extraction, training, and merge steps follow the chapter list.
[00:00] Introduction/Topics
[02:00] Tools for Bulk Text Extraction
[02:45] Model Choice: Mistral 7B
[03:20] Fine-tuning using QLoRA
[04:10] Discussing the linked article; comparing/contrasting with my training experiences
[06:20] Training script used
[10:45] Merge LoRA script
[11:30] Testing the model with the LM Evaluation Harness
[13:00] Esoterically evaluating the LoRAs with the WebUI / what can be expected from crude raw-text training
[15:00] Test results: Testing knowledge of Internet "celebrities"
[18:10] Test results: Song parody generation
[19:20] Memorization test
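The video covers Calibre and Unstructured IO for bulk text extraction. As a rough illustration (not the exact pipeline from the video, and the file names are placeholders), pulling plain text out of a pile of scraped HTML with Unstructured looks something like this:

# Hypothetical extraction sketch using Unstructured IO; file names are
# placeholders, not the dataset from the video.
from pathlib import Path
from unstructured.partition.auto import partition

with open("posts.txt", "w", encoding="utf-8") as out:
    for page in Path("scraped_pages").glob("*.html"):
        # partition() auto-detects the file type and returns document elements
        elements = partition(filename=str(page))
        out.write("\n\n".join(el.text for el in elements if el.text))
        out.write("\n\n")

For ebook formats, Calibre's command-line converter does the same job: ebook-convert book.epub book.txt.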
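The training itself is QLoRA via PEFT and bitsandbytes. A minimal sketch of that setup, assuming the plaintext dump from the step above (this is the shape of the approach, not the exact notebook linked below):

import torch
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          BitsAndBytesConfig, DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token

# Load the base model in 4-bit NF4 -- the "Q" in QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto")
model = prepare_model_for_kbit_training(model)

# Attach trainable low-rank adapters to the attention projections.
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"]))

# Crude raw-text training: plain causal-LM loss over chunked text.
dataset = load_dataset("text", data_files="posts.txt")["train"]
dataset = dataset.map(
    lambda b: tokenizer(b["text"], truncation=True, max_length=1024),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    train_dataset=dataset,
    args=TrainingArguments(
        output_dir="lora-out", per_device_train_batch_size=1,
        gradient_accumulation_steps=16, num_train_epochs=1,
        learning_rate=2e-4, bf16=True, logging_steps=10),
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("lora-out")  # saves only the adapter weights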
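After training, the adapter gets merged back into the base weights before evaluation. The video links the original LongLoRA merge script; a generic PEFT merge (assuming the adapter directory from the sketch above) looks like:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model in fp16 (not 4-bit) so the merge is lossless.
base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1", torch_dtype=torch.float16)
model = PeftModel.from_pretrained(base, "lora-out")
merged = model.merge_and_unload()  # fold the low-rank deltas into the weights
merged.save_pretrained("mistral-7b-merged")
AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1").save_pretrained(
    "mistral-7b-merged")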
ALL LINKS AND NOTEBOOK DOWNLOAD ALSO HERE:
Jupyter Notebook
Finetuning LLMs with LoRA and QLoRA: Insights from Hundreds of Experiments by Sebastian Raschka
Can LLMs learn from a single example?
LM Evaluation Harness
Convert with Calibre
Calibre
Unstructured IO
QLoRA
PEFT
Bitsandbytes
Original LongLoRA merge script
OpenLLM Leaderboard
LM Eval Harness example command:
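(The exact command isn't preserved in this description; a typical invocation with the current lm-evaluation-harness CLI, with a hypothetical merged-model path, is:

lm_eval --model hf \
    --model_args pretrained=./mistral-7b-merged \
    --tasks hellaswag,arc_easy,winogrande \
    --device cuda:0 --batch_size 8

Older releases of the harness use python main.py with --model hf-causal instead.)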
Text Generation WebUI