The CRAZIEST LLM Fine-Tuning I've seen, And It WORKS!!!

Mistral AI Hackathon winners fine-tuned Mistral 7B to play Doom.

Imo, it's the craziest, most innovative LLM fine-tune I've ever seen.

This video dives into the building of Mistral dooM!

🔗 Links 🔗

Ref 3 -

❤️ If you want to support the channel ❤️
Support here:

🧭 Follow me on 🧭
Comments

It is a meme: We reached AGI, but does it run Doom?

bourdainedepiment

"If you are very old at this point." I take offense to this statement. I'm 45...I didn't think I was old...until now.

king

Hmm. Looks like a fantastic test case for Groq’s increased token output.

pstefan

So cool! I had the idea to do it with Cataclysm: Dark Days Ahead! I'm sure many people have had this idea, but this is the first time I've seen someone do it.

Imagine replacing the NPCs in Cataclysm or Dwarf Fortress with LLMs :) That's what I would like to see next...

sinitarium

This means we can make every LLM multimodal.

thepabli

We are truly living in the most interesting of times indeed XD
Getting a model to play and survive in Dwarf Fortress is obvs the next logical step, especially considering the base game was ASCII-only.

CYIERPUNK

Yea, that is pretty deep. Thanks for the video.

marcfruchtman

Damn. I was going to write a paper on this, and I had started setting up. Didn't know if it would actually work.

thedoctor

"if you are ... very old..." ...

bstbuddy

I remember getting “Doom” confused with “Dune” when I was growing up.

lun

I think the ASCII representation is an overhead map state, not the frame itself. That way it's easy for the LLM to move, because it understands where it is in the level space at every turn.
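The overhead-map idea above can be sketched like this; the tile encoding, names, and grid are illustrative assumptions, not the project's actual format:

```python
# Hypothetical sketch: serialize a top-down level grid to ASCII for an LLM prompt.
TILE_CHARS = {0: ".", 1: "#", 2: "D"}  # floor, wall, door (made-up encoding)

def map_to_ascii(grid, player_xy):
    """Render a 2D grid of tile ids as ASCII, marking the player with '@'."""
    px, py = player_xy
    rows = []
    for y, row in enumerate(grid):
        chars = [TILE_CHARS[t] for t in row]
        if y == py:
            chars[px] = "@"  # overwrite the player's tile with the marker
        rows.append("".join(chars))
    return "\n".join(rows)

grid = [
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 0, 2, 1],
    [1, 1, 1, 1],
]
print(map_to_ascii(grid, (1, 1)))
# ####
# #@.#
# #.D#
# ####
```

A map state like this is regenerated and sent to the model on every turn, so the model always "sees" its position relative to walls and doors.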

s_the_first

I made a dataset that trains the LLM to think before speaking, and it works amazingly with Mistral 0.2. I made another one, inspired by this, where the AI is encouraged to make a "mental image" in ASCII art for each prompt.

spencerfunk

So, this method calls for some ideas :))

muhammadrezahaghiri

It's like giving someone sight through electrodes on their tongue!

soccerkenshin

Man... Where do people get these ideas? 😅

faaz

Ok, legit question: how the hell do you deal with tokenizers?
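A fair concern: subword tokenizers can fragment ASCII art badly. A toy illustration of the problem and one mitigation (this is not a real tokenizer; the one-token-per-character worst case is a simplifying assumption, and run-length encoding is just one compression option):

```python
# Toy illustration: why raw ASCII frames are token-hungry, and how
# run-length encoding can shrink them for sparse scenes.

def worst_case_tokens(frame: str) -> int:
    """Pessimistic upper bound: one token per character, newlines included."""
    return len(frame)

def run_length_encode(frame: str) -> str:
    """Compress runs of identical characters, e.g. '#######' -> '#7'."""
    out, i = [], 0
    while i < len(frame):
        j = i
        while j < len(frame) and frame[j] == frame[i]:
            j += 1
        run = j - i
        out.append(frame[i] if run == 1 else f"{frame[i]}{run}")
        i = j
    return "".join(out)

frame = "#" * 16 + "\n" + "#" + "." * 14 + "#" + "\n" + "#" * 16
print(worst_case_tokens(frame))        # 50
print(repr(run_length_encode(frame)))
```

In practice you would also want a character set that the model's tokenizer keeps as single tokens, but that depends on the specific tokenizer's vocabulary.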

BhaswataChoudhury

It's not news; people have been generating images with LLMs since GPT-2. And the model has at least seen ASCII and base64 images in its internet-scraped training data.

timmygilbert

I wonder if it simply maps out the layout based on the limited set of options, and also fills in the gaps the way AI image enhancers do.

ThankYouESM

If it exists, Doom will be run on it, or now by it, I guess. Things have become very strange.

SiCSpiT

Why does this work? Because LLMs are predictive models: they fine-tuned it on 2048 characters paired with an action, so the model sees text and predicts an action.
Now the challenge would be to increase the resolution and the set of characters used without exploding the context window.
I remember reading somewhere that LLMs can handle compressed text with a dictionary.
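The pairing described above can be sketched as a fine-tuning record; everything here (field names, the action set, the JSONL shape) is a guess for illustration, not the hackathon winners' actual format:

```python
# Hypothetical sketch of a (state, action) fine-tuning record:
# the prompt is an ASCII frame, the completion is a single action label.
import json

ACTIONS = ["FORWARD", "BACK", "TURN_LEFT", "TURN_RIGHT", "FIRE"]

def make_example(ascii_frame: str, action: str) -> str:
    """One JSONL training record: frame in, action out."""
    assert action in ACTIONS, f"unknown action: {action}"
    return json.dumps({
        "prompt": ascii_frame + "\nAction:",
        "completion": " " + action,
    })

frame = "####\n#@.#\n####"
print(make_example(frame, "FORWARD"))
```

With enough such pairs, next-token prediction over the completion is exactly "see a frame, predict an action", which is why no architectural change is needed.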

vincentvoillot