Mistral 8x7B Part 2 - Mixtral Updates


Try the model

For more tutorials on using LLMs and building Agents, check out my Patreon:

My Links:

GitHub:
Comments

This is amazing! I can't wait to give the new version a try. Thanks for sharing this update! You're truly one of the most valuable YouTube channels when it comes to discussing these models because your videos have a lot of technical depth.

DoktorUde

Thank you, great explanation, very helpful!

TailorJohnson-ly

It's a marvelous architecture. I was on the fence since it didn't seem to benefit people who run locally, but after that explanation I am officially hyped. I wonder why the earlier MoE models weren't well received at all if it's the same architecture; Mistral definitely has the secret sauce. And releasing the code for vLLM is just the chad move. Thank you for the video. Can't wait to try this on my machine and even finetune it soon.

impactframes

A 👍 for Mistral. It will definitely be finetuned; we have already seen many high-performing LLMs finetuned from Mistral 7B, for example Zephyr 7B, Notus, etc. I see that the context size is 32K? Look at Linux, a very successful open-source OS: in the LLM world we will also see open-source LLMs become as good as or surpass the closed LLMs from big companies like ClosedAI, Anthropic, Google, Microsoft...

henkhbit

Looks like TheBloke just uploaded GGUF quantizations of the 8x model. It's 18GB for 2-bit and needs a patched llama.cpp.

Edit: With the latest Mixtral branch of llama.cpp the 2-bit model is loading, slowly...

terbospeed
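
If you want to try those files, here is a minimal sketch of loading one of TheBloke's Mixtral GGUF quantizations with llama-cpp-python. It assumes a llama.cpp / llama-cpp-python build recent enough to include the Mixtral support mentioned above, and the model path below is just a placeholder for whichever quantization you download.

# Minimal sketch: load a quantized Mixtral GGUF with llama-cpp-python.
# The file name is a placeholder; any of the GGUF quantizations should work.
from llama_cpp import Llama

llm = Llama(
    model_path="./mixtral-8x7b-instruct-v0.1.Q2_K.gguf",  # placeholder path
    n_ctx=4096,        # context window to allocate
    n_threads=8,       # CPU threads to use
    n_gpu_layers=0,    # 0 = run entirely on CPU; raise to offload layers to a GPU
)

out = llm("[INST] Explain mixture-of-experts in one sentence. [/INST]",
          max_tokens=128)
print(out["choices"][0]["text"])

Raising n_gpu_layers offloads part of the model to a GPU if you have one; left at 0 it runs on the CPU only.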

Performance that rivals GPT-3.5 with around 80GB of VRAM. Now private companies can run this in their own offices without worrying about their data safety.

nufh
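
As a rough back-of-the-envelope on that memory figure, treating the commonly quoted ~46.7B total parameter count for Mixtral 8x7B as an assumption:

# Rough weight-memory estimate for Mixtral 8x7B; the ~46.7B total
# parameter count is an assumption taken from Mistral's announcement.
TOTAL_PARAMS = 46.7e9

for name, bits in [("fp16", 16), ("int8", 8), ("4-bit", 4)]:
    gb = TOTAL_PARAMS * bits / 8 / 1e9
    print(f"{name}: ~{gb:.0f} GB of weights")

# fp16: ~93 GB  -> roughly two 80 GB cards (plus KV cache and activations)
# int8: ~47 GB  -> fits on a single 80 GB card with headroom
# 4-bit: ~23 GB -> squeezes onto a 24 GB consumer GPU

So full fp16 weights spill slightly past a single 80 GB card, while an 8-bit copy fits comfortably on one.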

What dataset did they train the model on? Is there a separate dataset for each expert?

venkateshmunagala

Running TheBloke's 5-bit quantized instruct model on CPU. A very tolerable 8 tokens per second on a 16-core AMD with DDR5 RAM.

theosalmon
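
That throughput is roughly what a bandwidth-bound estimate predicts. A rough sketch, where the ~12.9B active parameters per token, the ~5.5 effective bits per weight of a 5-bit K-quant, and the DDR5 bandwidth figure are all assumptions:

# CPU decoding of a sparse MoE model is largely memory-bandwidth bound:
# each token only reads the "active" parameters (2 of 8 experts plus the
# shared layers). All three constants below are assumptions.
ACTIVE_PARAMS = 12.9e9
BITS_PER_WEIGHT = 5.5        # roughly the effective size of a 5-bit K-quant
DDR5_BANDWIDTH_GBS = 80.0    # dual-channel DDR5, order-of-magnitude figure

gb_per_token = ACTIVE_PARAMS * BITS_PER_WEIGHT / 8 / 1e9
print(f"~{gb_per_token:.1f} GB of weights read per token")
print(f"~{DDR5_BANDWIDTH_GBS / gb_per_token:.0f} tokens/s upper bound")
# -> ~8.9 GB per token and ~9 tokens/s, so 8 tok/s is in the right ballpark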

Is there no way to use Mixtral without a GPU? Even the quantized ones?

gangs

Sir, any advice if I use it with the Japanese language?

vinsmokearifka

I can't understand how they can train this given that "we train experts and routers simultaneously." Can you explain it more clearly?

boybro
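
One way to see why the router and the experts can be trained together: the router is just another linear layer, and its softmaxed scores multiply the selected experts' outputs, so the ordinary language-modelling loss back-propagates into both in one pass. Below is a minimal PyTorch sketch of a top-2 gated feed-forward block; the sizes, SiLU expert MLPs, and toy loss are illustrative choices, not Mixtral's actual code.

import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2MoE(nn.Module):
    """Toy sparse MoE feed-forward block: a linear router picks the top-2
    of 8 experts per token, and the routing weights scale each expert's
    output, so router and experts share one computation graph."""
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                               # x: (tokens, d_model)
        logits = self.router(x)                         # (tokens, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)  # choose 2 experts per token
        weights = F.softmax(weights, dim=-1)            # normalize over the chosen 2
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                   # tokens sent to expert e in slot k
                if mask.any():
                    contrib = torch.zeros_like(x)
                    contrib[mask] = weights[mask, k].unsqueeze(-1) * expert(x[mask])
                    out = out + contrib
        return out

# One backward pass from a single loss updates router AND experts together.
moe = Top2MoE()
x = torch.randn(16, 512)
loss = moe(x).pow(2).mean()                  # stand-in for the real language-model loss
loss.backward()
print(moe.router.weight.grad.abs().sum() > 0)  # the router received gradients too

Because the router's scores appear as multiplicative factors on the expert outputs, lowering the loss also teaches the router which experts to pick; there is no separate training phase for the routing.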