Llama 3 - 8B & 70B Deep Dive

Meta AI has released Llama 3 in two sizes: 8B and 70B. In this video I go through the various stats, benchmarks, and info, and show you how you can get the model running. As always, the Colab is in the description.
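If you just want to try it right away, here is a minimal sketch of loading the 8B Instruct model with Hugging Face transformers (this assumes you've accepted the license for the gated meta-llama/Meta-Llama-3-8B-Instruct repo, logged in with huggingface-cli, and have a recent transformers version that accepts chat-style pipeline input):

```python
# Minimal sketch: run Llama 3 8B Instruct via the transformers pipeline.
# Assumes the Meta license was accepted on Hugging Face and
# `huggingface-cli login` (or HF_TOKEN) is set up beforehand.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    torch_dtype=torch.bfloat16,  # ~16 GB of GPU memory for the weights
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize Llama 3 in one sentence."},
]
out = pipe(messages, max_new_tokens=128)
print(out[0]["generated_text"][-1]["content"])
```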

🕵️ Interested in building LLM Agents? Fill out the form below

👨‍💻Github:

⏱️Time Stamps:
00:00 Intro
00:35 Meta AI Blog: Llama 3
01:47 Llama 3 Model Card: 8B and 70B
04:25 Intended Use Cases
05:06 Cloud Providers available for Llama 3
05:32 Llama 3 Benchmarks
08:59 Scaling up Pre-training
09:58 Downloading Llama 3 on Hugging Face
10:21 License Conditions
12:44 Llama 3 405B Model: Sneak Peek
14:30 Code Time: Ollama
15:44 Llama 3 on Hugging Chat
16:00 Different Options on Deploying Llama 3
16:30 Llama 3 on Together AI
16:56 Llama 3 on Colab
Comments

I appreciate the factual, no-hype tone. I liked seeing your prompts as a sort of proof of research. Subscribed to bring up the quality of my feed around AI.

seespacelabs

Thanks for the excellent introduction. Can't wait to take it for a drive...

walterpark

I noticed that when I asked the model to create a story, it wrote a chapter and then, after each message, asked "Would you like me to continue with the story?" With a simple confirmation I could keep going. It worked brilliantly, and only after hitting the token limit did the story lose quality (forgetting characters, etc.). I didn't use any special prompt, so this seems to be trained behavior, and it worked awesome!

Normally, when you want to keep a story going, many other models need to be reminded, or you have to copy-paste the previous story for them to figure out that you want to continue.

venim

@samwitteveenai, I noticed you're using a custom runtime. Do you have a video tutorial on setting up a GPU capable of training Llama without using the quantized version? I configured a custom T4 on GCP to use in Colab, but it seems to be limited to 15 GB of GPU RAM.
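For context, a T4 does top out around 15-16 GB, which is tight even for holding the 8B weights in half precision. A rough sketch for checking what you actually have and falling back to 4-bit loading on small cards (the 20 GB cutoff here is illustrative, not a hard rule):

```python
# Sketch: check GPU memory and fall back to 4-bit loading on small cards.
# An 8B model needs ~16 GB just for fp16 weights, so a 15 GB T4 is tight.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

total_gb = torch.cuda.get_device_properties(0).total_memory / 1e9
print(f"GPU memory: {total_gb:.1f} GB")

quant = None
if total_gb < 20:  # illustrative cutoff: no room for fp16 weights + activations
    quant = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.float16,  # T4 has no bf16 support
    )

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    quantization_config=quant,
    device_map="auto",
)
```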

melchhepta

Do we have any idea which non-English languages are supported by Llama 3?

Recluse

Hi Sam, thanks for this one. Can you share what specs a computer would need to run Llama 3 70B locally with decent performance for multiple (~5) concurrent users?

nqaiser

It would be nice to see how this behaves with local data on local machines, for the things we need to do with our own data.

morespinach

I asked the model whether it could work completely offline, and it responded that although it can, it would lose touch with its training data and shut down. Did anyone else see this?

clvnegu

Thank you Sam, as always you were amazingly informative and interesting.
I already tried 8b-instruct-q5_K_M directly from Ollama; the chat session is terrible and the model spits out training data like a train of words.
I'll try the default one (latest) to see if anything good comes out.
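For anyone wanting to compare the two builds, a quick sketch with the ollama Python client (the tags are the ones listed on the Ollama library page; both must be pulled and the server running):

```python
# Sketch: compare the q5_K_M build against the default tag via the ollama client.
# pip install ollama; assumes `ollama serve` is running and both tags are pulled.
import ollama

for tag in ["llama3:8b-instruct-q5_K_M", "llama3:latest"]:
    resp = ollama.chat(
        model=tag,
        messages=[{"role": "user", "content": "In one sentence, what is Llama 3?"}],
    )
    print(tag, "->", resp["message"]["content"])
```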

stawils

The 70B variant fits in an RTX A6000 with bitsandbytes quantization. Yet to try the HF Chat UI, but it works well with TGI.
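For reference, a minimal sketch of that setup with the transformers bitsandbytes integration (4-bit NF4 brings the 70B weights down to very roughly 35-40 GB, which is why it fits in the A6000's 48 GB):

```python
# Sketch: load Llama 3 70B Instruct in 4-bit NF4 so the weights fit
# on a single 48 GB card like an RTX A6000.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

quant = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_id = "meta-llama/Meta-Llama-3-70B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant,
    device_map="auto",
)
```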

iainattwater

This is not really a deep dive, sadly, just more info. I was hoping to see some actual code and performance in terms of accuracy of outcomes.

morespinach

The context window is really small compared to other models.

It should be fine for a lot of tasks, but I'm still surprised there was no improvement in that regard.

theworddoner

So far, when using the Groq API with Llama 3, it seems to use JSON tool functions more easily and to understand their assignments and roles better, which then produces better-quality code/responses/tool usage.
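A minimal sketch of that pattern with the Groq Python client (OpenAI-compatible; the get_weather tool is a made-up example and the model name may have changed since):

```python
# Sketch: JSON tool/function calling against Llama 3 via the Groq API.
# The get_weather tool is illustrative, not a real Groq feature.
import json
from groq import Groq

client = Groq()  # reads GROQ_API_KEY from the environment

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="llama3-70b-8192",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
    tool_choice="auto",
)

call = resp.choices[0].message.tool_calls[0]
print(call.function.name, json.loads(call.function.arguments))
```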

drlordbasil

I have no idea what is going on; I fell off in the AI race.
I can't understand the benchmarks. What does 5-shot mean?
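For anyone else wondering: "5-shot" means the benchmark prompt contains five worked examples before the real question, so the model can pick up the task format without any fine-tuning. A toy sketch of how such a prompt is assembled:

```python
# Sketch: building a 5-shot prompt, the format behind "5-shot" benchmark scores.
examples = [
    ("2 + 2 = ?", "4"),
    ("3 + 5 = ?", "8"),
    ("7 - 4 = ?", "3"),
    ("6 * 2 = ?", "12"),
    ("9 - 1 = ?", "8"),
]
question = "5 + 9 = ?"

prompt = "".join(f"Q: {q}\nA: {a}\n\n" for q, a in examples)
prompt += f"Q: {question}\nA:"  # the model is scored on completing this line
print(prompt)
```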

NormTurtle

Does the 15 trillion tokens figure take MULTIPLE EPOCHS into account? There is confusion about this. The old Pile, for example, is only 750 billion tokens.

pensiveintrovert

Compared to Gemini 1.5's one million tokens they are very far behind, and I imagine those models must use a lot of memory, but the fact that it is open source is a great gift.

adriintoborf

A bit let down that you immediately go to Meta's instruct fine-tune and never compare base-model capabilities. This 8B rivals Mixtral 8x7B!! But moreover, developers are cheating themselves by only knowing how to use chatbots, and if nobody learns the value, then we may seriously see companies release only chatbot models in the future!! :(

DaeOh

Well, ask a question in Estonian slang and you'll see how "large" those language models are.
Indo-European vs. Uralic is the first thing that throws an LLM out of kilter. Other divergent language structures probably do too, but I'm not familiar enough with other language families to judge.

matikaevur

Chinchilla-optimal means that for a given number of tokens there is an optimal number of parameters. It does NOT mean that, vice versa, there is an optimal token count for a given parameter count; in fact there is no limit, and no amount of tokens is the maximum. This is a very deep misunderstanding, present even in the research community, and kind of annoying if you ask me.
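To put rough numbers on it: the Chinchilla rule of thumb is about 20 training tokens per parameter, and Llama 3 goes far past that, which is exactly the point that the loss can keep improving. A quick back-of-the-envelope check:

```python
# Back-of-the-envelope: Chinchilla-"optimal" tokens vs what Llama 3 actually used.
params = 8e9                     # Llama 3 8B
chinchilla_tokens = 20 * params  # ~20 tokens/param rule of thumb
actual_tokens = 15e12            # Meta reports ~15T pre-training tokens

print(f"Chinchilla-optimal: {chinchilla_tokens / 1e9:.0f}B tokens")
print(f"Actually trained:   {actual_tokens / 1e12:.0f}T tokens "
      f"({actual_tokens / chinchilla_tokens:.0f}x the 'optimal' amount)")
```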

JazevoAudiosurf

When is the not too distant future? Next Sunday A.D.?

erikjohnson