Meta's new AI is dangerously good

Llama 3.1 405B review and testing. Open-source catches up to frontier models
#llama405b #ainews #ai #agi #singularity

Where to use Llama 405B

Here's my equipment, in case you're wondering:

0:00 Fighting closed AI
2:01 Scope of Llama 3.1 models
4:58 How to use
8:32 Testing
27:30 Limitations and context length
29:23 Benchmarks and performance
34:00 Truly open source?
Comments

Yes, mom, I need an NVIDIA H100 rack for my AI school project.

rch

It is good to see that small companies like Meta can catch up to large and multimillion-dollar companies like OpenAI and Anthropic.

ldx

9:49 “I am going to try and touch myself” 😂 woah there lol

joeysipos

Open-source AI is the only way to deal with Skynet from Big Tech.

I guess it's the Redemption Ark from Mark, don't you think? XD

Edit: you were right in saying that open-source solutions aren't so far behind proprietary software 👌

dariustakeda

"if I touch myself I lose the game" that's no nut November, not snake

strzaua

Claude 3.5 Sonnet is king 👑 for coding and translation. It is currently the most powerful model for language: it can write ancient Arabic poems with correct meter and rhyme, in a difficult ancient language.

ilyass-alami

18:03 I just tested this on both LLAMA 3.1 405B and GPT 4o, and both failed it. Moreover, ChatGPT gave me a completely nonsensical explanation on the follow-up question (note that I started a fresh chat for both):

GPT 4o:

Q: which is bigger 9.11 or 9.9?

GPT 4o: 9.11 is bigger than 9.9.

Q: Why?

GPT 4o: To compare 9.11 and 9.9, consider their numerical values in the context of decimal notation:

9.11 can be interpreted as 9.110.
9.9 can be interpreted as 9.900.
When comparing these numbers digit by digit:

Both numbers have 9 in the units place.
In the tenths place, both numbers have a 9.
In the hundredths place, 9.11 has a 1, while 9.9 (or 9.90) has a 0.
Since 1 is greater than 0, 9.11 is larger than 9.9.

Q: Isn't 9.9 the same as 9.90?

GPT 4o: Yes, 9.9 is the same as 9.90. Adding a zero at the end of a decimal number does not change its value. Therefore, when comparing 9.11 and 9.9 (or 9.90):

Both numbers have 9 in the units place.
Both numbers have 9 in the tenths place.
In the hundredths place, 9.11 has a 1, while 9.9 (or 9.90) has a 0.
Since 1 is greater than 0, 9.11 is indeed larger than 9.9.

Here is my conversation with LLAMA 3.1 405B:

Q: which is bigger 9.11 or 9.9?

LLAMA 3.1: 9.11 is bigger than 9.9.
Here's a comparison:
9.9 = 9 and 9/10
9.11 = 9 and 11/100
Since 11/100 is greater than 9/10, 9.11 is the larger number.

Q: Isn't 9.9 the same thing as 9.90?

LLAMA 3.1: 9.9 is indeed the same as 9.90. I apologize for the mistake earlier.
So, comparing 9.90 and 9.11, it's clear that 9.90 is actually larger than 9.11.
Thank you for catching that mistake and I'm glad we could clarify this together!

Note how LLAMA 3.1 took the hint and corrected the answer while GPT 4o completely failed.

MykolaDolgalov
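For reference, the comparison both models fumbled is trivial to check mechanically. A minimal Python sketch using only the standard library's `Decimal`:

```python
from decimal import Decimal

# Exact decimal comparison: Decimal treats 9.9 and 9.90 as equal values,
# so the digit-by-digit trap both models fell into never arises.
a, b = Decimal("9.11"), Decimal("9.9")
print(a > b)                          # False: 9.11 < 9.90
print(Decimal("9.9") == Decimal("9.90"))  # True
print(max(a, b))                      # 9.9
```

Padding 9.9 to 9.90 (the same number of decimal places) makes the comparison unambiguous, which is exactly the step both models got wrong.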

Just a heads up that your tests are invalid since you're using the same conversation for each additional test. This pollutes the LLM context and doesn't reflect the way these are typically used.

A more accurate validation approach would be to start a new conversation for each test.

dansadventures

Thank you for removing the fluff and noise about Llama 405B and showing us the reality. Really useful comparisons between the top AI offerings.

freeelectron

Truly an amazing time. Now if we would stop competing toward a futile effort of world domination, we could all COOPERATE toward a utopia.

NakedSageAstrology

I can't believe Facebook is the one to release an open-source AI model.

daedaluxe

Man, we need more people like you; your channel and information are so valuable <3 Thanks for all your hard work. Please don't stop.

Uh, also, I noticed that creating a "RushHour" game in Python is more challenging for AIs than Tetris or Snake. It seems that writing the BFS logic is hard, as is keeping cars from overlapping or passing through each other.

ElectronicBoxOfficial
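A solver for a game like Rush Hour boils down to breadth-first search over board states. As a rough illustration of the BFS logic the commenter mentions, here is a minimal sketch over a toy state graph (the graph `g` is a made-up example, not an actual Rush Hour board):

```python
from collections import deque

def bfs_shortest_path(graph, start, goal):
    """Breadth-first search: returns the shortest path as a list of nodes."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in graph.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None  # goal unreachable

# Toy state graph; in Rush Hour each node would be a board position
# and each edge a legal (non-overlapping) car move.
g = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"]}
print(bfs_shortest_path(g, "A", "E"))  # ['A', 'B', 'D', 'E']
```

The hard part an AI has to get right for Rush Hour is generating only legal neighbour states, i.e. rejecting moves where cars overlap.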

I like your narrative style. It's crisp, clear and succinct.

glomerol

12:50 Yes, the input is converted to tokens which are then converted to vectors. In the process, the word's original letters get lost, only "meaning" remains, so the LLM can't count letters very well. It is indeed a fundamental flaw. I suspect you could add some module to the system that preserves the original text.

StefanReich
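Counting letters is trivial once you operate on raw characters instead of tokens, which is the commenter's point. A one-function Python illustration (the helper name `count_letter` is ours):

```python
def count_letter(text: str, letter: str) -> int:
    # Operating on raw characters, not tokens, makes this trivial.
    return sum(1 for ch in text.lower() if ch == letter.lower())

print(count_letter("strawberry", "r"))  # 3
```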

Unless I’m mistaken, a lot of what you’ve shown in the video is essentially ChatGPT being better than Meta’s 405B at almost everything. I thought the title of the video was supposed to imply that Meta’s model was revolutionary, but it looks like it falls behind quite a lot.

exactspace

This multimodal 4o thingy is no more than marketing right now.

eSKAone-

This is a solid AI news channel and probably my favorite. It even has adorable things like pop-up memes for the 15-25 year olds that keep their attention when the grownups are using too many words. That's a nice touch, like at a restaurant when they give the kids crayons.

endoflevelboss

Love the vid as always. Are you going to create a video about TensorBoard and how to use it in Mangio-RVC?
I just can't get it to work, and I know that your tutorials always work!

StoryTimeFromRedditEveryHours

Can you do a tutorial on Llama 3.1 running locally via the Ollama client?!
I think you would do a better one than any I've seen. Love the way you edit a video :)
Thanks, the video was a treat to see in the notifications, man! <3

TuxedoMaskMusic

12:30 ‘for some reason’ = tokenisation. LLMs tokenise text into larger chunks than individual characters, and these are then embedded into a large vector space. The ‘count the number of rs in strawberry’ task is asking it to remember every token, its association with word level structures, and query them crisply in the transformer process. If ‘rr’ is a token (two byte tokens are common), it has to have learnt during training that ‘rr’ has two rs.

Tl;dr, LLMs don’t see the raw text to count, but an embedded version of a tokenisation. This is much more complicated to work with for this task.

productjoe
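To make the token-versus-character distinction concrete, here is a toy greedy tokenizer with a made-up vocabulary (real BPE vocabularies are learned from data, but the effect on letter counting is the same):

```python
# Toy greedy tokenizer with a hypothetical vocabulary; real BPE merges
# are learned, but the upshot is identical: "rr" becomes one opaque unit.
VOCAB = ["straw", "berry", "rr", "st", "aw", "b", "e", "r", "y"]

def toy_tokenize(text):
    """Greedy longest-match segmentation over the toy vocabulary."""
    tokens, i = [], 0
    while i < len(text):
        match = max((t for t in VOCAB if text.startswith(t, i)),
                    key=len, default=text[i])
        tokens.append(match)
        i += len(match)
    return tokens

print(toy_tokenize("strawberry"))  # ['straw', 'berry']
# The letter-level count is easy on raw text...
print("strawberry".count("r"))     # 3
# ...but in the token view ['straw', 'berry'] the three r's are hidden
# inside two opaque symbols, which is roughly what the model "sees".
```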