New LLM BEATS LLaMA3 - Fully Tested

Qwen2 was released and I tested the biggest and smallest versions of it.

Join My Newsletter for Regular AI Updates 👇🏼

Need AI Consulting? 📈

My Links 🔗

Media/Sponsorship Inquiries ✅

Links:

Disclosure:
I'm an investor in LMStudio
Comments
Author

Which size of Qwen 2 will you be using?

matthew_berman
Author

I suggest you retest LLaMA3. I tested it yesterday with several questions from your rubric. It aced the killers test, the marble test, and the apple test. After it passed the apple test, I threw it a twist by prompting: give me 10 sentences where the third word in each sentence is the result of mixing red and blue. It used purple as the third word in all 10 sentences.

JustinArut
Author

Protip: if you change the system prompt midway through a conversation, it'll break the output. When changing the system prompt, start a new conversation with no history.
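To illustrate the commenter's point, here is a minimal sketch of why a mid-conversation system-prompt swap can break output: the chat template flattens the whole history into one prompt, so old turns still reflect the old instructions. The template syntax and function names below are illustrative assumptions, not any specific chat library's API.

```python
def build_prompt(system, history):
    """Flatten a system prompt plus chat history into one prompt string."""
    lines = [f"<|system|> {system}"]
    for role, text in history:
        lines.append(f"<|{role}|> {text}")
    return "\n".join(lines)

# Mid-conversation swap: the history still obeys the OLD system prompt,
# so the model sees contradictory instructions in one context.
history = [("user", "Answer in French."), ("assistant", "D'accord !")]
mixed = build_prompt("Answer only in German.", history)

# Safer, as the comment suggests: start a fresh conversation with no history.
fresh = build_prompt("Answer only in German.", [])
```

In the "mixed" case the new system line and the old turns conflict inside a single context window, which is the breakage the comment describes; the "fresh" case avoids it.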

sckpuppt
Author

The larger model's method of getting every sentence to end with "apple" is pretty funny if you take a closer look 😂

meh
Author

Apple fooled Matt... it's at the end of each sentence, but in a childish way!

hotbit
Author

Always excellent reviews from your channel. Thank you.

TroyDoesAI
Author

Congrats on the new Dell Precision 5860 Tower! I had kind of given up on seeing LLM reviews run on Windows on your channel; I hope that's a worry of the past. Post the full specs, if you would, to give us an idea of the performance of these models on this machine!

RWS
Author

I'm more impressed by the 0.5B model than the large one. If you compare it with previous models, these answers are amazing.

jamesjonnes
Author

The small model gives such bad answers, it's actually funny

zenn
Author

MBPP -- Mostly Basic Python Programs (a code-generation benchmark).
A14B -- it is a 57-billion-parameter Mixture-of-Experts LLM that activates only 14 billion parameters per token (hence the "A14B" in the name).

zeloguy
Author

Will the uncensored version know about Tiananmen?

patrickwasp
Author

57B-A14B means 57B total parameters (which determines how much memory you need) and 14B active parameters (which determines how fast it runs). In memory requirements and speed, it would be roughly comparable to Mixtral 8x7B.
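The distinction in this comment can be sketched with back-of-envelope arithmetic: all 57B weights must be loaded, but only 14B participate in each forward pass. This is a rough estimate assuming fp16/bf16 weights (2 bytes per parameter), ignoring KV cache and activation memory.

```python
BYTES_PER_PARAM = 2  # assumed fp16/bf16 precision

def weight_memory_gb(total_params_billions):
    """Approximate weight memory in GB for a model with N billion parameters."""
    return total_params_billions * 1e9 * BYTES_PER_PARAM / 1e9

total_b, active_b = 57, 14
print(f"Weights to load: ~{weight_memory_gb(total_b):.0f} GB")  # all experts
print(f"Compute per token: ~{active_b}B params")  # only the routed experts
```

So the MoE needs roughly the VRAM of a dense 57B model (~114 GB at fp16) but generates tokens at roughly dense-14B speed, which is the memory-vs-speed tradeoff the comment describes.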

..
Author

Hey Matthew.
I have noticed that these models format their responses in LaTeX, and it is correct LaTeX code. Could it be a setting in your GUI that you can edit to improve the response?

I believe this is a defect of your GUI, not the model.

mbalireshawal
Author

This was the best answer to the shirt problem so far.

edsonjr
Author

Thanks, I tried it, it's awesome ❤

AliAlias
Author

Thank you, Matthew. I wish you could have tested the 7B and/or the MoE 57B, since those can be run locally by many users.

abdelhakkhalil
Author

I still maintain that the marble-in-the-glass question needs a clarifying description, something like "a glass with an open top". As humans, we assume the word "glass" means a drinking container that is open on top, but I don't think you can assume a language model will default to that.

davidlavin
Author

I'm going to take a wild guess and say that most of your followers don't have a GPU that can run the 72b model at a reasonable speed, and the 0.5b model is so bad as to not be very useful. I would prioritize the 7b model since more people are likely to use that one.

dr_harrington
Author

15:45 It follows instructions very well, even if it's incapable of forming a proper sentence with "apple" at the end.

homematvej
Author

8:53 - when someone told you the answer and your teacher said "explain your working out" lol

tonyppe