Is Meta’s new AI really open source?

Facebook/Meta have been big open source proponents for a while, but their investment in open source AI is unrivaled. Super excited by what they're cooking, even if "open source" might not be the best term. Something something Llama 3.1

S/O Ph4se0n3 for the awesome edit 🙏
Comments

My explanation of the "model size" was not great. I liked this more concise description from @essamal-mansoury2689:

"Let's say you have a line equation (f(x) = mx + b). The x is the input; the m and b are the weights and biases. Only two weights in my example. 405 billion in their case."

tdotgg

405b is the number of weights, not the amount of data it was trained against. For example, let's say you have a line equation (f(x) = mx + b). The x is the input, the m and b are the weights and biases. Only two weights in my example. 405 billion in their case.
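
To make that concrete, here's a minimal Python sketch (a toy illustration, not anything from Meta's code): a two-parameter linear model trained on many examples, showing that the parameter count stays fixed no matter how much data you train on.

```python
# Toy two-parameter "model": m is the weight, b is the bias.
def predict(x, m, b):
    return m * x + b

# One gradient-descent step on mean squared error over the whole dataset.
def train_step(data, m, b, lr=0.01):
    grad_m = sum(2 * (predict(x, m, b) - y) * x for x, y in data) / len(data)
    grad_b = sum(2 * (predict(x, m, b) - y) for x, y in data) / len(data)
    return m - lr * grad_m, b - lr * grad_b

m, b = 0.0, 0.0
data = [(x, 3 * x + 1) for x in range(10)]  # 10 training examples, still only 2 weights
for _ in range(5000):
    m, b = train_step(data, m, b)
print(m, b)  # approaches 3 and 1; Llama 3.1 405B has 405 billion such numbers
```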

essamal-mansouri

I think this video highlights how influencers talk about subjects as if they understand them, but don't.

nuttygold

That's very much not what the "billion numbers" mean. It means how many parameters are in the neural net. It was trained on far more data than that.

zanfur

The source code used to train the model is also open; only the dataset used is not available to the public. It's also more accurate to call the models open weight than anything else, because Meta's licensing terms only allow commercial use up to a certain threshold.

dungeon

Llama's source code is actually available, just a few hundred lines of PyTorch; the data it was trained on is the part that's closed.
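
For anyone curious, a minimal sketch of loading the open weights via the Hugging Face transformers library (the model id and access details are assumptions; the 8B variant stands in for 405B, which won't fit on consumer hardware):

```python
# Minimal sketch: the weights are downloadable, the training data is not.
# Assumes `transformers` is installed and you've accepted Meta's license
# for the gated repo on the Hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3.1-8B"  # 8B stands in for 405B here
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Open weights are not the same as open source because", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```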

MommysGoodPuppy

Totally agree: what Meta is doing is not open source. We also still have no idea what datasets they pulled from.
These companies really do enjoy trying to twist the term into anything that suits them. Had me excited for a bit, until I found out it's just the model data.

Really not doing any of us any favors. They take all this university work, steal it under the guise that they'll be open with the results, lie, and then CLAIM it's open source.
Which is weird, because aren't they talking about companies being able to train their own models? It looks like the only way you can create a model right now is through Hugging Face's web portal, which sort of defeats the entire purpose of being off-grid.
It's open model, not open source.

infinitivez

theo's slowly entering the world of practical comedy here

RajarshiKhatua

It's not totally open source, but you can't deny it's better than *open* AI.

ugotisa

"Genuine" and Zuck in one sentence is a huge red flag, imo. If Theo ever uses "sincere" and Zuck in one sentence I'll start reconsidering my life choices...

vsolyomi

The B is the model size (the number of weights), not the amount of data used for training.

canofpulp

Even if you had the source used to create the model, you still wouldn't have the data, which Meta's own description calls crucial to forming the model (the cleaned data). However, the release does include the weights, and the model isn't "agentified" the way some others are (e.g. Mistral). That lets you do your own fine-tuning and agentification with a lower incidence of loss.
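
A hedged sketch of what "your own fine-tuning" on open weights can look like, using LoRA adapters via the peft library (model id and hyperparameters are illustrative assumptions, not a recipe from Meta):

```python
# Illustrative only: attach small trainable LoRA adapters to frozen open weights.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3.1-8B")
config = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the tiny adapters train; the base stays frozen
```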

nicosilva

4:45 the number is NOT the number of tokens... it's the number of parameters. Those two are very separate things.
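
To illustrate the distinction (a toy PyTorch example, not Llama's code): parameters are the numbers stored inside the network, fixed by its architecture, while tokens are units of training text.

```python
import torch.nn as nn

model = nn.Linear(4096, 4096)  # one layer: 4096*4096 weights + 4096 biases
n_params = sum(p.numel() for p in model.parameters())
print(n_params)  # 16_781_312 parameters, no matter how many tokens you train on
```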

television

Yea you should probably learn a bit more about this before releasing a video on it. Some pretty glaring mistakes in this one.

noone-ldpt

Now it makes perfect sense why Nvidia has been holding its low and mid-tier graphics cards to 8GB of VRAM, even though it would cost them $20-50 maximum to double that.
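
Rough back-of-the-envelope math on why the VRAM ceiling matters here (the figures are loose approximations just to show the scale):

```python
# Approximate VRAM needed just to hold model weights (excludes activations, KV cache, etc.).
def weight_vram_gb(n_params: float, bytes_per_param: float) -> float:
    return n_params * bytes_per_param / 1024**3

print(weight_vram_gb(8e9, 2.0))    # ~14.9 GB: an 8B model in fp16 already overflows an 8GB card
print(weight_vram_gb(8e9, 0.5))    # ~3.7 GB: 4-bit quantization squeezes it under 8GB
print(weight_vram_gb(405e9, 2.0))  # ~754 GB: 405B is firmly datacenter territory
```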

vorpled

Bryan "he who shall be <censored>" Lunduke had a great vid/article about this: the OSI themselves are pushing a not-very-open model for AI. Sure, the source code to build an AI is open source, but that's all. The important part (i.e. which creators have had their work stolen to build the dataset) is most definitely not included. There's a lot of money involved in this, so don't expect any of the open source foundations to care about open source anymore.

gbjbaanb

Hey Theo, thanks! I love how you broke down this post; that's super cool.

josechristianromero

I'm not listening to someone's AI takes when they don't even understand the title of the model. (I am listening though, but only for hate-watching.)

comradepeter

Funny how the thumbnail says "misleading" while the 405B explanation is straight-up wrong HAHAHAHAHAHAHAHA

xmassive

@t3dotgg, great post and analysis. I enjoyed your boiled-down description of an LLM. I also like to think that training on a data set is similar to compression technology. You feed a model a lot of data, and you're left with a weighted neural net that represents that data but is much smaller than the training data.
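
Running with the compression analogy, a rough scale comparison (the 15-trillion-token figure is the training size Meta reported; the byte counts are loose assumptions):

```python
tokens = 15e12            # Meta reported training on the order of 15T tokens
bytes_per_token = 4       # assume ~4 bytes of raw text per token
params = 405e9
bytes_per_param = 2       # bf16 weights

training_text_tb = tokens * bytes_per_token / 1e12  # ~60 TB of text
weights_tb = params * bytes_per_param / 1e12        # ~0.81 TB of weights
print(training_text_tb / weights_tb)  # the "compressed" model is roughly 74x smaller
```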

ElevateConsultingDave