WHY AND HOW OF SCALING LARGE LANGUAGE MODELS | NICHOLAS JOSEPH

Anthropic is an AI safety and research company that’s working to build reliable, interpretable, and steerable AI systems. Over the past decade, the amount of compute used for the largest training runs has increased at an exponential pace. We've also seen in many domains that larger models attain better performance, following precise scaling laws. The compute needed to train these models can only be attained by coordinating many machines that exchange data with one another. In this talk, Nicholas Joseph (Technical Staff, Anthropic) goes through why and how they can scale up training runs to use these machines efficiently.
Comments
Author

Underrated video. Appreciated the overview of the power-law scaling curves.

jordanburgess
Author

Note that the graph at 3:04 is outdated: the 2022 Chinchilla paper found that, as the compute budget grows, the number of model parameters and the amount of training data should be scaled at equal rates.

stephenmcaleese
Author

Did you observe these same scaling laws when you trained and tested RL agents (of various types, including multi-objective agents)? Or do such laws appear only in the specific case of LLMs?

mikiallen
Author

Hello PyTorch,
Sorry to bother you. May I have your permission to repost your videos on a Chinese video platform called Bilibili? Since YouTube is blocked in China, it is quite hard for people there to watch it (they need a VPN), and many Chinese viewers do not speak fluent English. There are plenty of people in China who want to learn about this topic, and I'm certain that many of them will like your videos. I will make sure to ALWAYS include your channel's name and the video link, and to credit you as the original author in the video description. In any case, I'm looking forward to your new videos. Have a good day!
Thank you
Alin

茵林-cs
Author

The vocal fry is so distracting from the content. I can't wait until this speech trend is over. I literally had to watch this with the sound off and captions turned on. Maybe someone will make an AI audio filter that removes vocal fry; that would be sweet.

foo_tube