Stanford CS25: V4 I Behind the Scenes of LLM Pre-training: StarCoder Use Case
May 23, 2024
Speaker: Loubna Ben Allal, Hugging Face
As large language models (LLMs) become essential to many AI products, learning to pretrain and fine-tune them is now crucial. In this talk, we will explore the intricacies of training LLMs from scratch, including lessons on scaling laws and data curation. We will then study StarCoder as an example of an LLM tailored for code, highlighting how its development differs from that of standard LLMs. Finally, we will discuss data governance and evaluation, elements that are central to today's conversations about LLMs and AI yet frequently overshadowed by pre-training discussions.
About the speaker: Loubna Ben Allal is a Machine Learning Engineer on the Science team at Hugging Face, working on large language models for code and synthetic data generation. She is part of the core team behind the BigCode project and has co-authored The Stack dataset and the StarCoder models for code generation. Loubna holds master's degrees in Mathematics and Deep Learning from Ecole des Mines de Nancy and ENS Paris Saclay.
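For readers who want to experiment with the released artifacts, below is a minimal, hypothetical sketch (not shown in the talk itself) of generating code with a StarCoder checkpoint through the Hugging Face transformers library. The model ID, prompt, and decoding settings are illustrative assumptions, and the gated checkpoint requires accepting its license on the Hub.

from transformers import AutoTokenizer, AutoModelForCausalLM

# Illustrative checkpoint; other StarCoder-family models on the Hub load the same way.
model_id = "bigcode/starcoder"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# A short code prompt that the model continues as a left-to-right completion.
prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt")

# Greedy decoding of up to 64 new tokens; sampling options are omitted for brevity.
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))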
Stanford CS25: V4 I Overview of Transformers
Stanford CS25: V4 I Hyung Won Chung of OpenAI
Stanford CS25: V4 I Jason Wei & Hyung Won Chung of OpenAI
Stanford CS25: V4 I Aligning Open Language Models
Stanford CS25: V4 I Demystifying Mixtral of Experts
Stanford CS25: V4 I From Large Language Models to Large Multimodal Models
Stanford CS25: V4 I Transformers that Transform Well Enough to Support Near-Shallow Architectures
Stanford CS25: V3 I Retrieval Augmented Language Models
Stanford CS25: V3 I How I Learned to Stop Worrying and Love the Transformer
Stanford CS25: V3 I Beyond LLMs: Agents, Emergent Abilities, Intermediate-Guided Reasoning, BabyLM
Which jobs will AI replace first? #openai #samaltman #ai
[VIET] Stanford CS25: V4 I Overview of Transformers - Part 1 (Vietnamese dubbed version)
Stanford CS25: V1 I Transformers United: DL Models that have revolutionized NLP, CV, RL
Stanford CS25: V1 I Mixture of Experts (MoE) paradigm and the Switch Transformer
Stanford CS25: V1 I Self Attention and Non-parametric transformers (NPTs)
The Possibilities of AI [Entire Talk] - Sam Altman (OpenAI)
Secrets of AI: Insights from OpenAI at Stanford CS25 #IA #STANFORD #XMACNA #shorts
How I'd learn ML in 2025 (if I could start over)
Stanford CS25: V1 I DeepMind's Perceiver and Perceiver IO: new data family architecture
CS25 Transformers United 2023: Introduction to Transformers w/ Andrej Karpathy (clear voice)
Stanford CS25: V1 I Transformers in Vision: Tackling problems in Computer Vision
Andrew Ng: Advice on Getting Started in Deep Learning | AI Podcast Clips