Stanford CS25: V4 I Behind the Scenes of LLM Pre-training: StarCoder Use Case

May 23, 2024
Speaker: Loubna Ben Allal, Hugging Face

As large language models (LLMs) become essential to many AI products, learning to pretrain and fine-tune them is now crucial. In this talk, we will explore the intricacies of training LLMs from scratch, including lessons on scaling laws and data curation. We will then study the StarCoder use case as an example of LLMs tailored for code, highlighting how their development differs from that of standard LLMs. Additionally, we will discuss important aspects of data governance and evaluation, elements that are central to today's conversations about LLMs and AI yet frequently overshadowed by pre-training discussions.

About the speaker: Loubna Ben Allal is a Machine Learning Engineer on the Science team at Hugging Face, working on large language models for code and synthetic data generation. She is part of the core team behind the BigCode Project and has co-authored The Stack dataset and the StarCoder models for code generation. Loubna holds Master's degrees in Mathematics and Deep Learning from Ecole des Mines de Nancy and ENS Paris Saclay.

Comments

"Scrape first, filter later, opt out last" is the precise opposite of "open & responsible research". It is mass copyright violation as a matter of first principle.

vocesanticae