Lecture 04: Data Management (FSDL 2022)

New course announcement ✨

We're teaching an in-person LLM bootcamp in the SF Bay Area on November 14, 2023. Come join us if you want to see the most up-to-date materials on building LLM-powered products and learn in a hands-on environment.

Hope to see some of you there!

In this video, we cover the data stack, from how data is stored and versioned to how it is processed and annotated.

00:00 Key points
01:18 Sources of data: filesystems, latency numbers, object stores, databases, data warehouses
10:48 Exploring data
12:08 Processing data
15:50 Feature stores
17:17 Summary of best practices and some sample datasets
20:31 Self-supervised learning and data labeling
29:52 Data versioning

Comments

Thank you for the lecture. I learn so much from these lectures by you guys.

edd

Data visualization would be a good topic to include here. Sometimes it can be really hard to visualize your labeled datasets or visualize your predictions, especially for video or 3D.

iantimmis

Why did you delete my last comment? I have already asked Sergey to accept my invitation on LinkedIn, and I assured him I won't send him messages at 3 am.
P.S. It occurred to me that maybe you deleted my previous comment because something happened to Sergey and it looked like I was a troll?

CantPickTheNameIwant