Netflix's Recommendation ML Pipeline Using Apache Spark: Spark Summit East talk by DB Tsai

Netflix is the world’s largest streaming service, with 80 million members in over 190 countries. Netflix uses machine learning to inform nearly every aspect of the product, from the recommendations you get, to the box art you see, to the decisions about which TV shows and movies are created.
Given this scale, we use Apache Spark as the engine of our recommendation pipeline. Apache Spark lets Netflix use a single, unified framework and API for ETL, feature generation, model training, and validation. With the pipeline framework in Spark ML, each step of the Netflix recommendation pipeline (e.g. label generation, feature encoding, model training, model evaluation) is encapsulated as a Transformer, Estimator, or Evaluator, which gives us modularity, composability, and testability. Netflix engineers can therefore build their own feature engineering logic as Transformers, learning algorithms as Estimators, and customized metrics as Evaluators. With these building blocks, we can more easily experiment with new pipelines and rapidly deploy them to production.
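The Transformer / Estimator / Evaluator split described above can be illustrated with a minimal plain-Python sketch. It deliberately does not use the Spark ML API, and it is not Netflix's implementation: the stage names (LabelGenerator, FeatureEncoder, ThresholdEstimator, AccuracyEvaluator), the columns `watched_fraction` and `genre_affinity`, and the labeling rule are all hypothetical, chosen only to show how the pieces compose.

```python
from typing import Dict, List

Row = Dict[str, float]


class Transformer:
    """Maps rows to new rows; holds no logic that needs training."""
    def transform(self, rows: List[Row]) -> List[Row]:
        raise NotImplementedError


class Estimator:
    """Learns from rows and returns a fitted Transformer (the model)."""
    def fit(self, rows: List[Row]) -> Transformer:
        raise NotImplementedError


class Evaluator:
    """Reduces scored rows to a single metric."""
    def evaluate(self, rows: List[Row]) -> float:
        raise NotImplementedError


class LabelGenerator(Transformer):
    # Hypothetical rule: positive if at least half of the title was watched.
    def transform(self, rows):
        return [{**r, "label": 1.0 if r["watched_fraction"] >= 0.5 else 0.0}
                for r in rows]


class FeatureEncoder(Transformer):
    # Hypothetical feature: scale a 0-5 affinity score into [0, 1].
    def transform(self, rows):
        return [{**r, "feature": r["genre_affinity"] / 5.0} for r in rows]


class ThresholdModel(Transformer):
    def __init__(self, threshold: float):
        self.threshold = threshold

    def transform(self, rows):
        return [{**r, "prediction": 1.0 if r["feature"] >= self.threshold else 0.0}
                for r in rows]


class ThresholdEstimator(Estimator):
    # Toy learner: place the threshold midway between the two class means.
    def fit(self, rows):
        pos = [r["feature"] for r in rows if r["label"] == 1.0]
        neg = [r["feature"] for r in rows if r["label"] == 0.0]
        return ThresholdModel((sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0)


class AccuracyEvaluator(Evaluator):
    def evaluate(self, rows):
        return sum(r["prediction"] == r["label"] for r in rows) / len(rows)


class PipelineModel(Transformer):
    """A fitted pipeline: every stage is a Transformer."""
    def __init__(self, stages: List[Transformer]):
        self.stages = stages

    def transform(self, rows):
        for stage in self.stages:
            rows = stage.transform(rows)
        return rows


class Pipeline(Estimator):
    """Fits Estimator stages in order, feeding each stage's output to the next."""
    def __init__(self, stages):
        self.stages = stages

    def fit(self, rows):
        fitted = []
        for stage in self.stages:
            if isinstance(stage, Estimator):
                stage = stage.fit(rows)  # replace the estimator with its model
            rows = stage.transform(rows)
            fitted.append(stage)
        return PipelineModel(fitted)


# Made-up training rows, for illustration only.
train = [
    {"watched_fraction": 0.9, "genre_affinity": 4.5},
    {"watched_fraction": 0.8, "genre_affinity": 4.0},
    {"watched_fraction": 0.1, "genre_affinity": 1.0},
    {"watched_fraction": 0.2, "genre_affinity": 1.5},
]

model = Pipeline([LabelGenerator(), FeatureEncoder(), ThresholdEstimator()]).fit(train)
scored = model.transform(train)
accuracy = AccuracyEvaluator().evaluate(scored)
print(accuracy)
```

Because each stage exposes only `transform`, `fit`, or `evaluate`, a stage can be unit-tested on its own or swapped for another without touching the rest of the pipeline, which is the modularity the description refers to.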

In this talk, we will discuss how we use Apache Spark as the distributed framework on which we build our own algorithms to generate personalized recommendations for each of our 80+ million subscribers, the specific techniques we use at Netflix to scale, and the various pitfalls we’ve found along the way.

Related videos

Comments

It seems I'll have to join them to learn answers to those interesting questions at the end ;) Or maybe I'll wait one year and see if they have open sourced the tech.

nikonyrh

I'm sure it's very interesting, but I can't understand anything :(

SiliconPowerII

Real-time Processing with Flink for Machine Learning at Netflix - Elliot Chow

Performance Optimization of Recommendation Training Pipeline at Netflix - Hua Jiang & DB Tsai

AWS re:Invent 2020: Designing better ML systems: Learnings from Netflix

How to Design and Build a Recommendation System Pipeline in Python (Jill Cates)

Data Engineering Interview - Netflix Clickstream Data Pipeline

Machine Learning Infrastructure for Netflix Recommendations

Movie Recommendation System with Collaborative Filtering

Live Implementation Of Movie Recommendation With Deployment Using Heroku

Spotify's music recommender algorithm: How it works

Netflix Machine Learning Mock Interview: Type-ahead Search

AWS re:Invent 2017: Orchestrating Machine Learning Training for Netflix Recommendati (MCL317)

Scaling Push Messaging for Millions of Devices @Netflix

Building Recommender Systems with Machine Learning and AI

Netflix Recs. Using Spark + Cassandra (Prasanna Padmanabhan & Roopa Tangirala) | C* Summit 2016

Movie Recommender System Project | Content Based Recommender System with Heroku Deployment

How Recommendation Systems Work On Amazon & Netflix | Simplilearn Webinar

Building a Machine learning model for movies recommendation using C# ML .NET

How to Build a Recommendation Engine

Tutorial 5- Content Based Recommendation System

Building a Recommendation Engine with Machine Learning Techniques (Brian Sam-Bodden) - FSF 2016

Kafka in 100 Seconds

Why Netflix's Recommendations are Getting Better

Massive Scale Data Processing at Netflix using Flink - Snehal Nagmote & Pallavi Phadnis