Apache Spark for Machine Learning on Large Data Sets • Juliet Hougland • YOW! 2017

This presentation was recorded at YOW! 2017. #GOTOcon #YOW

Juliet Hougland - Data Science Tech Lead for Engineering at Cloudera @juliethougland325


ABSTRACT
Apache Spark is a general-purpose framework for distributed data processing. With MLlib, Spark’s machine learning library, fitting a model to a huge data set becomes very easy.

Similarly, Spark’s general-purpose functionality enables application of a model across a large collection of observations. We’ll walk through fitting a model to a big data set using MLlib and applying a trained scikit-learn model to a large data set. [...]
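
As a rough illustration of the second idea in the abstract (applying an already trained model across a large collection of observations), here is a minimal sketch. It is not taken from the talk. The helper names `score_partition` and `score_with_spark` and the `batch_size` parameter are hypothetical, and the sketch assumes `model` is any picklable object with a scikit-learn style `predict` method and `sc` is an existing PySpark `SparkContext`.

```python
from typing import Any, Iterable, Iterator, List, Sequence, Tuple


def score_partition(
    model: Any,
    rows: Iterable[Sequence[float]],
    batch_size: int = 1000,
) -> Iterator[Tuple[List[float], Any]]:
    """Yield (features, prediction) pairs for the rows of one partition.

    Rows are scored in batches so that `model.predict` is called on many
    observations at once rather than once per row.
    """
    batch: List[List[float]] = []
    for features in rows:
        batch.append(list(features))
        if len(batch) == batch_size:
            for item, prediction in zip(batch, model.predict(batch)):
                yield item, prediction
            batch = []
    if batch:
        # Score whatever is left over after the last full batch.
        for item, prediction in zip(batch, model.predict(batch)):
            yield item, prediction


def score_with_spark(sc: Any, model: Any, data: List[Sequence[float]], num_slices: int = 4):
    """Hypothetical driver-side helper: broadcast the model, score each partition."""
    # Ship the fitted model to the executors once instead of with every task.
    broadcast_model = sc.broadcast(model)
    rdd = sc.parallelize(data, num_slices)
    return rdd.mapPartitions(
        lambda rows: score_partition(broadcast_model.value, rows)
    ).collect()
```

`score_partition` has no Spark dependency, so it can be checked locally with any object that exposes `predict` before being handed to `mapPartitions`. Collecting the results to the driver is only reasonable for small outputs; for large ones, writing the scored RDD out to storage would be the more likely choice.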


#ApacheSpark #Spark #MLlib #ML #MachineLearning #SoftwareEngineering #JulietHoughland #Programming #YOWcon

Comments

We are currently releasing older YOW! videos to serve as a valuable archive, preserving historical content. It is possible that a video is perceived as outdated. We believe it offers insightful glimpses into the past, enriching our understanding of history and development.


GOTO-

In 38 minutes this video made my manager understand why I think Spark is wonderful. Thanks for releasing it, even if it's a few years old.

ebusdk