The Berkeley Data Analytics Stack: Present and Future - DataEDGE 2015

The Berkeley Data Analytics Stack: Present and Future
Thursday, May 7, 2015

The Berkeley Algorithms, Machines, and People Laboratory (AMPLab) is creating a new approach to data analytics. The lab is realizing its ideas through the development of a freely available open source software stack called BDAS: the Berkeley Data Analytics Stack. In the four years the lab has been in operation, we've released major components of BDAS. Several of these components have deeply influenced current Big Data practice: the Mesos cluster resource manager, the Spark in-memory computation framework, and the Tachyon distributed storage system. BDAS features prominently in many industry discussions of the future of the Big Data analytics ecosystem, a rare degree of impact for an ongoing academic project. In this talk I will give an overview of BDAS, with an emphasis on how we provide an integrated environment for SQL processing, graph analytics, streaming, and machine learning at scale. I'll then describe our current and planned efforts for moving "up the stack," including new components such as the Velox and MLBase machine learning platforms and the SampleClean framework for hybrid human/computer data cleaning.

Michael Franklin will present an overview of the AMPLab. He will be followed by Ali Ghodsi of Databricks, who will demonstrate how to use Spark and other BDAS components in the Databricks Cloud.

Michael Franklin
Thomas M. Siebel Professor of Computer Science and Chair of the Computer Science Division,
UC Berkeley

Professor Franklin is a co-PI and Executive Committee member for the Berkeley Institute for Data Science, part of a multi-campus initiative to advance Data Science Environments. He is an ACM Fellow and a two-time winner of the ACM SIGMOD "Test of Time" award. He has several recent "Best Paper" awards and two recent CACM Research Highlights selections, and is a recipient of the Outstanding Advisor Award from the Computer Science Graduate Student Association at Berkeley.

Ali Ghodsi
Co-Founder
Databricks

Ali Ghodsi is a co-founder of Databricks and currently heads engineering and product management. Prior to that, he was an assistant professor at KTH/Sweden and, from 2009, a visiting researcher at UC Berkeley. He holds a PhD in Computer Science from KTH/Sweden and an MBA from Mid-Sweden University.