Distributed ML with H2O feat. Erin LeDell | Stanford MLSys Seminar Episode 23


Episode 23 of the Stanford MLSys Seminar Series!

Scalable Machine Learning with H2O & Systems Approach to Algorithm Development
Speaker: Erin LeDell

Abstract:
This presentation focuses on H2O, a scalable, distributed machine learning platform. Its multi-node distributed algorithms (GLM, Random Forest, GBM, DNNs, etc.) can train on datasets larger than the RAM of a single machine, and H2O integrates with other 'big data' systems like Hadoop and Spark. H2O is engineered for production use cases, with a focus on fast training and prediction speeds. The second part of the talk discusses a systems approach to developing novel machine learning algorithms such as H2O AutoML. Unlike a well-defined ML algorithm (e.g. GBM), an 'AutoML' algorithm is an automated process that aims to train the best model (or ensemble) in a specified amount of time. I will discuss our methodology for experimenting with and validating new strategies or changes to the algorithm, using a benchmark-driven systems approach.
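To make the "best model in a specified amount of time" idea concrete, here is a minimal, standard-library-only sketch of a time-budgeted model search that ranks candidates on a leaderboard. It is a toy illustration, not H2O's AutoML algorithm: the names `run_automl`, `mean_model`, `linear_model` and `mse`, and the tiny dataset, are all hypothetical. The parameter name `max_runtime_secs` is borrowed from the time-budget argument of H2O's `H2OAutoML` class.

```python
import time

def mean_model(xs, ys):
    """Baseline: always predict the training mean."""
    m = sum(ys) / len(ys)
    return lambda x: m

def linear_model(xs, ys):
    """Ordinary least squares on a single feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)

def mse(model, xs, ys):
    """Mean squared error of a fitted model on (xs, ys)."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def run_automl(candidates, train, valid, max_runtime_secs):
    """Fit candidates in order until the time budget runs out.

    Returns a leaderboard of (validation_mse, name, model), best first.
    Toy sketch only: a real AutoML system also tunes hyperparameters
    and builds ensembles, which this loop does not attempt.
    """
    deadline = time.monotonic() + max_runtime_secs
    leaderboard = []
    for name, fit in candidates:
        if time.monotonic() >= deadline:
            break  # budget exhausted: stop starting new models
        model = fit(*train)
        leaderboard.append((mse(model, *valid), name, model))
    leaderboard.sort(key=lambda row: row[0])
    return leaderboard

# Hypothetical data drawn from y = 2x + 1.
train = ([0, 1, 2, 3], [1, 3, 5, 7])
valid = ([4, 5], [9, 11])
candidates = [("mean", mean_model), ("linear", linear_model)]

leaderboard = run_automl(candidates, train, valid, max_runtime_secs=5)
for score, name, _ in leaderboard:
    print(f"{name}: validation MSE = {score:.3f}")
```

The budget is checked before each fit rather than during it, so a slow candidate can overrun the deadline; that simplification keeps the sketch short.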

Speaker bio:

--
0:00 Starting Soon
4:42 Presentation
38:30 Discussion

The Stanford MLSys Seminar is hosted by Dan Fu, Karan Goel, Fiodar Kazhamiaka, Piero Molino, Chris Ré, and Matei Zaharia.

Twitter:

--

#machinelearning #ai #artificialintelligence #systems #mlsys #computerscience #stanford #h2oai #scalableml