Best Practices: How To Build Scalable Data Pipelines for Machine Learning

Data engineers today serve a wider audience than they did just a few years ago. Companies now need to apply machine learning (ML) techniques to their data in order to remain relevant. Among the new challenges data engineers face are building and filling data lakes and reliably delivering complete, large-volume data sets so that data scientists can train more accurate models.

Aside from handling larger data volumes, these pipelines need to be flexible enough to accommodate the variety of data and the high processing velocity that new ML applications require. Qubole addresses these challenges by providing an auto-scaling, cloud-native platform for building and running these data pipelines.

In this webinar we will cover:

- Some of the typical challenges faced by data engineers when building pipelines for machine learning
- Typical uses of the various Qubole engines to address these challenges
- Real-world customer examples
Comments

Thanks for putting this together. Very helpful!!!!

kurtcampher