High-Performance Input Pipelines for Scalable Deep Learning

A production AI system is more than just training a deep learning model. It also includes 1) ingesting and running inference on new data, 2) transforming, processing, and cleaning new data to incorporate it into the training set, 3) continuously retraining to update the model and continue learning, and 4) an experimental pipeline for testing improvements to the AI models. This presentation focuses on the high-performance, highly scalable storage needed to take advantage of ever-larger datasets in model training.

We describe the common stages in an input pipeline for deep learning training and their resource requirements. We then present a benchmark-based approach for identifying bottlenecks in the pipeline, using the ImageNet dataset to show linear scaling of training performance from 1 GPU to 32 GPUs. The AI-ready infrastructure presented here achieves the goal of scalable training performance with simplicity, eliminating the need for complex configuration and tuning of infrastructure components.
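The core idea behind a high-performance input pipeline is overlapping the data-loading stages (read, decode, augment) with GPU compute, so that storage and CPU work hide behind training steps instead of serializing with them. The sketch below is an illustrative toy, not code from the talk: `load_batch` and `train_step` are hypothetical stand-ins for real I/O and GPU work, and a background thread prefetches batches through a bounded queue.

```python
import queue
import threading
import time

def load_batch(i):
    """Stand-in for the input pipeline: read + decode + augment one batch."""
    time.sleep(0.01)   # simulated storage/CPU latency
    return [i] * 4     # a fake "batch" of samples

def train_step(batch):
    """Stand-in for one GPU training step."""
    time.sleep(0.01)   # simulated compute time
    return sum(batch)

def run(num_batches, prefetch=2):
    """Overlap loading and training via a bounded prefetch queue."""
    q = queue.Queue(maxsize=prefetch)

    def producer():
        for i in range(num_batches):
            q.put(load_batch(i))
        q.put(None)  # sentinel: no more data

    threading.Thread(target=producer, daemon=True).start()
    results = []
    start = time.perf_counter()
    while (batch := q.get()) is not None:
        results.append(train_step(batch))
    return results, time.perf_counter() - start

results, elapsed = run(20)
# With prefetching, total time approaches max(load, train) * num_batches
# rather than their sum; a growing gap between the two signals an
# input-pipeline bottleneck.
print(f"{len(results)} steps in {elapsed:.2f}s")
```

Profiling each stage in isolation the same way (timing reads, decodes, and augmentations separately) is the benchmark-based approach to locating which stage, or the storage behind it, limits scaling.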

Speaker: Joshua Robinson, Founding Engineer, Pure Storage