ML-at-Scale '23 - LLM Batch Inference with Determined

Speaker: Corey Staten
In this talk, we used Determined's Core API and Hugging Face Transformers to build and optimize batch inference workflows. We also discussed some advanced parallelization techniques and showed how to achieve them using Determined's DeepSpeed integration. Warning: This session is code-heavy!

Determined AI
