ModCon 2023 Breakout Session: MAX Engine Performance

In this session, Modular engineers Abdul Dakkak and Hengjie Wang discuss Modular AI Engine performance across models and hardware architectures. They dive deep into how the AI Engine works, show its performance against PyTorch and TensorFlow, and demonstrate how it scales to models of all sizes, including LLMs.

- 00:00 Introduction and performance numbers
- 04:33 Runtime, compiler, and kernels working in unison
- 05:30 Runtime parallelism and memory management
- 05:50 Moving transforms out of inference to model initialization
- 06:04 Automatic fusion of graphs to a single op
- 06:30 Specialized kernels on dimensions
- 08:49 Simplification with Mojo
- 10:26 Generality across hardware
- 11:51 Cross platform development example
- 13:22 Kernel JIT
- 13:43 Developer friendly
- 14:02 Autotuning, Custom Ops, Multi-model support
- 14:58 Stable diffusion example