ModCon 2023 Breakout Session: MAX Engine Performance

In this session, Modular engineers Abdul Dakkak and Hengjie Wang discuss Modular AI Engine performance across models and hardware architectures. They dive deep into how the AI Engine works, compare its performance against PyTorch and TensorFlow, and demonstrate how it scales to models of all sizes, including LLMs.

- 00:00 Introduction and performance numbers
- 04:33 Runtime, compiler, and kernels working in unison
- 05:30 Runtime parallelism and memory management
- 05:50 Moving transforms out of inference to model initialization
- 06:04 Automatic fusion of graphs to a single op
- 06:30 Specialized kernels on dimensions
- 08:49 Simplification with Mojo
- 10:26 Generality across hardware
- 11:51 Cross-platform development example
- 13:22 Kernel JIT
- 13:43 Developer friendly
- 14:02 Autotuning, Custom Ops, Multi-model support
- 14:58 Stable Diffusion example
