Ray at Scale: Apple's Approach to Elastic GPU Management | Ray Summit 2024

As Ray's ecosystem expands, efficient GPU resource management becomes crucial for scaling AI/ML workloads. In this session, Apple's Weiwei Yang and Abin Shahab present their approach to building a multi-tenancy-ready platform based on Ray, tackling common challenges like GPU fragmentation, low utilization, and compromised SLAs.

Yang and Shahab delve into the intricacies of their queuing and GPU quota management system, powered by Apache YuniKorn. They explore advanced techniques for achieving resource fairness, GPU preemption, and gang scheduling across diverse Ray workloads. This talk offers valuable insights for organizations looking to optimize their GPU resource management and enhance the scalability and efficiency of their AI/ML operations.
