Accelerating AI inference workloads
Deploying AI models at scale demands high-performance inference capabilities. Google Cloud offers a range of Cloud Tensor Processing Units (TPUs) and NVIDIA-powered graphics processing unit (GPU) VMs. Join Debi Cabrera as she sits down with Alex Spiridonov, Group Product Manager, to discuss key considerations for choosing between TPUs and GPUs for your inference needs. Watch to learn about the cost implications, how to deploy and optimize an inference pipeline on Google Cloud, and more!
Chapters:
0:00 - Meet Alex
2:52 - Balancing cost and efficiency
5:51 - TPU vs GPU for AI models
8:21 - Getting started with Google Cloud TPUs and GPUs
10:05 - Common challenges when using inference optimization
12:10 - Available resources for AI inference workloads
13:13 - Wrap up
Resources:
#GoogleCloudNext #GoogleGemini
Event: Google Cloud Next 2024
Speakers: Debi Cabrera, Alex Spiridonov
Products Mentioned: Cloud TPUs, Cloud GPUs