Accelerating AI inference workloads
Deploying AI models at scale demands high-performance inference capabilities. Google Cloud offers a range of Cloud Tensor Processing Units (TPUs) and NVIDIA-powered graphics processing unit (GPU) VMs. Join Debi Cabrera as she sits down with Alex Spiridonov, Group Product Manager, to discuss key considerations for choosing between TPUs and GPUs for your inference needs. Watch to learn about the cost implications, how to deploy and optimize an inference pipeline on Google Cloud, and more!
Chapters:
0:00 - Meet Alex
2:52 - Balancing cost and efficiency
5:51 - TPU vs GPU for AI models
8:21 - Getting started with Google Cloud TPUs and GPUs
10:05 - Common challenges when using inference optimization
12:10 - Available resources for AI inference workloads
13:13 - Wrap up
Resources:
#GoogleCloudNext #GoogleGemini
Event: Google Cloud Next 2024
Speakers: Debi Cabrera, Alex Spiridonov
Products Mentioned: Cloud TPUs, Cloud GPUs