AWS re:Invent 2023 - Scaling FM inference to hundreds of models with Amazon SageMaker (AIM327)

Companies need robust and cost-effective solutions to deploy foundation models (FMs) at scale. Additionally, SaaS providers need scalable and cost-effective ways to serve hundreds of models to their customers. This session explores how to use Amazon SageMaker to roll out hundreds of FMs cost-effectively at scale. Get a detailed overview of deployment strategies to support large-scale generative AI inference for SaaS, and learn how to architect solutions that maximize scaling capabilities for performance and cost.

ABOUT AWS
Amazon Web Services (AWS) hosts events, both online and in-person, bringing the cloud computing community together to connect, collaborate, and learn from AWS experts.

AWS is the world's most comprehensive and broadly adopted cloud platform, offering over 200 fully featured services from data centers globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster.

#AWSreInvent #AWSreInvent2023

Comments

Great presentation, Dhawal, Alan, and Bhavesh!

rchadha
