AWS re:Invent 2023 - [LAUNCH] Introducing Amazon SageMaker HyperPod (AIM362)

Amazon SageMaker HyperPod is purpose-built to accelerate foundation model (FM) training. Join this session to learn how to train FMs for weeks or months without disruption. Discover how HyperPod continuously monitors cluster health and repairs or replaces faulty nodes on the fly, automatically resuming training without losing progress. Learn how it comes preconfigured with SageMaker distributed training libraries, which improve FM training performance by splitting training data and models into smaller chunks and processing them in parallel across the cluster nodes while fully utilizing the cluster’s compute and network infrastructure.

ABOUT AWS
Amazon Web Services (AWS) hosts events, both online and in-person, bringing the cloud computing community together to connect, collaborate, and learn from AWS experts.

AWS is the world's most comprehensive and broadly adopted cloud platform, offering over 200 fully featured services from data centers globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster.

#AWSreInvent #AWSreInvent2023