Tech »  Topic »  Introducing auto scaling on Amazon SageMaker HyperPod

Introducing auto scaling on Amazon SageMaker HyperPod


Today, we’re excited to announce that Amazon SageMaker HyperPod now supports managed node automatic scaling with Karpenter, so you can efficiently scale your SageMaker HyperPod clusters to meet your inference and training demands. Real-time inference workloads require automatic scaling to address unpredictable traffic patterns and maintain service level agreements (SLAs). As demand spikes, organizations must rapidly adapt their GPU compute without compromising response times or cost-efficiency. Unlike self-managed Karpenter deployments, this service-managed solution alleviates the operational overhead of installing, configuring, and maintaining Karpenter controllers, while providing tighter integration with the resilience capabilities of SageMaker HyperPod. This managed approach supports scale to zero, reducing the need for dedicated compute resources to run the Karpenter controller itself, improving cost-efficiency.

SageMaker HyperPod offers a resilient, high-performance infrastructure, observability, and tooling optimized for large-scale model training and deployment. Companies like Perplexity, HippocraticAI, H.AI, and Articul8 are already using SageMaker HyperPod for training ...


Copyright of this story solely belongs to aws.amazon.com - machine-learning . To see the full text click HERE