Amazon SageMaker AI in 2025, a year in review part 2: Improved observability and enhanced features for SageMaker AI model customization and hosting
aws.amazon.com - machine-learning
In 2025, Amazon SageMaker AI made several improvements designed to help you train, tune, and host generative AI workloads. In Part 1 of this series, we discussed Flexible Training Plans and price performance improvements made to inference components.
In this post, we discuss enhancements made to observability, model customization, and model hosting. These improvements make it possible to host a whole new class of customer use cases on SageMaker AI.
Observability
The observability enhancements made to SageMaker AI in 2025 give you greater visibility into model performance and infrastructure health. Enhanced metrics provide granular, instance-level and container-level tracking of CPU, memory, GPU utilization, and invocation performance with configurable publishing frequencies, so teams can diagnose latency issues and resource inefficiencies that were previously hidden by endpoint-level aggregation. Rolling updates for inference components improve deployment safety by removing the need to provision duplicate infrastructure: updates deploy in configurable batches with integrated Amazon ...
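To make the idea of batched rolling updates concrete, here is a minimal toy sketch in plain Python. It does not call SageMaker AI or any AWS API; the function names `plan_rolling_batches` and `peak_copies_in_flight` and the notion of numbered "copies" are assumptions made purely for illustration. It shows why updating in batches can bound the extra capacity needed at any moment by the batch size, rather than by the full number of copies.

```python
from typing import List


def plan_rolling_batches(total_copies: int, batch_size: int) -> List[range]:
    """Split `total_copies` model copies into consecutive update batches.

    Hypothetical helper for illustration only; not part of any AWS SDK.
    """
    if total_copies < 0:
        raise ValueError("total_copies must be non-negative")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # Each batch covers copy indices [start, start + batch_size), clipped
    # to the total so the last batch may be smaller than the others.
    return [
        range(start, min(start + batch_size, total_copies))
        for start in range(0, total_copies, batch_size)
    ]


def peak_copies_in_flight(total_copies: int, batch_size: int) -> int:
    """Largest number of copies being replaced at the same time."""
    batches = plan_rolling_batches(total_copies, batch_size)
    return max((len(batch) for batch in batches), default=0)


if __name__ == "__main__":
    # Five copies updated two at a time: three batches, the last one partial.
    for number, batch in enumerate(plan_rolling_batches(5, 2), start=1):
        print(f"batch {number}: copies {list(batch)}")
    print("peak copies in flight:", peak_copies_in_flight(5, 2))
```

In this toy model, a full duplicate deployment of five copies would need five additional copies at once, while a batch size of two never has more than two copies in flight. The real service also factors in health checks and rollback behavior, which this sketch does not attempt to model.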

