
HyperPod now supports Multi-Instance GPU to maximize GPU utilization for generative AI tasks


We are excited to announce the general availability of GPU partitioning with Amazon SageMaker HyperPod, using NVIDIA Multi-Instance GPU (MIG). With this capability, you can run multiple tasks concurrently on a single GPU, minimizing the compute and memory waste that results from dedicating an entire GPU to a task that under-utilizes it. By allowing more users and tasks to access GPU resources simultaneously, you can reduce development and deployment cycle times while supporting a diverse mix of workloads running in parallel on a single physical GPU, all without waiting for full GPU availability.
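To make the idea concrete, the sketch below shows how MIG partitioning looks at the driver level with NVIDIA's `nvidia-smi` tool. This is an illustrative sequence, not HyperPod-specific: on HyperPod, partitioning is managed by the service and the cluster's device plugin rather than run by hand, and the available profile names (such as `1g.10gb`) depend on the GPU model.

```
# Illustrative MIG setup on a node with a MIG-capable NVIDIA GPU (e.g., A100/H100).
# Profile names vary by GPU; shown for explanation only.
sudo nvidia-smi -i 0 -mig 1                 # enable MIG mode on GPU 0
sudo nvidia-smi mig -i 0 -cgi 1g.10gb -C    # create a 1g.10gb GPU instance plus its compute instance
nvidia-smi -L                               # list the GPU and its MIG devices
```

Each MIG device created this way has its own dedicated slice of streaming multiprocessors and memory, which is what allows several tasks to share one physical GPU with hardware-level isolation.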

Data scientists run multiple lightweight tasks on reserved accelerated compute resources and need to drive efficient utilization for inference (for example, serving a language model), research (for example, model prototyping), and interactive tasks (for example, Jupyter notebooks for image classification experimentation). These tasks typically don’t require entire GPUs to run efficiently, let ...
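For the interactive-notebook scenario above, a task can request a MIG slice instead of a whole GPU through a Kubernetes-style resource request. The snippet below is a minimal sketch assuming an EKS-orchestrated cluster with the NVIDIA device plugin in mixed MIG strategy; the pod name, container image, and the exact `nvidia.com/mig-*` resource name are illustrative and depend on your GPU and plugin configuration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mig-notebook                    # hypothetical pod name
spec:
  containers:
  - name: jupyter
    image: jupyter/scipy-notebook       # illustrative notebook image
    resources:
      limits:
        nvidia.com/mig-1g.10gb: 1       # request one 1g.10gb MIG slice, not a full GPU
```

Because the scheduler sees each MIG slice as a distinct allocatable resource, several such pods can land on the same physical GPU concurrently.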


Source: aws.amazon.com (machine-learning).