AMD Zyphra GPU Cluster Gives Birth To ZAYA 1 MoE AI Model, Smokes Llama3.1
AMD is in a celebratory mood after AI research firm Zyphra successfully trained its cutting-edge, large-scale Mixture-of-Experts (MoE) model, ZAYA1, entirely on AMD’s accelerated computing platform, which consists of Instinct MI300X GPUs, Pensando Pollara 400 networking hardware, and the ROCm software stack.
What are MoEs, exactly? You can think of them as breaking up a single, very large language model into, say, eight individual experts that each have their own area of expertise: one for language, one for reasoning, one for image recognition, and so forth.
Then there's an intermediary model, often called a router, that sits in front of those expert models, takes the input, and basically decides, 'Okay, this workload needs experts two, four, six, and eight, with these weights on each of them.' It's an oversimplified explanation, but it's enough to get the gist.
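To make that routing idea a bit more concrete, here is a minimal PyTorch sketch of a generic top-k MoE layer: a small linear router scores the experts, the top few are picked per token, and their outputs are blended by softmax weights. This is a toy illustration under assumed sizes (512-dim tokens, eight experts, four active per token), not ZAYA1's actual architecture or Zyphra's code; the class name ToyMoELayer and all parameter values are hypothetical.

```python
# Toy top-k mixture-of-experts routing sketch (not ZAYA1's real router).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoELayer(nn.Module):
    def __init__(self, d_model=512, d_hidden=2048, num_experts=8, top_k=4):
        super().__init__()
        # The "intermediary model": a linear router that scores every expert per token.
        self.router = nn.Linear(d_model, num_experts)
        # The experts: independent feed-forward networks, each a would-be specialist.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        ])
        self.top_k = top_k

    def forward(self, x):
        # x: (num_tokens, d_model)
        scores = self.router(x)                             # (num_tokens, num_experts)
        weights, chosen = scores.topk(self.top_k, dim=-1)   # pick e.g. experts 2, 4, 6, 8
        weights = F.softmax(weights, dim=-1)                 # the per-expert mixing weights
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = chosen[:, slot] == e                  # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

# Usage: route a batch of 16 token vectors through the layer.
layer = ToyMoELayer()
tokens = torch.randn(16, 512)
print(layer(tokens).shape)  # torch.Size([16, 512])
```

The payoff of this design is that only the chosen experts run for each token, so total parameter count can grow much faster than the compute spent per token.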
From AMD's vantage point, the achievement proves its platform is a viable, high-performance, and production-ready alternative to scale ...

