NVIDIA Blackwell GPUs Supercharge MoE AI, Boosting Performance Up to 10X
Let's talk about the latest AI models, most of which are powered by a "Mixture of Experts" (MoE) design. MoE is a form of model sparsity, but we'll get into that more in a minute. The key takeaways from this post are that the vast majority of 'frontier' language models are based on MoE, and that NVIDIA says Blackwell is up to ten times faster at running them than its previous-generation Hopper systems.
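To make the sparsity idea concrete, here's a minimal, illustrative sketch of MoE-style top-k routing in Python. The expert count, dimensions, and variable names are toy assumptions for illustration, not details from NVIDIA or any particular model.

```python
# Illustrative sketch of MoE sparsity: a router picks the top-k experts per
# token, so only a fraction of the model's weights are active for any input.
# All sizes here are assumed toy values, not real model parameters.
import numpy as np

rng = np.random.default_rng(0)

d_model, n_experts, top_k = 16, 8, 2           # assumed toy dimensions
token = rng.standard_normal(d_model)           # one token's hidden state

# Router: a learned linear layer scoring each expert for this token.
router_w = rng.standard_normal((n_experts, d_model))
scores = router_w @ token
chosen = np.argsort(scores)[-top_k:]           # indices of the top-k experts

# Softmax over only the chosen experts' scores gives mixing weights.
weights = np.exp(scores[chosen]) / np.exp(scores[chosen]).sum()

# Each expert is its own small layer; only the chosen ones actually run.
expert_w = rng.standard_normal((n_experts, d_model, d_model))
output = sum(w * (expert_w[e] @ token) for w, e in zip(weights, chosen))

print(f"active experts: {chosen.tolist()}, i.e. {top_k} of {n_experts}")
```

The point of the sketch is simply that each token touches only a couple of experts, which is why MoE models can have huge total parameter counts while keeping per-token compute modest.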
For clarity, the company is talking about its GB200 NVL72 machines. These are the big racks you've seen in pictures like the one above; they don't look much like typical server racks, since half of each rack is networking hardware and the whole thing requires liquid cooling. The compute density is impressive, at 5.76 PFLOPS of single-precision (FP32) throughput, and they have the ability to focus the entire rack's power on ...
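As a rough sanity check on that figure, the rack-level number is consistent with 72 Blackwell GPUs each delivering on the order of 80 dense FP32 TFLOPS; the per-GPU value below is our assumption for illustration, not a number from the article.

```python
# Rough, assumed arithmetic behind the 5.76 PFLOPS rack figure.
gpus_per_rack = 72                 # GB200 NVL72 has 72 Blackwell GPUs
fp32_tflops_per_gpu = 80           # assumed dense FP32 throughput per GPU
rack_pflops = gpus_per_rack * fp32_tflops_per_gpu / 1000
print(rack_pflops)                 # 5.76
```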
Copyright of this story solely belongs to hothardware.com.

