AMD researchers argue that, while algorithms like the Ozaki scheme merit investigation, they're still not ready for prime time.
theregister.co.uk
Double precision floating point computation (aka FP64) is what keeps modern aircraft in the sky, rockets going up, vaccines effective, and, yes, nuclear weapons operational. But rather than building dedicated chips that process this essential data type in hardware, Nvidia is leaning on emulation to increase performance for HPC and scientific computing applications, an area where AMD has had the lead in recent generations.
This emulation, we should note, hasn't replaced hardware FP64 in Nvidia's GPUs. Nvidia's newly unveiled Rubin GPUs still deliver about 33 teraFLOPS of peak FP64 performance, but that's actually one teraFLOP less than the now four-year-old H100.
If you switch on software emulation in Nvidia's CUDA libraries, the chip can purportedly achieve up to 200 teraFLOPS of FP64 matrix performance. That's 4.4x what its outgoing Blackwell accelerators could muster in hardware.
On paper, Rubin isn't just Nvidia ...
Copyright of this story solely belongs to theregister.co.uk.

