Nvidia released Nemotron 3 Nano Omni on Tuesday
Nvidia released Nemotron 3 Nano Omni on Tuesday, an open-weight multimodal AI model that unifies vision, audio, and language understanding in a single architecture designed to power autonomous AI agents on edge devices. The model has 30 billion parameters but activates only three billion per forward pass through a mixture-of-experts design, a ratio that allows it to run on a single GPU while matching or exceeding the multimodal capabilities of models several times its size. Nvidia claims nine times higher throughput than comparable open multimodal models with equivalent interactivity, 2.9 times faster single-stream reasoning on multimodal tasks, and roughly nine times greater effective system capacity for video reasoning. The model tops six benchmarks across document intelligence, video understanding, and audio comprehension. It processes text, images, audio, video, documents, charts, and graphical interfaces as inputs and produces text as output, meaning a single model can replace the patchwork of specialised ...
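To make the 30-billion-total, three-billion-active ratio concrete, here is a minimal back-of-the-envelope sketch. Only the two parameter counts come from the story. The bytes-per-parameter values and the "about 2 FLOPs per active parameter per token" rule of thumb are assumptions for illustration, not figures published by Nvidia, and the function names are hypothetical.

```python
# Parameter counts as reported in the story.
TOTAL_PARAMS = 30e9   # all experts combined
ACTIVE_PARAMS = 3e9   # parameters used per forward pass


def active_fraction(total: float, active: float) -> float:
    """Share of the model's weights that take part in one forward pass."""
    return active / total


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Memory needed just to hold the weights, in gigabytes (1 GB = 1e9 bytes).

    Every expert must still be resident, so this uses the TOTAL count.
    bytes_per_param is an assumption (2.0 for 16-bit, 0.5 for 4-bit weights).
    """
    return params * bytes_per_param / 1e9


def approx_forward_flops_per_token(active: float) -> float:
    """Rough compute per token: about 2 FLOPs per ACTIVE parameter (rule of thumb)."""
    return 2 * active


if __name__ == "__main__":
    frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
    print(f"active fraction per forward pass: {frac:.0%}")
    print(f"weights at 16-bit (assumed): {weight_memory_gb(TOTAL_PARAMS, 2.0):.0f} GB")
    print(f"weights at 4-bit (assumed):  {weight_memory_gb(TOTAL_PARAMS, 0.5):.0f} GB")
    print(f"approx. FLOPs per token: {approx_forward_flops_per_token(ACTIVE_PARAMS):.1e}")
```

The point of the sketch is that a mixture-of-experts design cuts compute per token roughly in proportion to the active fraction, while memory still scales with the full parameter count, so whether the weights fit on one GPU depends on the precision they are stored in.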
Copyright of this story belongs solely to thenextweb.com.

