Broadcast News

29/04/2026

Nvidia Unveils Nemotron 3 Nano Omni To Unite Vision, Audio And Language

Nvidia has launched Nemotron 3 Nano Omni, an open "omni‑modal" reasoning model designed to process video, audio, images and text within one architecture, reducing the hand‑offs between separate vision, speech and language models that add latency and lose context.

The company says the model sets a new efficiency mark for open multimodal systems, combining strong perception accuracy with lower cost, and topping six leaderboards spanning complex document intelligence plus video and audio understanding. According to Nvidia, systems built on the model can achieve up to ninefold higher throughput than other open omni models with similar interactivity.

Nemotron 3 Nano Omni integrates vision and audio encoders inside a 30B‑A3B hybrid mixture‑of‑experts design (by the usual naming convention, about 30 billion total parameters with roughly 3 billion active per token), eliminating the need for standalone perception components and improving inference efficiency at scale. Its architecture includes Conv3D, Efficient Video Sampling (EVS) and a 256K‑token context window. It accepts text, images, audio, video, documents, charts and graphical interfaces as input and produces text output.
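
As a rough illustration of why a mixture‑of‑experts model runs only a fraction of its parameters per token, the sketch below shows generic top‑k expert routing. It is not Nvidia's router: the expert count, the value of k, the router scores and the toy scalar "experts" are all hypothetical, chosen only to make the idea concrete.

```python
import math


def softmax(scores):
    """Convert raw router scores into probabilities."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(router_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    probs = softmax(router_scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    # Renormalise so the selected experts' weights sum to 1.
    return [(i, probs[i] / norm) for i in chosen]


def moe_forward(x, experts, router_scores, k):
    """Run only the selected experts and blend their outputs."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_scores, k))


def make_expert(scale):
    """Hypothetical stand-in for an expert network: multiplies by a constant."""
    return lambda x: scale * x


# Hypothetical setup: 4 experts, 2 active per token.
experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
scores = [0.1, 2.0, -1.0, 1.5]  # made-up router scores for one token
routes = top_k_route(scores, k=2)
output = moe_forward(10.0, experts, scores, k=2)
active_fraction = 2 / len(experts)  # share of experts that ran for this token
```

Only the two selected experts are evaluated for the token, which is the general reason a model's active parameter count can be far smaller than its total.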

In multi‑agent setups, the model can act as the "eyes and ears", working alongside proprietary cloud models or other open Nemotron models — such as Nemotron 3 Super for high‑frequency execution and Nemotron 3 Ultra for complex planning — to power sub‑agents for computer use, document intelligence and audio‑video reasoning.
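
The division of labour described above can be sketched as a simple dispatcher. This is a minimal illustration under stated assumptions, not a published Nvidia blueprint: the role labels, the `Task` fields and the routing rule are hypothetical, and the strings are placeholders rather than real model identifiers or endpoints.

```python
from dataclasses import dataclass

# Placeholder role labels (assumptions), not real model IDs.
ROLE_TO_MODEL = {
    "perception": "nano-omni",  # the "eyes and ears" role
    "execution": "super",       # high-frequency execution
    "planning": "ultra",        # complex planning
}

# Input kinds treated as perception work in this sketch (assumption).
PERCEPTION_INPUTS = {"image", "audio", "video", "document", "screenshot"}


@dataclass
class Task:
    description: str
    input_kind: str            # e.g. "text", "image", "video"
    needs_planning: bool = False


def pick_model(task: Task) -> str:
    """Choose which sub-agent model handles a task."""
    if task.input_kind in PERCEPTION_INPUTS:
        return ROLE_TO_MODEL["perception"]
    if task.needs_planning:
        return ROLE_TO_MODEL["planning"]
    return ROLE_TO_MODEL["execution"]


tasks = [
    Task("Read the chart in this screenshot", "screenshot"),
    Task("Draft a multi-step migration plan", "text", needs_planning=True),
    Task("Click the next button", "text"),
]
assignments = [pick_model(t) for t in tasks]
```

In a real system the routing decision would more likely come from an orchestrating agent than from a fixed rule. The fixed rule here only makes the split of roles explicit.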

Early use cases fall into three areas. Computer‑use agents navigate GUIs and reason over on‑screen content at native 1920×1080 resolution, and preliminary OSWorld benchmark results indicate gains in handling complex interfaces. Document intelligence coherently interprets charts, tables, screenshots and mixed media for enterprise analysis and compliance. Audio/video understanding maintains context across what was shown and said in a single reasoning stream.

"To build useful agents, you can't wait seconds for a model to interpret a screen," said Gautier Cloix, CEO of H Company. "By building on Nemotron 3 Nano Omni, our agents can rapidly interpret full HD screen recordings — something that wasn't practical before. This isn't just a speed boost: It's a fundamental shift in how our agents perceive and interact with digital environments in real time."

Adopters already include Aible, Applied Scientific Intelligence (ASI), Eka Care, Foxconn, H Company, Palantir and Pyler, with Dell Technologies, Docusign, Infosys, K‑Dense, Lila, Oracle and Zefr evaluating the model.

Nvidia is releasing Nemotron 3 Nano Omni with open weights, datasets and training methods, enabling organisations to customise, evaluate and optimise the model for domain‑specific tasks using tools such as Nvidia NeMo. Because the Nemotron family is open, deployments can be aligned with regulatory, sovereignty and data localisation requirements.

The model is available from 28 April 2026 on Hugging Face, on OpenRouter, and on build.nvidia.com as an Nvidia NIM microservice, as well as through a wide ecosystem of Nvidia Cloud Partners, inference platforms and cloud providers. The lightweight, open design supports consistent deployment from local systems such as Nvidia Jetson hardware, Nvidia DGX Spark and DGX Station to data centre and cloud environments.

Nvidia says the Nemotron 3 family — spanning Nano, Super and Ultra — has been downloaded more than 50 million times in the past year, with Omni extending the line into multimodal and agentic workloads.

www.nvidia.com/en-gb/
