Photo by BoliviaInteligente on Unsplash

Nvidia Corp. Unveils Groq 3 for Advanced AI Inference

Nvidia Corp. has officially debuted the Groq 3 language processing unit, a dedicated inference chip designed for multi-agent workloads. It is the first product to stem from Nvidia’s strategic licensing of Groq Inc.’s technology and the subsequent hiring of Groq founder Jonathan Ross and its president, Sunny Madra. The announcement, made on March 16, 2026, signals a significant expansion of Nvidia’s AI hardware portfolio beyond its traditional GPU offerings.

The Groq 3 language processing unit is engineered specifically for artificial intelligence inference, a departure from Nvidia’s GPUs, which are general-purpose chips capable of both training and running AI models. While Nvidia’s GPUs offer greater memory capacity, the Groq 3 delivers faster memory speeds, which are crucial for low-latency applications. The new chip is designed to address the growing demands of agentic systems, which need large context windows and rapid responsiveness for continuous communication among multiple agents.
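To see why memory speed, rather than raw compute, tends to bound inference latency, a rough back-of-envelope model helps: during autoregressive decoding, each generated token requires streaming the model weights through memory, so throughput is roughly memory bandwidth divided by model size in bytes. The Python sketch below illustrates this under purely hypothetical bandwidth and model-size figures; none of these numbers are published Groq 3 or Nvidia GPU specifications.

```python
# Back-of-envelope: per-token decode latency for memory-bound LLM inference.
# During autoregressive decoding, every generated token requires streaming
# the model weights through memory, so latency ~ bytes_moved / bandwidth.
# All figures below are illustrative assumptions, not real chip specs.

def tokens_per_second(model_params: float, bytes_per_param: float,
                      memory_bandwidth_gbs: float) -> float:
    """Upper bound on decode throughput for a memory-bound model."""
    bytes_per_token = model_params * bytes_per_param  # weights read per token
    seconds_per_token = bytes_per_token / (memory_bandwidth_gbs * 1e9)
    return 1.0 / seconds_per_token

# Hypothetical 70B-parameter model at 8-bit precision (assumed).
params = 70e9
bytes_per_param = 1.0

for label, bw_gbs in [("HBM-class GPU (~3 TB/s, assumed)", 3_000),
                      ("SRAM-heavy LPU (~80 TB/s, assumed)", 80_000)]:
    rate = tokens_per_second(params, bytes_per_param, bw_gbs)
    print(f"{label}: ~{rate:.0f} tokens/s")
```

Under these assumed numbers, the bandwidth-rich design wins by more than an order of magnitude, which is the intuition behind pairing capacity-heavy GPUs with speed-heavy inference chips.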

Groq 3 Integration with Vera Rubin for Enhanced Agentic Systems

Nvidia is positioning the Groq 3 LPX server racks, each equipped with 256 Groq 3 LPUs, to work in tandem with its recently unveiled Vera Rubin NVL72 rack, which integrates Rubin GPUs and Vera CPUs into a single powerful system. The combined architecture is optimized to efficiently handle trillion-parameter models and context windows of millions of tokens, a significant leap in capability for complex AI operations.
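A quick arithmetic sketch shows why million-token contexts are so demanding: the key-value cache that transformer attention maintains grows linearly with context length. The model dimensions below are illustrative assumptions for a generic transformer, not the specifications of any model Nvidia cited.

```python
# Rough KV-cache size for a long-context transformer.
# Cache per token = 2 (K and V) * layers * kv_heads * head_dim * bytes_per_value.
# All model dimensions here are illustrative assumptions.

layers = 80            # hypothetical transformer depth
kv_heads = 8           # grouped-query attention KV heads (assumed)
head_dim = 128         # per-head dimension (assumed)
bytes_per_value = 2    # fp16/bf16 storage
context_tokens = 1_000_000

per_token = 2 * layers * kv_heads * head_dim * bytes_per_value
total_gb = per_token * context_tokens / 1e9
print(f"KV cache per token: {per_token / 1024:.0f} KiB")
print(f"KV cache at {context_tokens:,} tokens: ~{total_gb:.0f} GB")
```

At these assumed dimensions, a single million-token session consumes hundreds of gigabytes of cache before any model weights are counted, which is why rack-scale systems matter for long-context agentic workloads.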

The company’s strategic goal is to support throughputs of up to 1,500 tokens per second for agent-to-agent communication. The target reflects the view that while 100 tokens per second may suffice for human interaction, it would be inadequate for the continuous, rapid exchanges between agents in sophisticated multi-agent systems. Nvidia aims to enable multi-agent systems that communicate incessantly and with high responsiveness, a vision that calls for this new generation of specialized inference hardware.
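The sketch below makes the speed argument concrete by timing a sequential chain of agent hand-offs at both rates; the chain depth and message sizes are illustrative assumptions, not figures from Nvidia’s announcement.

```python
# Why 100 tokens/s is too slow for agent-to-agent traffic:
# a task that chains several agents multiplies per-message latency.
# Chain depth and message sizes are illustrative assumptions.

def chain_latency_seconds(hops: int, tokens_per_message: int,
                          tokens_per_second: float) -> float:
    """Total generation time for a sequential chain of agent messages."""
    return hops * tokens_per_message / tokens_per_second

hops = 10                 # sequential agent hand-offs (assumed)
tokens_per_message = 500  # average message length in tokens (assumed)

for rate in (100, 1_500):
    t = chain_latency_seconds(hops, tokens_per_message, rate)
    print(f"{rate:>5} tokens/s -> {t:5.1f} s for {hops} hops")
```

At the assumed settings, the same ten-hop exchange drops from roughly 50 seconds to under 4, the difference between an agent pipeline that feels batch-like and one that feels interactive.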

Nvidia Expands Data Center Hardware with New Racks and Networking

Beyond the Groq 3, Nvidia’s GTC 2026 event also saw the unveiling of several other key hardware components. The company debuted the Vera Rubin NVL72 rack and a separate dedicated Vera CPU rack. Additionally, Nvidia introduced a new storage rack system named Bluefield-4 STX and the Spectrum-6 SPX networking rack. These additions underscore Nvidia’s comprehensive strategy to provide end-to-end solutions for data center infrastructure.

These developments arrive amid unprecedented growth in the AI sector. Nvidia’s data center revenue alone soared to $193.5 billion in fiscal 2026, and major hyperscale cloud providers, including Amazon Web Services Inc., Google LLC, Microsoft Corp. and Meta Platforms Inc., are collectively investing $650 billion in data center buildouts this year. That investment landscape underscores the intense demand for advanced AI hardware and infrastructure, with Nvidia’s expanded product line poised to capture a significant share of the market.



Analysis based on reports from SiliconANGLE News. Written by AI Universe News.

