Choosing AI infrastructure is no longer just about GPUs. It is about how you want to build and scale. NVIDIA’s DGX and HGX platforms represent two very different approaches to AI compute, and understanding that distinction is becoming critical as workloads grow. For teams moving beyond experimentation into sustained AI workloads, the question is not which is better, but which model aligns with how you want to operate: https://civo.io/4cauuJY
Civo’s Post
How ‘Why Not’ Led to a $20 Billion Deal For Groq
Source: EE Times

Nvidia’s $20B Groq Deal Redefines AI Inference: LPUs + GPUs Unlock Premium Token Economics

At GTC 2026, Nvidia unveiled a blockbuster deal: licensing Groq’s LPU technology and hiring its team for a reported $20 billion, a move born from nearly a year of quiet co-engineering.

Behind the Deal: From “Why Not” to Christmas Day
The partnership traces back to early 2025, when Groq approached Nvidia about using NVLink, a protocol originally built for GPU-GPU and CPU connectivity, with its LPU accelerators. Nvidia CEO Jensen Huang’s response: “Why not.” Within weeks, the teams demonstrated a disaggregated LLM inference architecture that splits workloads across GPUs and LPUs. Three weeks later, the deal closed. By Christmas, Groq CEO Jonathan Ross had joined Nvidia as chief software architect.

Why the Pairing Works: Heterogeneous Inference
The technical rationale lies in complementary strengths:
- Nvidia Vera Rubin GPUs excel at high-throughput prefill and attention layers.
- Groq LP30 LPUs specialize in low-latency token generation (the feed-forward network), delivering ultra-fast per-user token speeds.
Combined in racks, at a ratio of one Vera Rubin rack to one to four Groq LPX racks, the system achieves 35× the token throughput of Rubin alone for high-interactivity workloads.

The Business Case: Speed as a Premium
According to Huang, this enables a new pricing model: low-speed tokens become low-value or free, while high-speed “premium” tokens (200–400 tokens/sec/user) command higher margins. “This is probably the single most important chart for the future of AI factories,” Huang said, estimating the revenue opportunity at nearly $300 billion per gigawatt for Nvidia customers.

Technology & Roadmap
The Groq LP30 chip, now part of Nvidia’s Rubin lineup, features 500 MB of SRAM and a compile-time scheduler. Each LPX rack houses 256 LPUs connected by Groq’s proprietary chip-to-chip links, with NVLink-C2C planned for the next-generation Groq 4. Nvidia is integrating Groq’s compiler and inference-sharding software into its Dynamo orchestration stack. Ian Buck, Nvidia’s VP of hyperscale, noted that Rubin CPX, an in-house disaggregation project, has been deprioritized in favor of Groq’s proven solution.

For AI infrastructure & inference trends, our analysts track:
- Heterogeneous computing architectures
- AI factory economics & tokenization models
- Next-gen inference disaggregation
📩 Contact our semiconductor team for deeper analysis!
📫 dukehuang@usemiltd.com
#AI #INFERENCE #GROQ #NVIDIA #GPU #ICS #MERGERS #ACQUISITIONS #SEMICONDUCTORS
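The prefill/decode split described in the post can be sketched as a toy scheduler. This is a minimal sketch, not Nvidia's Dynamo implementation: the pool names and per-token timings below are illustrative assumptions, chosen only so the decode rate lands in the 200–400 tokens/sec/user "premium" band the post mentions.

```python
from dataclasses import dataclass

# Toy model of disaggregated LLM inference: prefill (compute-bound work
# over the full prompt) runs on a GPU pool, while decode (latency-sensitive
# token-by-token generation) runs on an LPU pool.

@dataclass
class Pool:
    name: str
    ms_per_prompt_token: float   # prefill cost per prompt token
    ms_per_output_token: float   # decode cost per generated token

def serve_request(prompt_tokens: int, output_tokens: int,
                  prefill_pool: Pool, decode_pool: Pool) -> dict:
    """Split one request across the two pools and report per-stage cost."""
    return {
        "prefill_on": prefill_pool.name,
        "decode_on": decode_pool.name,
        "prefill_ms": prompt_tokens * prefill_pool.ms_per_prompt_token,
        "decode_ms": output_tokens * decode_pool.ms_per_output_token,
        # Per-user interactivity is set by the decode pool's token latency.
        "tokens_per_sec_per_user": 1000.0 / decode_pool.ms_per_output_token,
    }

gpu = Pool("gpu-prefill", ms_per_prompt_token=0.002, ms_per_output_token=25.0)
lpu = Pool("lpu-decode", ms_per_prompt_token=0.01, ms_per_output_token=3.3)

r = serve_request(prompt_tokens=4096, output_tokens=512,
                  prefill_pool=gpu, decode_pool=lpu)
print(f"{r['tokens_per_sec_per_user']:.0f} tok/s per user")  # 303 tok/s per user
```

Routing decode to the low-latency pool is what moves a user from the ~40 tok/s a 25 ms/token decoder would deliver into the premium band, which is the whole economic argument of the post.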
NVIDIA’s Vera Rubin platform is no longer just a GPU; it is a seven-chip unified system designed to turn traditional data centers into "AI Factories." By integrating the GPU, HBM4, and specialized DPU/NIC chips into a single "Rubin Tray," NVIDIA has created a hardware stack that can manufacture intelligence at industrial scale. This architectural change marks the end of the general-purpose data center in favor of specialized, AI-dedicated infrastructure. #NVIDIA #TechGiants #ChipMaker #VeraRubin #AIFactory #DataCenter #DataCenters #Innovation #Semiconductors #HardwareNews #AI #Chips #AIChips #TechnologyNews https://lnkd.in/g5qaP3Zg
NVIDIA spent $20 billion on a chip that isn't a GPU.

On Christmas Eve 2025, they acquired Groq. Not for its brand. For its entire patent portfolio, its team, and a chip designed for one thing only: inference.

Three months later at GTC 2026, Jensen Huang unveiled the Groq 3 LPU. 150 TB/s memory bandwidth per chip. Deterministic execution. No CUDA. No HBM. Pure SRAM.

The GPU company just admitted GPUs alone can't win the inference era. We broke down the deal, the architecture, the 35x throughput claim, and what this actually means for teams renting GPUs today. Every claim sourced. NVIDIA projections labeled. Independent benchmarks cited separately.

Full analysis: https://lnkd.in/g3BUqEkm

#NVIDIA #GTC2026 #AI #MachineLearning #Inference
NVIDIA Spent $20 Billion Because GPUs Alone Can't Win the Inference Era | Barrack AI (blog.barrack.ai)
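One way to see why the SRAM bandwidth figure matters: during decode, each generated token must stream the model's weights through the memory system at least once, so per-replica throughput is roughly capped at bandwidth divided by weight bytes. A back-of-envelope roofline sketch follows; only the 150 TB/s figure comes from the post, while the 8 TB/s HBM-class comparison and the 70B FP8 model are illustrative assumptions, and it ignores that an SRAM-based design must shard weights across many chips.

```python
def max_decode_tokens_per_sec(bandwidth_bytes_per_s: float,
                              weight_bytes: float) -> float:
    """Roofline-style upper bound on single-stream decode throughput:
    every token streams the full weight set through memory once."""
    return bandwidth_bytes_per_s / weight_bytes

TB = 1e12

# 70B parameters at 1 byte/param (FP8) -- an assumed workload.
weights = 70e9 * 1.0

hbm_bound = max_decode_tokens_per_sec(8 * TB, weights)    # HBM-class GPU
sram_bound = max_decode_tokens_per_sec(150 * TB, weights)  # quoted SRAM figure
print(round(hbm_bound), round(sram_bound))  # 114 2143
```

The absolute numbers are not benchmarks, but the ratio shows why a bandwidth-rich SRAM design targets the token-generation phase specifically, which is the architectural bet the post describes.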
NVIDIA Vera Rubin is opening the next frontier of AI. #NVIDIAGTC news: The Vera Rubin platform’s seven chips are now in full production to scale the world’s largest AI factories. Vera CPU, Rubin GPU, NVLink 6, ConnectX-9, BlueField-4, Spectrum-6 and Groq 3 work together as one AI supercomputer powering every phase of AI. https://bit.ly/4sWu4xX
✍️ Inference: '..News Summary: The NVIDIA Vera Rubin platform is opening the next AI frontier with:
- Vera Rubin NVL72 GPU racks
- Vera CPU racks
- NVIDIA Groq 3 LPX inference accelerator racks
- NVIDIA BlueField-4 STX storage racks
- NVIDIA Spectrum-6 SPX Ethernet racks

GTC: NVIDIA today announced the NVIDIA Vera Rubin platform is opening the next frontier of agentic AI, with seven new chips now in full production to scale the world’s largest AI factories. The platform brings together the NVIDIA Vera CPU, NVIDIA Rubin GPU, NVIDIA NVLink™ 6 Switch, NVIDIA ConnectX®-9 SuperNIC, NVIDIA BlueField®-4 DPU and NVIDIA Spectrum™-6 Ethernet switch, as well as the newly integrated NVIDIA Groq 3 LPU. Designed to operate together as one incredible AI supercomputer, the chips power every phase of AI — from massive-scale pretraining, post-training and test-time scaling to real-time agentic inference.

“Vera Rubin is a generational leap — seven ...' - Extract

#skdscans #nvidianews #chips
The NVIDIA Vera Rubin platform is opening the next frontier of AI. At #NVIDIAGTC, we announced that the NVIDIA Vera Rubin platform's seven new chips are in full production to scale the world’s largest AI factories. The platform brings together the Vera CPU, Rubin GPU, NVLink 6 Switch, ConnectX-9 SuperNIC, BlueField-4 DPU, and Spectrum-6 Ethernet switch, as well as the newly integrated NVIDIA Groq 3 LPU, to operate together as one incredible AI supercomputer to power every phase of #AI. Learn more: https://bit.ly/4toF3Qf #NVIDIAVeraRubin
Fresh from NVIDIA GTC 2026: Docker Model Runner now runs on the NVIDIA DGX Station GB300. Here's why this matters for developers and teams:
→ 748 GB of coherent memory means you can run trillion-parameter models without quantization tricks
→ NVIDIA MIG support means one DGX Station can serve as a shared AI dev node for your whole team, with each person getting a sandboxed model endpoint
→ 7.1 TB/s bandwidth means switching between models in agentic pipelines is dramatically faster

We're entering an era where "I need cloud GPUs" is no longer the default answer. At Collabnix, we've been tracking Docker Model Runner since its early days, and the pace of iteration has been remarkable. From local inference on a laptop to serving teams from a desk-side supercomputer, all with the same `docker model pull` workflow.

If you haven't tried Docker Model Runner yet, this is a great time to start: https://lnkd.in/gNmnyBxv
Full blog post 👇 https://lnkd.in/g94edqhp
#Docker #NVIDIA #DGXStation #AI #Agents #MLOps #CloudNative #Collabnix
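A quick back-of-envelope check of what the 748 GB coherent-memory figure actually buys. This is a sketch under stated assumptions: only the 748 GB number comes from the post, while the model sizes, precisions, and KV-cache allowance below are illustrative, and real deployments add activation and framework overhead.

```python
def fits(params_billions: float, bytes_per_param: float,
         kv_cache_gb: float, memory_gb: float = 748.0) -> bool:
    """Rough memory-fit check for serving a model from one coherent
    memory pool: weights plus KV cache must fit; overheads ignored."""
    weights_gb = params_billions * bytes_per_param
    return weights_gb + kv_cache_gb <= memory_gb

# Illustrative checks (assumed model sizes and precisions):
print(fits(671, 1.0, kv_cache_gb=40))   # True: 671B at FP8 (~671 GB) fits
print(fits(1000, 1.0, kv_cache_gb=40))  # False: 1T at FP8 needs ~1 TB of weights
print(fits(1000, 0.5, kv_cache_gb=40))  # True: 1T at FP4 (~500 GB) fits
```

As the sketch suggests, whether a trillion-parameter model fits in 748 GB depends on the serving precision, so it is worth running this arithmetic for your own model before assuming a single desk-side node replaces cloud GPUs.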
GPUs are table stakes now. How you orchestrate and scale them is the real differentiator.