NVIDIA GTC 2025: GPU, Tokens, Collaborations

Mar 20, 2025

While data is the new oil, “tokens” are the new currency in which information is retrieved or generated. This tokenization process, converting data into usable information inferenced through trained AI models, is driving the AI economy. It also requires a significant amount of compute processing and thus power. Compute resources are applied in different ways, following the three scaling laws of pre-training, post-training and test-time scaling (inference), as AI “reasoning” models become more complex and demand greater accuracy.

As we move to the agentic era, organizations will need to follow this scaling process at every step, from training to inferencing, for their models to reason effectively at scale. Jensen’s vision and announcements at NVIDIA GTC 2025 focused on building “AI Factories” across industries, from Enterprise IT to Cloud and Robotics.
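The compute implications of these scaling laws can be sketched with back-of-the-envelope arithmetic. The sketch below uses the widely cited rules of thumb of roughly 6 × parameters × tokens FLOPs for training and 2 × parameters FLOPs per generated token for inference; all constants are illustrative assumptions, not figures from NVIDIA.

```python
# Rough compute estimates showing why test-time scaling (reasoning)
# multiplies inference demand. Rule-of-thumb approximations:
#   training FLOPs  ~ 6 * params * training_tokens
#   inference FLOPs ~ 2 * params * generated_tokens
# All model sizes and token counts below are illustrative assumptions.

def training_flops(params: float, tokens: float) -> float:
    return 6 * params * tokens

def inference_flops(params: float, tokens_generated: float) -> float:
    return 2 * params * tokens_generated

# Hypothetical 70B-parameter model trained on 15T tokens:
train = training_flops(70e9, 15e12)       # ~6.3e24 FLOPs, a one-off cost

# A reasoning model "thinking" through long chains of thought may emit
# far more tokens per query than a plain chat model, multiplying
# per-query inference compute proportionally:
plain = inference_flops(70e9, 1e3)        # ~1k output tokens
reasoning = inference_flops(70e9, 50e3)   # ~50k "thinking" tokens
print(reasoning / plain)                  # 50x more compute per query
```

The asymmetry is the point: training cost is paid once, but reasoning-style inference cost recurs on every query, which is why test-time scaling drives sustained demand for AI-factory compute.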

Source: NVIDIA & Counterpoint Research

To make AI factories successful, NVIDIA continued innovating and offering a full AI stack, including silicon, system and software to accelerate and scale AI with the highest levels of efficiency. The company’s approach scales across Agentic AI and Physical AI. NVIDIA made the following announcements across its stack:

Silicon: Significant announcements from Compute Roadmap to Silicon Photonics

Source: NVIDIA

  • NVIDIA’s silicon portfolio spans CPU, GPU and Networking (for scale-up and scale-out).

  • NVIDIA announced its latest Blackwell Ultra AI Factory platform, the GB300 NVL72, with 1.5x more AI performance than its GB200 NVL72.

  • NVIDIA shared its silicon roadmap so the industry can plan its CAPEX investments prudently, procuring Blackwell systems now while planning upgrades from Hopper to Rubin or Feynman in the coming years.

  • Rubin and Rubin Ultra sport two-reticle and four-reticle-size GPUs offering up to 50 PF and 100 PF at FP4, with 288GB of HBM4 and 1TB of HBM4e, launching in the second half of 2026 and 2027, respectively.

  • The new Vera CPU sports 88 custom cores built on top of Arm designs, with more memory and bandwidth and 2x the performance of the company’s Grace CPU while consuming just 50W. Vera will follow a two-year cadence, as we saw with Grace.

  • Further, for scale-out, NVIDIA’s new Spectrum-X Photonics and co-packaged optics (CPO) networking switches will help AI factories scale out to millions of GPUs.

  • NVIDIA is fusing silicon photonics directly into switches with the industry’s first 3D-stacked Silicon Photonics Engine using a TSMC process to save megawatts of power compared to traditional approaches, solving a big bottleneck.
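To ground the FP4 figures above: FP4 in the E2M1 format can represent only 16 values, which is what allows tensor cores to pack far more operations per watt than FP16 or FP8. The sketch below quantizes weights onto the eight non-negative E2M1 magnitudes; the sample weights and scale are illustrative assumptions, not NVIDIA data.

```python
# Sketch of FP4 (E2M1) quantization, the precision behind the Rubin
# PF-at-FP4 figures. E2M1 has 1 sign bit, 2 exponent bits, 1 mantissa
# bit, giving these 8 non-negative representable magnitudes:
FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def quantize_fp4(x: float, scale: float = 1.0) -> float:
    """Round x/scale to the nearest FP4 value, then rescale."""
    v = x / scale
    sign = -1.0 if v < 0 else 1.0
    mag = min(abs(v), 6.0)                       # clamp to the FP4 range
    q = min(FP4_GRID, key=lambda g: abs(g - mag))  # nearest grid point
    return sign * q * scale

# Illustrative weights only:
weights = [0.12, -0.47, 0.85, 2.6]
print([quantize_fp4(w) for w in weights])   # [0.0, -0.5, 1.0, 3.0]
```

In practice a per-block scale factor keeps the coarse grid usable, and the payoff is that a 4-bit operand moves a quarter of the data of FP16, so the same silicon and power budget delivers several times the token throughput.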


Additional Reading: Competitive Dynamics in HBM


Analyst Takeaways:

  • From the silicon perspective, NVIDIA is leading the industry with AI compute momentum to offer a “foundational platform” for AI.

  • Jensen opening up the roadmap to the industry empowers partners to confidently plan and build AI infrastructure for the next five years to tackle all forms of AI efficiently.

  • The scalability of the architecture across precisions, from FP64 down to FP4, is the key driver for NVIDIA to tackle AI from training to inferencing.

  • NVIDIA could become the largest customer for TSMC in the coming years as the complex “AI Reasoning” models drive more compute demand alongside the move to Physical AI, thereby growing demand for efficient tokenization and accuracy.

Systems: From Blackwell Ultra DGX SuperPOD to DGX Spark to Accelerate Inferencing

Source: NVIDIA

  • To offer FP4 precision and faster AI reasoning, accelerating token generation, NVIDIA announced its latest DGX SuperPOD infrastructure built with Blackwell Ultra GPUs.

  • This kickstarts enterprises’ ability to run ever-growing workloads, from generative and agentic to physical AI, which demand significant computing resources across all three scaling laws.

  • The DGX SuperPOD GB300 sports 36 NVIDIA Grace CPUs and 72 NVIDIA Blackwell Ultra GPUs, offering 70x more AI performance than Hopper-based systems, along with fifth-generation NVLink and a massive shared-memory system via an NVLink Switch.

  • This is powered by 18 NVIDIA BlueField-3 DPUs paired with NVIDIA Quantum-X800 InfiniBand or NVIDIA Spectrum-X Ethernet and can scale up to thousands of NVIDIA GB Ultra chips.

  • Also announced was Blackwell RTX Pro for workstations, to drive AI-powered visualization, simulation, scientific research and design workloads.

  • With DGX Spark (announced as NVIDIA Digits at CES) and DGX Station, developed in partnership with MediaTek, NVIDIA introduced a 1 PF “personal” supercomputer powered by the GB10 superchip and optimized for a desktop form factor for fine-tuning and inference.

Analyst Takeaways:

  • From a systems perspective, these announcements offer a broad range of compute systems for vendors from Dell and Lenovo to white-box players such as QCT, Inventec, Compal and Wistron, enabling them to offer a powerful range of AI solutions and capture value on top.

  • Jensen’s view of the future of computing is embodied in the DGX Station, which brings powerful AI-native computing to the mainstream. We believe it is a powerful offering for developers, creators and researchers to leverage NVIDIA’s compute solutions and related ecosystem, enabling them to build and consume AI experiences natively.

  • This could eventually challenge the Mac Studio market directly as an alternative high-performance solution.

Software: NVIDIA Dynamo, Halos, Aerial, Omniverse, Cosmos, CUDA-X Foundational for Building AI Factories Across Industries


NVIDIA Dynamo

Source: NVIDIA

  • Software and algorithms are the cornerstones enabling NVIDIA’s silicon to be “programmable” and to “efficiently manage AI workloads.”

  • NVIDIA has built a series of platforms to help develop and consume different types of AI, from perception to generative to agentic to physical across industries including enterprise, cloud, automotive, and robotics.

  • NVIDIA Dynamo was one of the most important announcements: an open-source software framework, paired with NVIDIA silicon, to maximize the programmability and efficiency of inferencing AI reasoning models.

  • Dynamo enables multi-GPU scheduling, supports disaggregated serving for independent tuning of each phase of a model, provides an inference-optimized library, and intelligently offloads and loads inference data, all reducing server costs and response times and thereby boosting the user experience.

  • Custom reasoning AI is the next frontier, and running popular reasoning models, such as DeepSeek-R1 with Dynamo, is estimated to boost token generation by over 30x per GPU.
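The core idea behind disaggregated serving can be sketched in a few lines. The toy model below is a conceptual illustration, not Dynamo’s actual API: it splits inference into a compute-bound prefill phase (prompt processing) and a memory-bound decode phase (token generation) running on separate GPU pools, so each can be sized independently. All throughput numbers are hypothetical.

```python
# Conceptual sketch of disaggregated serving: prefill and decode run on
# separate GPU pools that are scaled independently. Toy model with
# hypothetical per-GPU token throughputs; not Dynamo's real interface.
from dataclasses import dataclass

@dataclass
class Request:
    prompt_tokens: int
    output_tokens: int

def serve(requests, prefill_gpus: int, decode_gpus: int,
          prefill_tps: float = 10_000, decode_tps: float = 1_000):
    """Return rough time (seconds) spent in each phase across its pool."""
    prefill_work = sum(r.prompt_tokens for r in requests)
    decode_work = sum(r.output_tokens for r in requests)
    prefill_time = prefill_work / (prefill_tps * prefill_gpus)
    decode_time = decode_work / (decode_tps * decode_gpus)
    return prefill_time, decode_time

# 64 identical requests; the slow, memory-bound decode phase gets the
# larger GPU pool, which a shared monolithic pool could not arrange:
reqs = [Request(prompt_tokens=2048, output_tokens=512) for _ in range(64)]
print(serve(reqs, prefill_gpus=2, decode_gpus=6))
```

Because the two phases stress hardware differently, disaggregation lets an operator rebalance GPUs between them as traffic shifts, which is one source of the cost and latency gains claimed for reasoning workloads.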



Additional Reading: DeepSeek's New Efficient AI Models Challenge Industry's High-cost Paradigm


NVIDIA Halos

Source: NVIDIA

  • Safety is paramount to the functioning of autonomous vehicles.

  • At GTC, NVIDIA launched an umbrella platform, Halos, which brings together NVIDIA’s holistic safety system across its broader stack: from the SoC and OS level to algorithms; from data creation, training and simulation to the surrounding ecosystem, including workflows, evaluations and constant feedback.

  • Across NVIDIA’s portfolio, this spans NVIDIA DGX (training data), Omniverse and Cosmos (simulation and synthetic data generation), and NVIDIA AGX Hyperion (Drive SoC, DriveOS and sensor fusion).

  • This has helped NVIDIA attract automotive OEMs (GM, Hyundai, Mercedes, Toyota, Tesla, Waymo, Li Auto and more), tier-1 suppliers (Magna, Lenovo, etc.) and ecosystem players (Uber Freight, Gatik, etc.), which are adopting this multi-computer AI and safety platform to build end-to-end AI factories for developing safe autonomous vehicles.

NVIDIA CUDA-X

Source: NVIDIA

  • NVIDIA released CUDA in 2006 and has spent nearly two decades since developing applications on it that take advantage of compute resources for accelerated computing.

  • NVIDIA CUDA-X is a series of microservices, GPU-accelerated libraries, tools, and cloud API packages built on top of NVIDIA CUDA. It is designed to help developers across industries, such as science, engineering and signal processing, to develop, deploy and scale AI applications optimized to use high-performance compute.

  • Already, more than one million developers are building AI applications using CUDA-X. It has become the default and most popular AI programming platform and toolkit.

  • NVIDIA announced CUDA-X is available for every industry.
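The appeal of the CUDA-X library model is that GPU acceleration arrives behind familiar APIs rather than requiring code rewrites. The sketch below runs on plain NumPy; CuPy, a third-party library built on CUDA-X components such as cuBLAS and cuSOLVER, mirrors the same API, so the commented one-line swap is the (hypothetical, machine-dependent) porting step.

```python
# Illustration of the CUDA-X value proposition: accelerated libraries
# behind drop-in APIs. This runs on CPU with NumPy; on a CUDA machine,
# swapping the import to CuPy (which dispatches to CUDA-X libraries
# like cuBLAS/cuSOLVER) would move the same code onto the GPU.
import numpy as np
# import cupy as np   # <- hypothetical one-line swap for GPU execution

def solve_system(n: int = 4) -> bool:
    """Solve A x = b and verify the residual; with CuPy, np.linalg.solve
    would be backed by GPU-accelerated dense solvers."""
    rng = np.random.default_rng(0)
    a = rng.standard_normal((n, n)) + n * np.eye(n)  # well-conditioned A
    b = rng.standard_normal(n)
    x = np.linalg.solve(a, b)
    return bool(np.allclose(a @ x, b))

print(solve_system())   # True
```

This drop-in pattern, repeated across domains (cuDNN for deep learning, cuFFT for signal processing, cuLitho for lithography, and so on), is what lets CUDA-X claim relevance "for every industry" without asking each industry to learn GPU programming.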

NVIDIA AERIAL – End-to-End AI-powered 6G Telco Stack

  • With the rise of autonomous networks and AI-driven RAN as we enter the 5.5G and 6G eras, NVIDIA doesn’t want to be left behind and is expanding its platform to the upcoming AI-native networks of the 6G era.

  • NVIDIA is collaborating with different industry players, including T-Mobile, Cisco, MITRE, ODC, Booz Allen Hamilton and more, to develop an AI-native 6G stack based on its AI Aerial platform, thereby expanding its work on software-defined-radio for RANs.

  • This includes an end-to-end approach, from AI-embedded radios, baseband, core, and digital twins of the entire network to AI-driven applications and services on top.

  • The announcement entails the company’s Aerial Omniverse Digital Twin Service, the Aerial Commercial Test Bed on NVIDIA MGX, and the NVIDIA Sionna 1.0 open-source library for 6G research.

Other announcements:

Source: NVIDIA

  • NVIDIA announced Isaac GR00T N1, the world’s first open foundation model for training humanoid robots with skills and reasoning. It has a dual-system architecture that lets it think slow for deliberate decision-making and fast to mirror human reflexes.

  • Developers can leverage Omniverse-Cosmos to train robots with synthetic data and fine tune their behavior using different scenarios.

  • Since robotics is all about physics, NVIDIA is also partnering with Google DeepMind and Disney Research to develop an open-source Newton Physics Engine enabling robots to learn how to handle complex tasks with greater precision and also accelerate robotics-related ML workloads.

  • For more details on NVIDIA Omniverse and Cosmos, and how they are driving Physical AI to enable robotics and autonomous systems, refer to the following analysis.


Watch: A Counterpoint Conversation with NVIDIA's Ali Kani on the Physical AI and Autonomous Vehicles.



Analyst Takeaways:

  • NVIDIA is technically a software company making the compute or silicon highly programmable to help industries build powerful accelerated applications and services.

  • The company’s CUDA/CUDA-X and NIM microservices, along with the above software platforms, enable companies to build powerful workflows, deliver tokenization efficiencies for HPC, enhance safety standards for autonomous vehicles, and improve software-defined intelligence in telco networks.

  • The different NVIDIA AI Blueprints, which are reference designs for AI workflows, enable many developers to accelerate their development of AI applications.

  • Its world foundation models for synthetic data generation and its fully customizable, open humanoid robot foundation model give further impetus to NVIDIA to create new industries, driving adoption of its three-computer solutions.

  • Software expertise is the biggest differentiator for NVIDIA and the glue binding its efforts that drive adoption of its highly capable silicon.

  • This gap in software expertise is something the competition is struggling to catch up to, and the gap widens with every GTC.


Author

Neil Shah
