NVIDIA CES Keynote: Vera Rubin Platform, Open AI Models, and “Thinking” Agentic Systems Drive Compute Surge

NVIDIA (NASDAQ:NVDA) used its CES keynote in Las Vegas to frame artificial intelligence as a dual platform shift that is reshaping both how software is built and the hardware it runs on. The company argued that the traditional five-layer computing stack is being reinvented: software is increasingly “trained” rather than programmed, runs on GPUs rather than CPUs, and AI-powered applications generate content dynamically “every single time.”

AI platform shifts and growing compute intensity

The presentation described a broad industry transition toward AI-driven development and deployment, including what the speaker called “test-time scaling,” or real-time “thinking,” alongside pre-training and post-training with reinforcement learning. The keynote tied these developments to rapidly rising compute demand across pre-training, reinforcement-learning post-training, and inference, as models generate more tokens and spend more time reasoning.
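
To make the scaling pressure concrete, a rough back-of-the-envelope calculation (our illustration, not a figure from the keynote) shows why “thinking” multiplies inference cost: a dense transformer spends roughly 2 FLOPs per parameter per generated token, so a model that emits thousands of reasoning tokens before answering consumes many times the compute of a direct answer.

```python
# Illustrative arithmetic only; the model size and token counts are assumptions.
PARAMS = 70e9                      # assumed dense model size
FLOPS_PER_TOKEN = 2 * PARAMS       # ~2 FLOPs per parameter per token

def inference_flops(answer_tokens: int, reasoning_tokens: int = 0) -> float:
    """Approximate FLOPs to generate an answer, with optional 'thinking' tokens."""
    return FLOPS_PER_TOKEN * (answer_tokens + reasoning_tokens)

direct = inference_flops(answer_tokens=500)
thinking = inference_flops(answer_tokens=500, reasoning_tokens=10_000)
print(f"direct answer:  {direct:.2e} FLOPs")
print(f"with reasoning: {thinking:.2e} FLOPs ({thinking / direct:.0f}x)")
```

Under these assumed numbers, a 10,000-token reasoning trace makes a single answer roughly 21 times more expensive, which is the dynamic the keynote pointed to when linking test-time scaling to compute demand.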

The keynote also pointed to the rise of “agentic systems” beginning in 2024 and proliferating in 2025—models that can reason, research, use tools, plan, and simulate outcomes. As an example of internal adoption, the speaker cited Cursor, which he said has changed software development workflows at NVIDIA. He also highlighted the growth of open models, including DeepSeek R1 as an open reasoning system, and said open models have “reached the frontier,” while describing them as still roughly six months behind the leading frontier systems.

Open model efforts and DGX Cloud infrastructure

NVIDIA said it has been building and operating its own AI supercomputers—DGX Clouds—not to become a cloud provider, but to support internal development of open models. The company described its work across multiple domains and emphasized making models, libraries, and in some cases data available openly.

Projects cited included:

  • Digital biology: La Proteina for generating proteins, OpenFold3 for predicting protein structure, and Evo2 for understanding and generating biological sequences.
  • AI physics and weather: Earth-2, along with FourCastNet and CorrDiff for weather prediction.
  • Foundation and robotics models: Cosmos as an “open world foundation model,” GR00T for humanoid robotics, and work in autonomous driving discussed later in the keynote.
  • Nemotron: described as a hybrid transformer/SSM model; the company referenced Nemotron-3 and said additional versions are expected “in the near future.”

NVIDIA also described its NeMo family of libraries—Physics NeMo, Clara NeMo, and BioNeMo—as end-to-end lifecycle systems for AI, spanning data processing, training, evaluation, guardrails, and deployment. The company positioned these tools as foundational to building AI agents and said several models top leaderboards for tasks including PDF retrieval and parsing, speech recognition, and retrieval/search.

Enterprise agentic AI partnerships and an example “blueprint”

In describing how AI applications are evolving, NVIDIA emphasized an architecture in which agents route tasks among multiple models—potentially across different clouds and edge environments—selecting the right model for each job. The keynote introduced the term “blueprints” to describe packaged frameworks for building these agentic applications and said they are being integrated into enterprise SaaS platforms.

During a demonstration, the speaker described building a personal assistant that used a DGX Spark system and combined a frontier-model API with a locally run open model for private email tasks, with an intent-based model router directing requests between them. The system was also connected to Hugging Face’s Reachy Mini robot and to ElevenLabs for voice.
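
A minimal sketch of that routing pattern, assuming hypothetical names throughout (Request, FrontierAPI, and LocalModel are our placeholders, not the demo’s actual components):

```python
# Sketch of intent-based routing between a hosted frontier model and a local
# open model. Purely illustrative; not the demo's actual implementation.
from dataclasses import dataclass

@dataclass
class Request:
    text: str
    contains_private_data: bool  # e.g., flagged email content

class FrontierAPI:
    """Stand-in for a hosted frontier-model endpoint."""
    def generate(self, prompt: str) -> str:
        return f"[frontier model reply to: {prompt!r}]"

class LocalModel:
    """Stand-in for an open model running locally (e.g., on a DGX Spark)."""
    def generate(self, prompt: str) -> str:
        return f"[local model reply to: {prompt!r}]"

def route(request: Request, frontier: FrontierAPI, local: LocalModel) -> str:
    # Private tasks stay on-device; everything else goes to the hosted model.
    model = local if request.contains_private_data else frontier
    return model.generate(request.text)

if __name__ == "__main__":
    print(route(Request("summarize my inbox", True), FrontierAPI(), LocalModel()))
    print(route(Request("plan a CES itinerary", False), FrontierAPI(), LocalModel()))
```

The design choice the keynote emphasized is that the router, not the user, picks the right model per task, which is what lets one assistant span clouds, edge devices, and private data.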

NVIDIA highlighted enterprise integrations and collaborations including Palantir, ServiceNow, Snowflake, CodeRabbit, CrowdStrike, and NetApp, arguing that agentic systems will become the interface layer for enterprise platforms—supplanting traditional UI paradigms by enabling multimodal interaction.

Physical AI: simulation, synthetic data, and autonomous driving

The keynote reiterated NVIDIA’s long-running focus on “physical AI,” describing a three-computer approach: systems for training, systems for inference at the edge (in robots, cars, factories), and systems for simulation. NVIDIA positioned Omniverse as its digital twin simulation environment, Cosmos as a world foundation model aligned with language, and robotics models including GR00T and a newly announced autonomous driving model.

NVIDIA argued that physical AI is constrained by limited real-world data and said the solution is synthetic data grounded in physics. The company described Cosmos as trained on “internet-scale video,” real driving and robotics data, and 3D simulation, and said Cosmos can generate physically plausible video and multi-camera environments from sources such as telemetry logs, scene descriptions, planning simulators, and prompts. The keynote said Cosmos has been downloaded “millions of times.”

NVIDIA then announced AlphaMyo, described as “the world’s first thinking, reasoning, autonomous vehicle AI.” The model was presented as trained end-to-end from camera input to actuation output, using a mix of human demonstration miles, Cosmos-generated miles, and “hundreds of thousands” of carefully labeled examples. NVIDIA said AlphaMyo not only outputs driving controls but also explains the actions it is about to take, the reasoning behind them, and the intended trajectory, and said the model is open-sourced.

The keynote discussed a partnership with Mercedes and said the first “NVIDIA-first” autonomous vehicle is expected on the road in the U.S. in Q1, in Europe in Q2, and in Asia in the second half of the year. The speaker also described a dual-stack approach to safety: AlphaMyo’s end-to-end model runs alongside a separate “classical AV stack” designed for traceability, with a safety policy evaluator selecting between them, as sketched below.
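
A minimal sketch of that arbitration pattern, under our own assumptions (the Trajectory fields, the safety test, and the fallback rule are illustrative placeholders, not NVIDIA’s implementation):

```python
# Two stacks each propose a trajectory; a safety policy evaluator picks one.
from typing import Callable, NamedTuple

class Trajectory(NamedTuple):
    source: str              # "end_to_end" or "classical"
    waypoints: list          # planned path (elided here)
    min_clearance_m: float   # worst-case distance to obstacles

def safety_policy_evaluator(candidates: list[Trajectory],
                            is_safe: Callable[[Trajectory], bool]) -> Trajectory:
    """Prefer the learned plan when it passes safety checks; otherwise
    fall back to the traceable classical stack."""
    for traj in sorted(candidates, key=lambda t: t.source != "end_to_end"):
        if is_safe(traj):
            return traj
    # Neither plan passed: degrade to the most conservative option.
    return max(candidates, key=lambda t: t.min_clearance_m)

learned = Trajectory("end_to_end", ["..."], min_clearance_m=0.4)
classical = Trajectory("classical", ["..."], min_clearance_m=1.2)
choice = safety_policy_evaluator([learned, classical],
                                 is_safe=lambda t: t.min_clearance_m >= 0.5)
print(choice.source)  # -> "classical" under these illustrative numbers
```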

New hardware: Vera Rubin in production and system-level redesign

NVIDIA also unveiled details of its next computing platform, Vera Rubin, named after astronomer Vera Rubin. The company said Vera Rubin is in “full production,” positioning it as necessary to meet surging AI compute needs driven by larger models, reinforcement learning post-training, and higher token generation from test-time scaling.

Key elements described included:

  • Vera CPU: described as a custom CPU with 88 cores and 176 threads using “Spatial Multi-Threading,” with the company claiming doubled performance over the prior generation and improved performance per watt.
  • Rubin GPU: described as delivering “5x Blackwell” floating-point performance with only about 1.6x the transistor count, enabled in part by NVFP4 Tensor Core technology.
  • Networking and interconnect: ConnectX-9 (1.6 Tb/s scale-out bandwidth per GPU) and BlueField-4 DPUs; NVLink 6 switch chips operating at 400 Gb/s, with the NVLink72 rack described as enabling GPUs to communicate at very high aggregate bandwidth.
  • Spectrum-X Ethernet Photonics: described as the “world’s first Ethernet switch with 512 lanes” and co-packaged optics, manufactured using a TSMC process NVIDIA called CoPoS.

The company emphasized “extreme co-design” across chips and systems to push performance beyond the limits implied by slowing transistor scaling, and described significant mechanical and thermal changes, including moving from complex cable-heavy assemblies to redesigned chassis intended to reduce assembly time. NVIDIA also said Vera Rubin operates with 45°C liquid-cooling water, avoiding the need for water chillers, and cited energy-efficiency benefits.

NVIDIA introduced a new rack-level approach to context memory management for AI inference, describing BlueField-4 as enabling high-speed storage of key-value (KV) cache near the compute. The keynote described “Dynamo” KV cache management and said each GPU could access additional context memory via rack-level resources.
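
The mechanics described suggest a tiered cache: KV blocks for idle or long-running sessions are evicted from GPU memory to a larger rack-level tier and restored on reuse, so long contexts need not be recomputed from scratch. The sketch below illustrates the pattern only; the KVStore interface is a hypothetical stand-in, not the actual Dynamo or BlueField-4 API.

```python
# Illustrative KV-cache offload pattern (requires Python 3.10+ for `bytes | None`).
class KVStore:
    """Stand-in for a rack-level KV-cache tier reachable at high speed."""
    def __init__(self):
        self._tier: dict[str, bytes] = {}

    def offload(self, session_id: str, kv_blocks: bytes) -> None:
        self._tier[session_id] = kv_blocks        # evict from GPU HBM to rack tier

    def fetch(self, session_id: str) -> bytes | None:
        return self._tier.pop(session_id, None)   # restore on session resume

store = KVStore()
store.offload("chat-42", b"serialized KV blocks for a long context")
blocks = store.fetch("chat-42")
# If blocks is not None, the engine reloads them instead of re-running
# prefill over the entire conversation history.
print(blocks is not None)
```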

Additional system claims included confidential computing features—encryption “in transit, at rest, and during compute”—and “power smoothing” to reduce the need for over-provisioning data center power to handle instantaneous spikes.
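
On power smoothing, the keynote gave the goal (provision for a smoothed envelope rather than raw instantaneous spikes) but not the mechanism; a toy ramp-rate limiter, entirely our illustration, conveys the idea:

```python
# Toy illustration of power smoothing: cap how fast facility-level power draw
# may ramp, so provisioning targets the smoothed envelope rather than raw peaks.
def smooth(draw_kw: list[float], max_step_kw: float) -> list[float]:
    """Limit per-interval power ramp to +/- max_step_kw."""
    out = [draw_kw[0]]
    for target in draw_kw[1:]:
        delta = max(-max_step_kw, min(max_step_kw, target - out[-1]))
        out.append(out[-1] + delta)
    return out

raw = [100, 400, 120, 390, 110]     # spiky draw from synchronized AI workloads
print(smooth(raw, max_step_kw=50))  # -> [100, 150, 120, 170, 120]
```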

Closing the keynote, NVIDIA described its role as delivering a full stack—from chips to infrastructure to models and applications—intended to enable partners and developers to build AI systems across enterprise software, robotics, and industrial simulation.

About NVIDIA (NASDAQ:NVDA)

NVIDIA Corporation, founded in 1993 and headquartered in Santa Clara, California, is a global technology company that designs and develops graphics processing units (GPUs) and system-on-chip (SoC) technologies. Co-founded by Jensen Huang, who serves as president and chief executive officer, along with Chris Malachowsky and Curtis Priem, NVIDIA has grown from a graphics-focused chipmaker into a broad provider of accelerated computing hardware and software for multiple industries.

The company’s product portfolio spans discrete GPUs for gaming and professional visualization (marketed under the GeForce and NVIDIA RTX lines), high-performance data center accelerators used for AI training and inference (including widely adopted platforms such as the A100 and H100 series), and Tegra SoCs for automotive and edge applications.
