CES 2026 showcases the debut of the NVIDIA Rubin platform, along with Azure’s proven readiness to deploy it.
Microsoft’s long-range datacenter strategy was engineered for moments precisely like this, where NVIDIA’s next-generation systems slot directly into infrastructure that anticipated their power, thermal, memory, and networking requirements years ahead of the industry. Our long-term collaboration with NVIDIA ensures Rubin fits directly into Azure’s forward-looking platform design.
Building with intent for the future
Azure’s AI datacenters are engineered for the future of accelerated computing. That enables seamless integration of NVIDIA Vera Rubin NVL72 racks across Azure’s largest next-gen AI superfactories, from current Fairwater sites in Wisconsin and Atlanta to future locations.
The newest NVIDIA AI infrastructure requires significant upgrades in power, cooling, and performance optimization; however, Azure’s experience with our Fairwater sites and multiple upgrade cycles over the years demonstrates an ability to flexibly enhance and expand AI infrastructure in step with advancements in technology.
Azure’s proven experience delivering scale and performance
Microsoft has years of market-proven experience in designing and deploying scalable AI infrastructure that evolves with each major advancement in AI technology. In lockstep with each successive generation of NVIDIA’s accelerated compute infrastructure, Microsoft rapidly integrates NVIDIA’s innovations and delivers them at scale. Our early, large-scale deployments of NVIDIA Ampere and Hopper GPUs, connected via NVIDIA Quantum-2 InfiniBand networking, were instrumental in bringing models like GPT-3.5 to life, while other clusters set supercomputing performance records, demonstrating we can bring next-generation systems online faster and with higher real-world performance than the rest of the industry.
We unveiled the first and largest implementations of both the NVIDIA GB200 NVL72 and NVIDIA GB300 NVL72 platforms, architected as racks combined into single supercomputers that train AI models dramatically faster, helping Azure remain a top choice for customers seeking advanced AI capabilities.
Azure’s systems approach
Azure is engineered for compute, networking, storage, software, and infrastructure all working together as one integrated platform. This is how Microsoft builds a durable advantage into Azure and delivers cost and performance breakthroughs that compound over time.
Maximizing GPU utilization requires optimization across every layer. In addition to Azure being able to adopt NVIDIA’s new accelerated compute platforms early, Azure’s advantages come from the surrounding platform as well: high-throughput Blob storage, proximity placement and region-scale design shaped by real production patterns, and orchestration layers like CycleCloud and AKS tuned for low-overhead scheduling at massive cluster scale.
Azure Boost and other offload engines clear IO, network, and storage bottlenecks so models scale smoothly. Faster storage feeds larger clusters, stronger networking sustains them, and optimized orchestration keeps end-to-end performance steady. First-party innovations reinforce the loop: liquid cooling Heat Exchanger Units maintain tight thermals, Azure hardware security module (HSM) silicon offloads security work, and Azure Cobalt delivers exceptional performance and efficiency for general-purpose compute and AI-adjacent tasks. Together, these integrations ensure the full system scales efficiently, so GPU investments deliver maximum value.
This systems approach is what makes Azure ready for the Rubin platform. We are delivering new systems and establishing an end-to-end platform already shaped by the requirements Rubin brings.
Operating the NVIDIA Rubin platform
NVIDIA Vera Rubin Superchips will deliver 50 PF of NVFP4 inference performance per chip and 3.6 EF of NVFP4 per rack, a five-times jump over NVIDIA GB200 NVL72 rack systems.
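A quick back-of-the-envelope check ties these figures together (assuming the "72" in NVL72 means 72 Superchips per rack, which is our reading rather than a stated spec):

```python
# Sanity-check the per-rack NVFP4 figure from the quoted per-chip number.
PFLOPS_PER_CHIP = 50    # NVFP4 inference, per Vera Rubin Superchip
CHIPS_PER_RACK = 72     # assumed from the NVL72 rack designation

rack_eflops = PFLOPS_PER_CHIP * CHIPS_PER_RACK / 1000   # PF -> EF
print(f"Rack NVFP4 throughput: {rack_eflops} EF")       # 3.6 EF, matching the quoted figure

# Baseline implied by the stated five-times jump over GB200 NVL72
gb200_rack_eflops = rack_eflops / 5
print(f"Implied GB200 NVL72 rack baseline: {gb200_rack_eflops:.2f} EF")
```

The numbers are self-consistent: 50 PF × 72 chips is exactly the 3.6 EF per rack quoted above.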
Azure has already incorporated the core architectural assumptions Rubin requires:
- NVIDIA NVLink evolution: The sixth-generation NVIDIA NVLink fabric expected in Vera Rubin NVL72 systems reaches ~260 TB/s of scale-up bandwidth, and Azure’s rack architecture has already been redesigned to operate with those bandwidth and topology advantages.
- High-performance scale-out networking: The Rubin AI infrastructure relies on ultra-fast NVIDIA ConnectX-9 1,600 Gb/s networking, delivered by Azure’s network infrastructure, which has been purpose-built to support large-scale AI workloads.
- HBM4/HBM4e thermal and density planning: The Rubin memory stack demands tighter thermal windows and higher rack densities; Azure’s cooling, power envelopes, and rack geometries have already been upgraded to handle those same constraints.
- SOCAMM2-driven memory expansion: Rubin Superchips use a new memory expansion architecture; Azure’s platform has already integrated and validated similar memory extension behaviors to keep models fed at scale.
- Reticle-sized GPU scaling and multi-die packaging: Rubin moves to massively larger GPU footprints and multi-die layouts. Azure’s supply chain, mechanical design, and orchestration layers have been pre-tuned for these physical and logical scaling characteristics.
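To put the first two assumptions in perspective, a short sketch compares in-rack (scale-up) and cross-rack (scale-out) bandwidth using the figures above. The one-ConnectX-9-port-per-GPU topology is a hypothetical assumption for illustration, not a published configuration:

```python
# Compare scale-up (NVLink fabric) vs. scale-out (ConnectX-9) bandwidth
# using the figures quoted above. The per-GPU NIC count is a hypothetical
# illustration; real deployment topologies may differ.
NVLINK_FABRIC_TBPS = 260                   # ~260 TB/s aggregate scale-up per rack
CX9_GBPS = 1600                            # ConnectX-9 port speed, Gb/s
GPUS_PER_RACK = 72                         # assumed from the NVL72 designation

cx9_tbps_per_port = CX9_GBPS / 8 / 1000    # Gb/s -> GB/s -> TB/s = 0.2 TB/s
rack_scale_out_tbps = cx9_tbps_per_port * GPUS_PER_RACK  # at 1 port per GPU

print(f"Scale-up fabric:        {NVLINK_FABRIC_TBPS} TB/s")
print(f"Scale-out (1 port/GPU): {rack_scale_out_tbps:.1f} TB/s")
print(f"Ratio: {NVLINK_FABRIC_TBPS / rack_scale_out_tbps:.0f}x")
```

Even under this generous scale-out assumption, the in-rack NVLink fabric carries roughly an order of magnitude more bandwidth, which is why rack topology and cooling are designed around keeping communication inside the NVLink domain.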
Azure’s approach to designing for next-generation accelerated compute platforms like Rubin has been proven over several years, including significant milestones:
- Operated the world’s largest commercial InfiniBand deployments across multiple GPU generations.
- Built reliability layers and congestion management techniques that unlock higher cluster utilization and larger job sizes than competitors, reflected in our ability to post industry-leading large-scale benchmarks (e.g., multi-rack MLPerf runs competitors have never replicated).
- AI datacenters co-designed with Grace Blackwell and Vera Rubin from the ground up to maximize performance and performance per dollar at the cluster level.
Design principles that differentiate Azure
- Pod exchange architecture: To enable rapid servicing, Azure’s GPU server trays are designed to be quickly swappable without requiring extensive rewiring, improving uptime.
- Cooling abstraction layer: Rubin’s multi-die, high-bandwidth components require sophisticated thermal headroom that Fairwater already accommodates, avoiding costly retrofit cycles.
- Next-gen power design: Vera Rubin NVL72 demands increasing watt density; Azure’s multi-year power redesign (liquid cooling loop revisions, CDU scaling, and high-amp busways) ensures immediate deployability.
- AI superfactory modularity: Microsoft, unlike other hyperscalers, builds regional supercomputers rather than singular megasites, enabling more predictable global rollout of new SKUs.
How co-design leads to customer benefits
The NVIDIA Rubin platform marks a major step forward in accelerated computing, and Azure’s AI datacenters and superfactories are already engineered to take full advantage. Years of co-design with NVIDIA across interconnects, memory systems, thermals, packaging, and rack-scale architecture mean Rubin integrates directly into Azure’s platform without rework. Rubin’s core assumptions are already reflected in our networking, power, cooling, orchestration, and pod exchange design principles. This alignment gives customers immediate benefits: faster deployment, faster scaling, and faster impact as they build the next era of large-scale AI.