Microsoft Azure has launched the first large-scale production cluster featuring over 4,600 NVIDIA GB300 NVL72 units with Blackwell Ultra GPUs, built to handle the growing demands of OpenAI and other frontier AI workloads. This marks the start of Microsoft’s global rollout of large clusters designed for faster model training and higher efficiency across its AI datacenters.
The new ND GB300 v6 virtual machines (VMs) deliver powerful computing performance with 72 GPUs per rack, 37TB of fast memory, and up to 1,440 petaflops of processing capacity. The system uses NVIDIA’s latest InfiniBand Quantum-X800 network for faster data transfers, along with NVLink and NVSwitch for optimized memory and bandwidth. These improvements help train massive AI models in weeks instead of months.
Azure’s advanced cooling and power systems support the cluster’s high energy density while minimizing water usage. Together with NVIDIA, Microsoft is building the infrastructure needed for the next generation of AI, enabling more responsive, scalable, and efficient model development worldwide.
Leave a comment