CoreWeave has become one of the first cloud providers to bring NVIDIA GB200 NVL72 systems online at scale, enabling cutting-edge AI companies — including Cohere, IBM, and Mistral AI — to train and deploy the next generation of AI models and applications.
As the first cloud provider to make NVIDIA Grace Blackwell GPUs generally available, CoreWeave has already demonstrated breakthrough performance in MLPerf benchmarks with the GB200 NVL72 — a powerful rack-scale accelerated computing platform built for AI inference and agentic workloads. Now, thousands of NVIDIA Blackwell GPUs are operational within CoreWeave’s infrastructure, unlocking new levels of performance and scalability for its customers.
“We work closely with NVIDIA to rapidly deliver the latest and most powerful solutions for AI training and inference,” said Mike Intrator, CEO of CoreWeave. “With the new Grace Blackwell rack-scale systems now live, our customers are among the first to experience the performance gains of AI innovation at massive scale.”
These Blackwell-powered systems are transforming cloud data centers into AI factories — converting raw data into real-time intelligence with unprecedented speed, accuracy, and efficiency.

Personalized AI Agents
Cohere is leveraging Grace Blackwell Superchips to develop secure, enterprise-grade AI applications using its proprietary platform, North. This platform enables teams to build personalized AI agents that automate workflows, surface real-time insights, and more.
Thanks to CoreWeave’s GB200 NVL72 infrastructure, Cohere is already seeing up to a 3x improvement in training performance for 100-billion-parameter models compared with previous-generation NVIDIA Hopper GPUs, even without Blackwell-specific optimizations.
By utilizing the GB200 NVL72’s unified memory, FP4 precision, and tightly integrated 72-GPU NVLink fabric, Cohere is achieving significantly higher throughput and lower token-generation latency, resulting in more cost-effective inference.
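FP4 here refers to 4-bit floating-point inference, which Blackwell Tensor Cores support natively; the commonly used E2M1 encoding can represent only a handful of magnitudes, with per-block scale factors recovering dynamic range. As a rough illustration of the idea (not Cohere’s or NVIDIA’s actual implementation), a minimal sketch of rounding values to the nearest FP4-representable number:

```python
import numpy as np

# Magnitudes representable by the FP4 E2M1 format (2 exponent bits,
# 1 mantissa bit). Real hardware pairs these with per-block scaling
# factors; this sketch ignores scaling for simplicity.
FP4_VALUES = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_fp4(x: np.ndarray) -> np.ndarray:
    """Round each element to the nearest FP4-representable value."""
    signs = np.sign(x)
    mags = np.abs(x)
    # For every input magnitude, pick the closest entry in FP4_VALUES.
    idx = np.argmin(np.abs(mags[..., None] - FP4_VALUES), axis=-1)
    return signs * FP4_VALUES[idx]

weights = np.array([0.3, -1.2, 2.7, 5.1])
print(quantize_fp4(weights))  # nearest FP4 values: 0.5, -1.0, 3.0, 6.0
```

Each value occupies 4 bits instead of 16, which is where the memory and bandwidth savings, and hence the inference cost advantage, come from.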
“With access to some of the first NVIDIA GB200 NVL72 systems in the cloud, we’re thrilled with how easily our workloads transition to the Grace Blackwell architecture,” said Autumn Moulder, VP of Engineering at Cohere. “This unlocks tremendous performance efficiency across our stack — from single-GPU inference to massive training jobs spanning thousands of GPUs. We expect even greater gains as we continue to optimize.”

Enterprise AI at Scale
IBM is harnessing one of the first large-scale GB200 NVL72 deployments — spanning thousands of GPUs on CoreWeave — to train its next-generation Granite models, a family of open-source, enterprise-ready AI models designed for performance, safety, and efficiency.
Granite models underpin tools like IBM watsonx Orchestrate, which allows businesses to build AI agents that streamline enterprise workflows. IBM’s use of CoreWeave’s infrastructure is further enhanced by the IBM Storage Scale System, providing high-performance storage specifically designed for large-scale AI workloads.
“We’re excited by the acceleration NVIDIA GB200 NVL72 brings to our Granite model training,” said Sriram Raghavan, VP of AI at IBM Research. “This collaboration with CoreWeave empowers us to build high-performance, cost-effective models that drive enterprise and agentic AI solutions through IBM watsonx.”

Open-Source Innovation
Mistral AI, the Paris-based open-source AI leader, is also tapping into CoreWeave’s new infrastructure — bringing online its first 1,000 Blackwell GPUs to build the next wave of powerful language models.
Mistral is using GB200 NVL72 systems equipped with NVIDIA Quantum InfiniBand networking to accelerate the development and deployment of models like Mistral Large, known for its strong reasoning capabilities.
“Right out of the box and without further tuning, we observed a 2x performance boost for dense model training,” said Timothée Lacroix, Co-founder and CTO of Mistral AI. “What’s exciting about NVIDIA GB200 NVL72 is the new frontier it opens for both model development and inference.”

Scaling Toward the Future
Beyond dedicated customer deployments, CoreWeave now offers cloud instances featuring full rack-scale NVIDIA NVLink connectivity across 72 Blackwell GPUs and 36 Grace CPUs, all connected via NVIDIA Quantum-2 InfiniBand networking. These clusters can scale to as many as 110,000 GPUs, providing the raw compute needed to drive the next era of AI reasoning and intelligent agents.
With the NVIDIA GB200 NVL72 platform at its core, CoreWeave is enabling AI developers to build faster, scale further, and innovate without limits.