AI cloud infrastructure has split into two distinct tiers: the traditional hyperscalers (AWS, Azure, GCP) offering broad general-purpose clouds, and a newer breed of neocloud providers (CoreWeave, Lambda Labs, Nebius, Crusoe, Nscale) purpose-built around GPU-first architecture for AI workloads. The defining difference is specialization β neoclouds strip away the overhead of legacy services to deliver 40β66% lower GPU costs and faster access to the latest accelerators. The single most important mental model is that AI infrastructure is no longer measured in FLOPS but in tokens per second per dollar, and every architectural choice β from interconnect topology to quantization format β flows from that optimization target.
What This Cheat Sheet Covers
This topic spans 16 focused tables and 107 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.
Table 1: Neocloud Providers β Key Players and Differentiators
The neocloud market is no longer a fringe alternative to hyperscalers β Forrester projected neocloud revenue hitting 20 billion in 2026, with Microsoft alone committing 60+ billion across multi-year neocloud contracts. Understanding each provider's focus, scale, and differentiation is the first step in selecting infrastructure for AI workloads.
| Provider | Example | Description |
|---|---|---|
Kubernetes Namespace Capacity; 8Γ H100 nodes with InfiniBand; $21B Meta contract | The largest neocloud; sells GPU capacity as Kubernetes namespace compute rather than traditional VMs; Q1 2026 revenue ~1B; 99.4B backlog; NVIDIA-backed. | |
1-click H100/B200 clusters with InfiniBand; bare metal instances; on-prem private clusters | Positions as the "AI developer cloud" with one-click cluster provisioning; launch partner for NVIDIA Vera CPU and Quantum-X800 InfiniBand; offers bare metal and on-prem private clusters. | |
H100/H200 cloud; $3.50/hr H200; API + Terraform control; Finland data center | Amsterdam-based full-stack AI cloud; projected 750Mβ1B ARR; InfiniBand networking with hourly and reserved billing; building gigawatt-scale AI factories with NVIDIA. | |
Energy-first H100/H200 clusters; Stargate OpenAI Abilene campus (200MW Phase 1) | Energy-first neocloud leveraging stranded natural gas and renewables at 30β50% lower energy costs than hyperscalers; revenues growing from 276M (2024) toward 2B (2026 est.). |