Geographically Distributed Hyperscale AI Fabrics

Despoina Triantafyllidou
05 May 2026
Artificial Intelligence
Data Center

Context 

Hyperscale cloud providers are increasingly moving away from the assumption that a data centre is a single, self-contained physical entity. An emerging architectural trend is to extend the hyperscale data centre core across multiple, geographically separated locations. This shift is driven by a combination of resilience, scalability, operational efficiency, and economic considerations, and is beginning to reshape the topology of AI network fabrics.

From single-site data centre to geo-distributed fabric core 

Traditionally, hyperscale data centres are deployed on a single site. Their architecture is optimised for east-west traffic and for high-bandwidth, low-latency communication within one physical location. This design favours performance inside the data centre, but it offers few options for disaster recovery or large-scale fault isolation: a single-site data centre has limited failover potential and exposes AI workloads to failures at the facility level.

To address these limitations, hyperscalers are now extending the AI fabric across multiple data centre locations. Architecturally, this is reflected in a transition from classic two-layer leaf–spine topologies to three-layer leaf–spine–superspine designs. In this extended-core model, individual data centres retain their frontend and internal leaf–spine fabrics, while a superspine layer interconnects multiple sites using data centre interconnect (DCI) links that typically traverse wide area networks (WANs). 
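The extended-core model can be illustrated with a small graph sketch. The code below is only an illustration under stated assumptions: the site names, node counts, and the helper names `build_fabric` and `hop_count` are hypothetical, and every spine-to-superspine edge stands in for a DCI link. It shows how a path between leaves in different sites gains two extra hops compared with a path that stays inside one site.

```python
from collections import deque


def build_fabric(sites, leaves_per_site=2, spines_per_site=2, superspines=2):
    """Return an adjacency dict for a leaf-spine-superspine fabric (illustrative sizes)."""
    adj = {}

    def link(a, b):
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    supers = [f"ss{k}" for k in range(superspines)]
    for site in sites:
        spines = [f"{site}-spine{j}" for j in range(spines_per_site)]
        for i in range(leaves_per_site):
            for spine in spines:
                link(f"{site}-leaf{i}", spine)  # intra-site leaf-spine mesh
        for spine in spines:
            for ss in supers:
                link(spine, ss)  # assumed DCI uplink over the WAN
    return adj


def hop_count(adj, src, dst):
    """Breadth-first search: number of links on a shortest path, or None."""
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


fabric = build_fabric(["dc1", "dc2"])
intra = hop_count(fabric, "dc1-leaf0", "dc1-leaf1")  # leaf, spine, leaf
inter = hop_count(fabric, "dc1-leaf0", "dc2-leaf0")  # crosses the superspine layer
print(intra, inter)
```

The two additional hops on the inter-site path are the ones that cross the WAN, which is why their latency and loss characteristics matter more than the hop count itself.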

This evolution effectively turns the WAN into an integral part of the AI fabric rather than a peripheral transport layer, with direct implications for performance, reliability, and traffic engineering. 

Use cases driving geo-distributed AI fabrics 

Several use cases are accelerating the adoption of geographically distributed hyperscale AI fabrics. 

Resilience and failover 

The scope of failover is expanding from individual switch or link failures to failover of an entire data centre. Beyond basic redundancy, hyperscalers are interested in intelligent traffic engineering across sites, so that AI workloads can be redistributed in a controlled way in response to faults, congestion, or policy decisions.

Load balancing 

For large-scale AI systems, distributing traffic across multiple data centres also allows for additional load balancing options. This supports more efficient utilisation of network and compute resources, while reducing hotspots and improving overall system robustness. 

GPU overflow and capacity extension 

GPU availability is often a constraint for AI workload execution. Geo-distributed data centres allow hyperscalers to burst or overflow asynchronous workloads, such as inference or asynchronous training (which certain MoE models allow), into remote data centres when local compute is exhausted. For this purpose, geographically separate GPU pools are treated as a shared resource.
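The overflow idea can be sketched as a simple placement rule. This is a minimal illustration, not a description of any hyperscaler's scheduler: the `GpuPool` and `Job` classes, the `place` function, and the pool sizes are all hypothetical. The one assumption taken from the text is that only asynchronous, latency-tolerant jobs may overflow to a remote site.

```python
from dataclasses import dataclass


@dataclass
class GpuPool:
    name: str
    free_gpus: int


@dataclass
class Job:
    name: str
    gpus: int
    latency_tolerant: bool  # True for async work such as batch inference


def place(job, local, remotes):
    """Return the name of the pool chosen for the job, or None if it must queue."""
    if local.free_gpus >= job.gpus:
        local.free_gpus -= job.gpus
        return local.name
    if job.latency_tolerant:
        # Overflow: only latency-tolerant jobs may cross the DCI links.
        for pool in remotes:
            if pool.free_gpus >= job.gpus:
                pool.free_gpus -= job.gpus
                return pool.name
    return None


local = GpuPool("dc1", free_gpus=8)
remotes = [GpuPool("dc2", free_gpus=64)]
placements = [
    place(Job("sync-train", 8, latency_tolerant=False), local, remotes),
    place(Job("batch-infer", 16, latency_tolerant=True), local, remotes),
    place(Job("sync-train-2", 8, latency_tolerant=False), local, remotes),
]
print(placements)
```

In this sketch the second synchronous job queues rather than overflowing, reflecting the point that tightly coupled training is far more sensitive to WAN latency than asynchronous work.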

Hyperscale-across scenarios  

Looking forward, similar principles may be applied across multiple hyperscale clouds. While still nascent, interconnecting AI fabrics across different cloud providers opens the door to multi-hyperscaler architectures for resilience, regulatory, or commercial reasons, for example multi-cloud networking as a managed service. 

Example: https://www.sdxcentral.com/news/aws-and-google-cloud-unite-for-hyperscale-across-cloud/ 

Power, cooling, and economic drivers 

Power consumption is one of the main drivers behind this architectural shift. AI data centres face exceptionally high and sustained power demand, with energy costs representing a significant recurring OpEx. It is difficult to secure uninterrupted power blocks of 100 MW or more at a single site, which alone is a driver for geo-distribution.

Geo-distributed AI fabrics create new opportunities to optimise around power economics. Hyperscalers could, in principle, dynamically redirect portions of compute workloads to locations with more favourable electricity pricing, lower cooling requirements, or a surplus of renewable energy. Concepts such as “follow-the-sun” or “follow-the-power” compute become feasible, where training or inference is preferentially executed in regions with lower demand or off-peak energy costs.
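A "follow-the-power" decision can be sketched as a constrained choice: pick the cheapest site that still meets the workload's latency budget. The snippet below is a hypothetical illustration; the `pick_site` function, the site names, and all prices, round-trip times, and GPU counts are invented for the example rather than taken from any real deployment.

```python
def pick_site(sites, max_rtt_ms):
    """Return the cheapest site that fits the RTT budget and has free GPUs, else None."""
    eligible = [
        s for s in sites
        if s["rtt_ms"] <= max_rtt_ms and s["free_gpus"] > 0
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda s: s["price_per_mwh"])["name"]


# Illustrative values only.
sites = [
    {"name": "dc-north", "price_per_mwh": 40.0, "rtt_ms": 18.0, "free_gpus": 512},
    {"name": "dc-west", "price_per_mwh": 55.0, "rtt_ms": 4.0, "free_gpus": 256},
    {"name": "dc-local", "price_per_mwh": 90.0, "rtt_ms": 0.2, "free_gpus": 0},
]

relaxed = pick_site(sites, max_rtt_ms=20.0)   # latency-tolerant workload
strict = pick_site(sites, max_rtt_ms=5.0)     # tighter latency budget
tightest = pick_site(sites, max_rtt_ms=0.1)   # no remote site qualifies
print(relaxed, strict, tightest)
```

The latency budget acts as the guard rail: the cheaper site is only chosen when the workload can tolerate the longer round trip.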

Such strategies are only viable if they do not compromise AI performance. This places stringent requirements on the underlying network, particularly on the WAN DCI links, which now sit directly in the critical path of the AI fabric.

Implications for testing and validation 

As AI fabrics extend across data centres connected over WAN links, realistic testing becomes essential. Validating the AI backend now requires emulating the long-tail latency of DCI links. A 2 ms jitter spike on a WAN link can trigger a GPU stall in a 100K-GPU cluster, which can cost thousands of dollars in wasted compute. It is no longer sufficient to validate AI networks within single-site facilities: hyperscalers must be able to perform realistic testing and modelling of WAN behaviour and failure scenarios.
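To make "long-tail latency" concrete, the sketch below generates synthetic one-way delays for a DCI link and counts how many exceed a latency budget. It is a toy model under loud assumptions: the base delay, the lognormal jitter parameters, the spike probability, and the budget are all invented for illustration, and the functions `sample_one_way_ms` and `percentile` are hypothetical helpers rather than part of any test tool.

```python
import random


def sample_one_way_ms(rng, base_ms=5.0, jitter_sigma=0.5,
                      spike_prob=0.001, spike_ms=2.0):
    """One synthetic delay: fixed base + lognormal jitter + rare additive spike."""
    delay = base_ms + rng.lognormvariate(-3.0, jitter_sigma)
    if rng.random() < spike_prob:
        delay += spike_ms  # assumed 2 ms jitter spike, as in the text
    return delay


def percentile(samples, pct):
    """Nearest-rank style percentile on a sorted copy of the samples."""
    ordered = sorted(samples)
    index = min(len(ordered) - 1, int(pct / 100.0 * len(ordered)))
    return ordered[index]


rng = random.Random(42)  # fixed seed so runs are repeatable
samples = [sample_one_way_ms(rng) for _ in range(100_000)]

p50 = percentile(samples, 50)
p999 = percentile(samples, 99.9)
budget_ms = 6.0  # hypothetical per-message latency budget
violations = sum(1 for s in samples if s > budget_ms)
print(f"p50={p50:.3f} ms  p99.9={p999:.3f} ms  over budget: {violations}")
```

The median in such a model says little about the spikes; it is the tail percentiles and the count of budget violations that indicate how often a synchronised workload would be held up.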

Accurately emulating these conditions is key to understanding how distributed AI workloads behave under real-world constraints, and to ensuring that architectural decisions around geo-distribution, failover, and power engineering can be deployed with confidence.

Conclusion 

Geographically distributed hyperscale AI fabrics are a new development in data centre core design, one that lets hyperscalers address resilience, scalability, and economic considerations in parallel. This evolution introduces more technical complexity, particularly around performance predictability, because an AI network spanning multiple locations must treat the WAN component as a critical link rather than a peripheral one. The need for network and workload testing that reflects the realities of distributed AI infrastructure is more relevant than ever.