How $200 Billion in GPUs Reveals the Physical Constraints of Artificial Intelligence
Artificial intelligence is often described as an immaterial technology: algorithms, tokens, the cloud. This representation is becoming misleading. Behind every query lies a physical reality: tens of thousands of GPUs, electricity consumption comparable to that of a mid-sized city, significant volumes of water for cooling. This article analyses AI's shift towards a heavy-industry regime, constrained not by algorithmic creativity but by the thermodynamic and political resources of the planet that supports it.
Generative AI is no longer an “augmented software” industry: it structurally tends towards heavy industry. Tens of thousands of GPUs arrayed in datacenters, electricity consumption equivalent to a city of 100,000 to 200,000 inhabitants per hyperscale datacenter, and connections to grids already under strain. The tension between installed capacity and actual usage raises a question the industry has yet to confront head-on: will AI be limited by algorithmic creativity or by the planet’s thermodynamic constraints?
Between 2022 and 2025, hyperscalers collectively committed over $200 billion in compute infrastructure. Microsoft announced $50 billion over four years, Meta $37 billion for 2024 alone, Amazon $75 billion dedicated to AI datacenters. This accumulation follows the logic of real options: under radical uncertainty, building excess capacity amounts to purchasing a strategic option. The cost of this option (temporarily underutilised GPUs) was deemed lower than the risk of being locked out of an emerging market with increasing returns.
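A stylised expected-value calculation makes this option logic concrete. All figures below are hypothetical, chosen only to illustrate the asymmetry hyperscalers face, not drawn from any company's accounts:

```python
# Illustrative real-options arithmetic with hypothetical numbers.
# The figures below are assumptions for exposition, not industry data.

p_boom = 0.3            # assumed probability the AI market inflects sharply upward
overbuild_cost = 20e9   # capex at risk if demand never materialises (idle GPUs)
locked_out_loss = 150e9 # assumed value forfeited if capacity is unavailable in a boom

expected_cost_build = (1 - p_boom) * overbuild_cost  # pay for idle capacity in the bust case
expected_cost_wait = p_boom * locked_out_loss        # forfeit the market in the boom case

print(f"Expected cost of building ahead: ${expected_cost_build/1e9:.0f}B")
print(f"Expected cost of waiting:        ${expected_cost_wait/1e9:.0f}B")
# With these assumptions, overbuilding dominates whenever
# p_boom * locked_out_loss > (1 - p_boom) * overbuild_cost.
```

The asymmetry holds whenever the downside of missing the market dwarfs the cost of idle capacity, which is precisely the bet the capex figures above encode.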
Physical constraints manifest immediately: producing an H100 GPU requires 4–5 nm lithography, mastered almost exclusively by TSMC, with 18-to-24-month lead times. GPUs become strategic assets. Concentration is extreme: by late 2024, Microsoft was targeting approximately 1.8 million GPUs, while the entire US university system combined lacked the compute to train a ChatGPT-class model. Frontier publications are migrating from academic labs to hyperscalers' in-house teams.
A new constraint emerges: memory. Modern architectures are becoming memory-bound rather than compute-bound, and the HBM industry (SK Hynix, Samsung, Micron) struggles to keep pace, with estimated price increases of +300% between 2022 and 2024.
Two years after ChatGPT’s launch, a structural gap appears. According to the US Census Bureau, only 9.2% of American companies reported using AI in Q2 2025. A BCG study (2024) shows that 74% of companies extract no tangible value from their AI initiatives. McKinsey confirms: over 80% of organisations report no material impact on financial results. MIT indicates that 95% of pilot projects fail to produce measurable returns — not for technical reasons, but for lack of operational scaling.
McKinsey shows that roughly 70% of obstacles are human and organisational, 20% technological, and only 10% algorithmic. Technical barriers relate to legacy system integration and data quality. Organisational barriers run deeper: only 15% of employees report that their company has communicated a clear AI strategy (Gallup 2024), fewer than 30% of CEOs directly sponsor the AI agenda, and the real TCO of production AI proves 10 to 20 times higher than initial POC costs.
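A back-of-the-envelope decomposition shows how such a multiplier arises. Every line item below is hypothetical and meant only to illustrate the structure of production costs:

```python
# Hypothetical annual cost breakdown (in $k) for one generative-AI use case.
poc_cost = 150  # a typical proof of concept: a few months, a small team, API credits

production_costs = {
    "inference at scale":               600,
    "legacy system integration":        400,
    "data pipelines and quality":       350,
    "security, governance, compliance": 250,
    "monitoring and evaluation":        150,
    "change management and training":   200,
}

tco = sum(production_costs.values())
print(f"Production TCO: ${tco}k  ({tco / poc_cost:.1f}x the POC)")
```

The POC captures only the first line; the other five are what organisations discover after the pilot succeeds.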
Faced with this under-consumption, the industry is attempting to stimulate demand through aggressive bundling, price wars (OpenAI cut GPT-3.5 pricing by over 90% between 2023 and 2024), and a massive pivot towards agentic AI. This pivot raises a critical question: why would organisations unable to monetise simple generative AI uses suddenly extract value from systems far more complex to deploy, govern, and secure?
Actual GPU cluster utilisation rates range between 15% and 30%, well below the 60–80% required for profitability. The technological obsolescence cycle (18–24 months) is shorter than accounting depreciation schedules (5–7 years). The fibre-optic overcapacity of 2000–2001 (WorldCom, Global Crossing) offers a structurally analogous precedent.
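A simple amortisation model shows why utilisation below roughly 60% destroys margin. The prices assumed here (an all-in hardware cost of $30,000 per GPU, $0.50/hour operating cost, $4.00/hour rental price) are illustrative assumptions, not figures from the text:

```python
# Amortise a GPU over its useful life and compare the cost per sold hour
# with an assumed market rental price, at different utilisation rates.

capex_per_gpu = 30_000        # assumed all-in cost of one H100-class GPU
useful_life_months = 21       # midpoint of the 18-24 month obsolescence cycle
opex_per_gpu_hour = 0.50      # assumed power, cooling, staff, facility share
market_price_per_hour = 4.00  # assumed achievable rental price

hours_of_life = useful_life_months * 30 * 24

for utilisation in (0.15, 0.30, 0.60, 0.80):
    sold_hours = hours_of_life * utilisation
    cost_per_sold_hour = capex_per_gpu / sold_hours + opex_per_gpu_hour / utilisation
    margin = market_price_per_hour - cost_per_sold_hour
    print(f"utilisation {utilisation:>4.0%}: cost/sold hour ${cost_per_sold_hour:5.2f}, "
          f"margin ${margin:+5.2f}")
# With these assumptions, breakeven sits near 60% utilisation; at 15-30%,
# every sold hour loses several dollars.
```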
A hyperscale datacenter typically consumes 150 to 300 MW of continuous load. According to the IEA, datacenters consumed approximately 460 TWh in 2022, with projections of 1,000 to 1,300 TWh by 2030, 40–50% attributable to AI. In Ireland, datacenters accounted for 20% of electricity consumption in 2023. Adding electrical capacity takes time: 3–5 years for a gas plant, 10–15 years for traditional nuclear.
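The city comparison can be sanity-checked with two conversions. The per-capita figure below (10 MWh/year, all uses, roughly a US-level average) is an assumption:

```python
# Convert continuous datacenter load into annual energy, then into a
# population equivalent.

load_mw = 200          # mid-range hyperscale campus (150-300 MW in the text)
hours_per_year = 8760

annual_mwh = load_mw * hours_per_year
annual_twh = annual_mwh / 1e6
per_capita_mwh = 10    # assumed annual consumption per inhabitant, all uses

print(f"Annual consumption: {annual_twh:.2f} TWh")
print(f"Equivalent population: {annual_mwh / per_capita_mwh:,.0f} inhabitants")
# ~1.75 TWh/year, i.e. a city of ~175,000 inhabitants: inside the
# 100,000-200,000 range quoted earlier.
```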
The industry is turning to nuclear: Microsoft signed an agreement for Three Mile Island Unit 1 (~835 MW), an SMR race is underway (AWS targeting 5 GW by 2039, Google ~500 MW via Kairos Power), and TerraPower is developing the Natrium reactor (345 MW). However, SMRs introduce proliferation risks: most designs require HALEU fuel enriched to 19.75%, just below the 20% threshold that defines highly enriched uranium.
A hyperscale datacenter can consume several million litres of water per day. Microsoft reported a 34% increase in water consumption between 2021 and 2022, Google 20%. Approximately 40% of global datacenters are located in areas of medium-to-high water stress (World Resources Institute).
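The daily figure is consistent with a simple conversion using water usage effectiveness (WUE). The 1.8 L/kWh value below is a commonly cited industry average, used here as an assumption:

```python
# Order-of-magnitude check on "several million litres per day":
# daily energy consumption multiplied by water usage effectiveness (WUE).

load_mw = 200
wue_litres_per_kwh = 1.8   # assumed industry-average WUE

daily_kwh = load_mw * 1000 * 24
daily_litres = daily_kwh * wue_litres_per_kwh

print(f"Cooling water: {daily_litres/1e6:.1f} million litres/day")
# ~8.6 million litres/day for a 200 MW campus, consistent with the claim above.
```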
Villalobos et al. (2022) estimate that high-quality text stocks could be exhausted between 2026 and 2032. Recursive training on synthetic data causes model collapse (Shumailov et al., 2023): distributions contract, tails disappear, and the model converges towards an impoverished representation — analogous to genetic inbreeding.
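A minimal numerical sketch of the mechanism (not the paper's experiment): fit a Gaussian to a finite sample, resample from the fit, and repeat. Finite-sample noise systematically erodes the tails, so the fitted variance drifts towards zero:

```python
# Toy model of recursive training: each generation is a Gaussian fitted to a
# finite sample drawn from the previous generation's fit. Sampling noise gives
# the log-variance a downward drift, so the distribution's tails contract.
import random
import statistics

random.seed(0)
mu, sigma = 0.0, 1.0   # generation 0: the "real" data distribution
sample_size = 20       # a small training set makes the collapse visible quickly

for generation in range(1, 201):
    sample = [random.gauss(mu, sigma) for _ in range(sample_size)]
    mu = statistics.fmean(sample)    # refit the model on its own output
    sigma = statistics.stdev(sample)
    if generation % 50 == 0:
        print(f"generation {generation:3d}: sigma = {sigma:.4f}")
```

The same dynamic, compounded across the far richer distributions of natural language, is what Shumailov et al. call model collapse.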
AI becomes geographically determined by the convergence of these constraints: dispatchable low-carbon energy, sustainable hydrological basins, and internet connectivity with acceptable latency. These conditions are rarely met simultaneously.
Jevons' paradox (1865) applies: efficiency gains reduce the unit cost of compute, but total consumption increases because each improvement expands the scope of economically viable uses. Usage volumes grew 50- to 100-fold between 2022 and 2025, far exceeding unit efficiency gains. A useful heuristic: models consume faster than chips economise.
Agentic AI catalyses this dynamic: shifting to persistent agents increases the duty cycle from 15–30% (ad hoc human usage) to 60–80% (24/7 agents), multiplying absolute energy consumption by a factor of 3 to 5.
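Two arithmetic checks, using only the ranges quoted in the two paragraphs above plus one assumption (a hypothetical 10-fold improvement in energy per query over the period):

```python
# 1) Jevons: net consumption = usage growth / unit-efficiency gain.
efficiency_gain = 10   # assumed fold-reduction in energy per query, 2022 -> 2025
for usage_growth in (50, 100):
    print(f"usage x{usage_growth}: net consumption x{usage_growth / efficiency_gain:.0f}")

# 2) Agentic duty cycle: persistent agents keep hardware busy around the clock.
human_duty, agent_duty = 0.20, 0.70   # midpoints of 15-30% and 60-80%
print(f"duty-cycle multiplier: x{agent_duty / human_duty:.1f}")  # inside the x3-5 range
```

Even under a generous efficiency assumption, total consumption rises 5- to 10-fold, and the agentic shift multiplies it again.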
Competitive advantage is being redistributed: rapid diffusion via open source reduces the durability of purely algorithmic advantages, while access to energy, water rights, territorial acceptability, and proprietary data corpora gain in importance. The energy intensity of generative AI (0.40–0.60 kWh per dollar of value added) structurally approaches that of electrometallurgy. The durable sources of advantage become low-carbon PPA contracts over 10–20 years, secured water rights, and vertical integration into the energy chain.
The advanced compute value chain is fragmented and asymmetric: Taiwan concentrates over 90% of advanced (sub-7 nm) fabrication capacity, EUV lithography remains an ASML monopoly, and HBM memory depends on East Asia. China holds structural advantages on the physical constraints (speed of energy deployment, political tolerance for externalities) that could partially compensate for its lag in advanced silicon.
HALEU uranium required for SMRs creates a critical dependency on Russia. Submarine cables, whose ownership is shifting from telecom operators to hyperscalers, represent an additional geopolitical vulnerability.
Several dynamics could inflect the trajectory: sparsification, quantization, mixture-of-experts (MoE) architectures, and distillation reduce inference costs; decentralised inference on NPUs (Apple M4, Qualcomm Snapdragon X) could redistribute load across billions of edge devices; intelligent compute scheduling (carbon-aware computing, sketched below) reduces the marginal carbon footprint without reducing absolute consumption. Renewable energy PPAs, meanwhile, often operate as accounting reallocation rather than additional decarbonised capacity, and the risk of regulatory carbon leakage is real.
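A minimal sketch of carbon-aware scheduling: given an hourly carbon-intensity forecast (the values below are invented for illustration), place a deferrable batch job in the greenest contiguous window:

```python
# Carbon-aware scheduling sketch: place a deferrable 3-hour batch job in the
# contiguous window with the lowest average grid carbon intensity (gCO2/kWh).

forecast = [430, 410, 380, 300, 210, 150, 140, 180, 260, 340, 400, 450,
            460, 440, 390, 320, 250, 190, 170, 220, 310, 380, 420, 440]
job_hours = 3

best_start = min(range(len(forecast) - job_hours + 1),
                 key=lambda h: sum(forecast[h:h + job_hours]))
avg = sum(forecast[best_start:best_start + job_hours]) / job_hours

print(f"Run the job at hours {best_start}-{best_start + job_hours} "
      f"(avg {avg:.0f} gCO2/kWh vs {sum(forecast)/len(forecast):.0f} daily mean)")
# Note: this lowers the marginal carbon footprint of the job; as the text
# observes, it does not reduce absolute electricity consumption.
```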
For technology companies: secure energy access ahead of market tightening, multi-site geographic diversification, vertical energy integration. For regulators: limited window of action (2025–2027), binding efficiency standards, sectoral carbon markets, mandatory transparency. For territories: trade-off between investment attractiveness and exposure to an energy-intensive industry — the French mega-bassines precedent illustrates the fragility of social acceptability.
The ultimate question is thermodynamic, then immediately political: what share of electricity, water, data, network, and social acceptability will our societies allocate to AI — and at the expense of which other uses? AI does not escape the laws of physics: it forces us to decide explicitly how we choose to pay for them.