CHIPS.SANJAY920.COM FIELD GUIDE v0.1

How AI is actually built.

From tin droplets in San Diego to gigawatt data centers in West Texas — a field guide to every layer of the GPU supply chain, every binding constraint, and every bet on the table between now and 2030.

01 THE STACK
02 BOTTLENECKS
03 BETS
04 CHINA
05 GEOGRAPHY
06 CAST
07 TRACE

LIVE STATUS THE SUPPLY CHAIN RIGHT NOW

Updated

MAY 2026 · this quarter

Today the chain breaks at

HBM Memory.

HBM bandwidth depends on how much memory interface fits around the package.

Pressure

92 / 100

Why it’s binding

HBM4 stack prices have tripled in twelve months. SK Hynix and Samsung are sold out through Q3 2027 on three-year contracts. New DRAM fabs from 2025 decisions don’t produce meaningful capacity until late 2027.

TL;DR THE NUMBERS THAT MATTER

AI compute is built through a chain of physical constraints.

Twelve numbers anyone who follows this industry should be able to recall. Read this first; everything below is derivation.

3.5

EUV tools / GW

the ratio that sets the ceiling

Calc 1 GW of AI capacity ÷ ~285 MW served per EUV tool ≈ 3.5 tools

20–30 GW

annual EUV flow

70–100 tools / yr ÷ 3.5 tools / GW

Calc 70–100 EUV tools shipped / yr ÷ 3.5 tools / GW ≈ 20–29 GW / yr (headline rounded to 20–30)
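The two Calc lines above chain together, so they can be reproduced in a few lines. This is a minimal sketch using only the guide's own inputs (about 285 MW of AI capacity served per EUV tool, 70–100 tools shipped per year); the function names are illustrative, not part of any published model.

```python
# Inputs are the guide's own figures; function names are illustrative.
MW_PER_GW = 1_000


def tools_per_gw(mw_served_per_tool: float = 285.0) -> float:
    """EUV tools needed to support 1 GW of AI capacity."""
    return MW_PER_GW / mw_served_per_tool


def annual_gw_flow(tools_per_year: float, ratio: float = 3.5) -> float:
    """GW of new AI capacity that one year of EUV shipments can support."""
    return tools_per_year / ratio


ratio = tools_per_gw()            # about 3.5 tools per GW
low = annual_gw_flow(70)          # 20 GW / yr
high = annual_gw_flow(100)        # about 28.6 GW / yr
print(f"{ratio:.2f} tools/GW, {low:.0f} to {high:.1f} GW/yr")
```

The upper bound lands nearer 29 than 30; the headline range is a rounding of this result.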

20×

Hopper → Blackwell

real-world inference speedup

Source NVIDIA · Blackwell

$13 B

yearly rent / GW

at current market pricing

Calc 1 GW ≈ 1M GPUs × $3 / GPU-hour × 8,760 hours / yr × 50% utilization ≈ $13 B / yr
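The rent figure is a straight product of four inputs, which makes it easy to re-run under different prices or utilization. A minimal sketch with the guide's inputs; the function name and parameter names are illustrative.

```python
HOURS_PER_YEAR = 8_760  # 365 days x 24 hours


def yearly_rent_usd(gpus: int, usd_per_gpu_hour: float, utilization: float) -> float:
    """Annual rental revenue for a GPU fleet at a given price and utilization."""
    return gpus * usd_per_gpu_hour * HOURS_PER_YEAR * utilization


# Guide inputs: 1 GW is treated as roughly 1M GPUs, rented at $3 / GPU-hour, 50% utilized.
rent = yearly_rent_usd(gpus=1_000_000, usd_per_gpu_hour=3.0, utilization=0.50)
print(f"${rent / 1e9:.2f} B per GW-year")
```

Because the result is linear in every input, a 10% move in price or utilization moves the rent by the same 10%.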

EUV tools / year

ASML’s 2026 production rate

Source ASML 2025 annual report

$400 M

price per EUV tool

by end of decade, ~$500 M

Source ASML investor day

≈90%

Taiwan’s share

of leading-edge logic capacity

Source SIA / BCG supply-chain report

2.5 TB/s

HBM4 stack bandwidth

per 13 mm of chip edge

Calc 16 channels × 128 IO / channel × 10 GT/s ÷ 8 bits / byte ≈ 2.5 TB / s

30%

of capex → memory

hyperscalers, 2026

Calc ~$180 B hyperscaler memory spend ÷ ~$600 B big-four capex ≈ 30%

↓ 600 M

smartphones in 2027

projected, from 1.1 B in 2024

Source IDC smartphone forecast

$600 B

big four capex 2026

AWS · Google · MSFT · Meta

Calc MSFT $190 B + GOOGL $185 B + AMZN $128 B + META $135 B ≈ $638 B (headline rounded down)
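The capex sum and the memory share can be checked together. A small sketch using the figures quoted in the Calc lines; it also shows how the memory share shifts if the unrounded capex total is used instead of the $600 B headline.

```python
# 2026 capex by company, in $B, as quoted in the Calc line above.
capex_2026_busd = {"MSFT": 190, "GOOGL": 185, "AMZN": 128, "META": 135}

total_busd = sum(capex_2026_busd.values())   # 638
headline_busd = 600                          # rounded-down headline used in the guide
memory_spend_busd = 180                      # hyperscaler memory spend quoted in the guide

share_of_headline = memory_spend_busd / headline_busd   # 0.30
share_of_total = memory_spend_busd / total_busd         # about 0.28

print(total_busd, f"{share_of_headline:.0%}", f"{share_of_total:.0%}")
```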

<1 yr

data-center build

AWS Project Rainier · announce → live

Source AWS · Project Rainier launch

§ 01 THE STACK

Nine layers between sand and a sentence.

The site follows the physical route of AI compute: light, wafers, packages, memory, accelerators, racks, data centers, cloud economics, and finally the model experience users see.

Layers

From EUV light at the top to the model users touch at the bottom.

Binding now

HBM Memory

The layer where the chain breaks first this quarter — tracked in § 02.

§ 01b THE GUIDE DEEP-DIVE EXPLAINERS

The core layers, explained from fabrication to deployment.

Each section below introduces one layer, one figure, and the main constraint that shapes it.

01 · EUV Lithography

How a tin droplet becomes a chip pattern.

Extreme ultraviolet lithography is the light-based machine step that prints the smallest chip patterns.

A laser strikes tin droplets to make 13.5 nm light. That light prints the smallest features on advanced logic chips.

Native unit

λ 13.5 nm · watts

Constraint

Tool output is limited by how many EUV scanners can be built and qualified each year.

Open chapter

VACUUM CHAMBER · 10⁻⁴ Pa · tool scale 12 m · 3 m (scale bar 1 m)

SOURCE VESSEL

A · TIN DROPLET GENERATOR · 30 µm droplet diameter · ~50 000 droplets per second
B · CO₂ LASER · 100 MW peak power · pre-pulse + main pulse
C · PLASMA · 200 000 K · hotter than the sun · emits 13.5 nm EUV
D · COLLECTOR MIRROR · Wolter-type · first multilayer Bragg reflector

OPTICS COLUMN

E · ILLUMINATOR · 4 mirrors · Köhler · flat-field illumination across mask
F · RETICLE STAGE · 4× mask · 9 G scan acceleration · sub-nm overlay
G · PROJECTION OPTICS · 4× reduction · 6 Bragg mirrors · NA 0.33
H · WAFER STAGE · 300 mm · 75 wph (wafers exposed per hour)

EUV BEAM · 13.5 nm · focus point → 5 nm chip pattern
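The wavelength and numerical aperture in the figure set the smallest feature a single exposure can print. A common way to estimate it is the Rayleigh criterion, half-pitch = k1 × λ / NA. The sketch below assumes k1 = 0.3 purely for illustration; the guide gives no k1 value, and the function name is hypothetical.

```python
def min_half_pitch_nm(wavelength_nm: float = 13.5, na: float = 0.33, k1: float = 0.3) -> float:
    """Rayleigh estimate of the smallest printable half-pitch for one exposure.

    wavelength_nm and na come from the figure; k1 is an assumed process factor.
    """
    return k1 * wavelength_nm / na


half_pitch = min_half_pitch_nm()
print(f"about {half_pitch:.1f} nm half-pitch at k1 = 0.3")
```

A smaller k1 or a larger NA shrinks the result, which is why both process tuning and optics matter for resolution.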

02 · Wafer Fabrication

How many useful accelerator dies fit on one wafer?

A polished silicon disc is patterned into many individual chip dies, but not all of them survive.

A 300 mm wafer offers many candidate reticle sites, but edge exclusion and defects reduce the number of good large dies.

Native unit

858 mm² · % yield

Constraint

Wafer economics depend on how many large dies survive edge loss and defects.

Open chapter

300 mm WAFER · TSMC N3 · 144 SLOTS → 80 TESTABLE → 75 GOOD

Ø 300 mm · orientation notch · south meridian
each die: 25 × 21 mm · 525 mm² · typical Rubin-class logic die
5 of 80 testable dies fail binning · 93.8 % yield

YIELD ARITHMETIC · WHERE 69 DIES GO

144 gross reticle slots − 32 off wafer (outside Ø 300) − 32 edge exclusion (2 mm ring) − 5 bin fail (kill bin) = 75 good Rubin dies per 300 mm wafer

≈ 93.8 % bin yield · $200 silicon cost per good die (TSMC N3 ≈ $15 000 wafer)

LITHOGRAPHY SEQUENCE · 70 MASKS · ~3 MONTHS IN THE FAB

WELL & ISO · 8
FIN & GATE · 12
CONTACT · 6
M0–M3 EUV · 20
M4–M14 DUV · 24

20 × 13.5 nm EUV passes · every die meets an ASML NXE:3800E twenty times

~3 months in the fab. ~$15 000 per wafer. The EUV step belongs to Layer 01, upstream: the binding constraint that limits all fabs on Earth, combined, to ~100 new tools per year.
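The yield arithmetic in the figure is a chain of subtractions followed by two ratios. A minimal sketch that reproduces it from the figure's own numbers, including the $15 000 wafer price quoted there; the class and field names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class WaferYield:
    gross_slots: int = 144            # reticle slots on the grid
    off_wafer: int = 32               # slots falling outside the 300 mm circle
    edge_excluded: int = 32           # slots lost to the 2 mm edge-exclusion ring
    bin_fail: int = 5                 # testable dies that fail binning
    wafer_cost_usd: float = 15_000.0  # TSMC N3 wafer price quoted in the figure

    @property
    def testable(self) -> int:
        return self.gross_slots - self.off_wafer - self.edge_excluded

    @property
    def good(self) -> int:
        return self.testable - self.bin_fail

    @property
    def bin_yield(self) -> float:
        return self.good / self.testable

    @property
    def cost_per_good_die_usd(self) -> float:
        return self.wafer_cost_usd / self.good


w = WaferYield()
lost = w.gross_slots - w.good   # the 69 dies the figure title refers to
print(w.testable, w.good, lost, round(w.bin_yield, 4), w.cost_per_good_die_usd)
```

Changing `bin_fail` or `wafer_cost_usd` shows how sensitive the cost per good die is to defect rates and wafer pricing.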

03 · Advanced Packaging

Why packaging matters as much as the chip.

This is the assembly step that joins separate silicon pieces and memory into one fast compute package.

The package connects compute dies to HBM through substrates, bridges, and interposers. Those layers determine bandwidth and yield.

Native unit

μm stack height · mm shoreline

Constraint

Advanced packaging capacity determines how many compute dies can be paired with HBM at scale.

Open chapter

FIG. 05 · Dies per package, by generation · NVIDIA

Hopper · 1 die · 2023
Blackwell · 2 dies · 2025
Rubin Ultra · 4 dies · 2027

Each generation roughly doubles the silicon you can wire together at terabytes-per-second speeds before paying a network tax. Dojo took this idea to its limit — a wafer-scale chip with 25 dies, killed by transformer-unfriendly memory layout.

04 · HBM Memory · Binding now

How HBM gets to 2.5 TB/s.

High Bandwidth Memory is stacked DRAM placed beside the chip so data can reach it quickly.

Stacked DRAM, a base die, and thousands of IO lines add up to the bandwidth modern accelerators depend on.

Native unit

GB/s · TSVs · 12-high

Constraint

HBM bandwidth depends on how much memory interface fits around the package.

Open chapter

PHYSICAL STACK · 13 DIES PER HBM4

D01–D12 · DRAM · 2 GB each (12 dies)
HBM4 BASE DIE · logic · I/O controller · 2 048 IO

16 channels × 128 IO lines per channel = 2 048 IO lines total

BANDWIDTH FORMULA · WIDTH × RATE ÷ ENCODING

2 048 IO lines (width, from the figure) × 10 GT/s per pin (rate, HBM4 spec) ÷ 8 bits per byte (unit conversion) = 2 560 GB/s ≈ 2.5 TB/s per HBM4 stack

WHAT THIS BUYS YOU

A Rubin GPU pulls roughly 20 TB/s from its 8 HBM4 stacks (8 × 2.5 TB/s). Assuming 16-bit weights, a 100-billion-parameter model occupies about 200 GB, so that bandwidth is enough to read the whole model from HBM about a hundred times every second, and one full read is the operation that produces a single token. Memory bandwidth, not compute, is what limits how fast a model can think.
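The bandwidth formula and the "hundred reads per second" claim can be worked through together. A minimal sketch using the figure's inputs. The 2 bytes per parameter (16-bit weights) is an assumption, since the guide does not state a weight precision, and the function names are illustrative.

```python
def stack_bandwidth_gbs(channels: int = 16, io_per_channel: int = 128,
                        gt_per_s: float = 10.0) -> float:
    """Per-stack bandwidth in GB/s: IO lines x transfers per second / 8 bits per byte."""
    return channels * io_per_channel * gt_per_s / 8


def full_model_reads_per_s(params: float, bytes_per_param: float,
                           stacks: int = 8) -> float:
    """How many times per second the whole model can be streamed from HBM."""
    total_bytes_per_s = stacks * stack_bandwidth_gbs() * 1e9
    return total_bytes_per_s / (params * bytes_per_param)


per_stack = stack_bandwidth_gbs()                               # 2 560 GB/s
reads = full_model_reads_per_s(params=100e9, bytes_per_param=2.0)  # assumed 16-bit weights
print(per_stack, round(reads, 1))
```

Halving the bytes per parameter doubles the read rate, which is one reason lower-precision weights matter for serving speed.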

05 · Accelerator Die

Why modern accelerators are becoming multi-die systems.

This is the main compute silicon that performs the heavy parallel math behind AI workloads.

The accelerator is no longer just one rectangle of silicon. It is a tightly integrated package of logic, memory interfaces, and die-to-die links.

Native unit

transistors · tFLOPS

Constraint

Large accelerators are constrained by reticle area, package layout, and the yield of multi-die assembly.

Open chapter

FIG. 07 · The accelerators of 2026–27 · 5 ENTRIES

CHIP · MAKER · NODE · MEM · FP4 PF
Rubin · NVIDIA · N3 · HBM4 · ~5
TPU v7 · Google · N3 · HBM3e · ~3
Trainium 3 · AWS · N3 · HBM3e · ~2.5
MI400 · AMD · N2 · HBM4 · ~4
Ascend 910D · Huawei · 7 nm · HBM2e · ≈0.8

Huawei is on 7 nm and still in the conversation. Most of the gap to Rubin is networking and packaging — the levers China can keep pulling without ASML.

06 · Scale-up Rack

How 72 GPUs behave like one machine.

A rack wires many accelerator trays together so they can behave more like one larger system.

NVL72 combines compute trays, switches, copper links, power, and cooling into a rack-scale training unit.

Native unit

GPUs / pod · NVLinks

Constraint

Rack-scale performance depends on how many GPUs can communicate as one coherent system.

Open chapter

FIG. 08 · Three ways to wire a rack · TOPOLOGIES

NVL72 · NVIDIA · all-to-all · 72 GPUs · ≈1 MW · every chip connects to every chip
TPU Pod · GOOGLE · 3D torus · 8,000+ chips · 6 neighbors, bounce to reach far chips
Trainium Pod · AWS · hybrid → dragonfly · converging on NVL-class scale · clusters of all-to-all, sparser between
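The difference between "every chip connects to every chip" and "6 neighbors, bounce to reach far chips" shows up as worst-case hop count. A minimal sketch of that comparison. The 20 × 20 × 20 torus shape is a hypothetical way to reach 8,000 chips, and counting the switched all-to-all fabric as one logical hop is a simplifying assumption.

```python
def all_to_all_max_hops(n_chips: int) -> int:
    """All-to-all: any chip reaches any other in one logical hop (assumed)."""
    return 0 if n_chips <= 1 else 1


def torus_3d_max_hops(x: int, y: int, z: int) -> int:
    """Worst-case hops in a 3D torus: half of each ring, thanks to wraparound links."""
    return x // 2 + y // 2 + z // 2


def torus_3d_chips(x: int, y: int, z: int) -> int:
    return x * y * z


nvl72_hops = all_to_all_max_hops(72)
torus_hops = torus_3d_max_hops(20, 20, 20)   # hypothetical 8,000-chip shape
print(nvl72_hops, torus_3d_chips(20, 20, 20), torus_hops)
```

The torus trades longer worst-case paths for far fewer links per chip, which is what lets it scale to thousands of chips.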

07 · Data Center

What it takes to power a dense AI data center.

This is the building-scale layer: substations, power conversion, cooling, and deployment timing.

As racks get denser, data-center design shifts from floor space to substations, conversion losses, and cooling.

Native unit

MW · PUE

Constraint

New AI capacity depends on power delivery, interconnect timing, and cooling infrastructure.

Open chapter

FIG. L07 · Power sources for new AI capacity, 2026–30 · PROJECTED

Combined-cycle gas · 40%
Aeroderivatives · 18%
Reciprocating engines · 12%
Solar + battery · 10%
Bloom fuel cells · 8%
Nuclear / grid · 12%

Half of new capacity by 2030 will be behind-the-meter — expensive, but unblocked by interconnection queues and able to come online in months instead of years.

08 · Cloud Layer

How hardware becomes cloud revenue.

This layer turns expensive servers into rented compute capacity with a payback clock attached.

This layer connects spending on compute infrastructure to rental pricing, utilization, and payback time.

Native unit

$ / hr-Hopper · depreciation

Constraint

Cloud economics depend on capex, utilization, pricing, and how quickly hardware depreciates.

Open chapter

FIG. 08 · LAYER 08 · CLOUD CAPITAL · Q4 2026 run-rate

$480 billion goes in. $57 billion comes back.

For every dollar deployed at the top of this chart, $0.46 becomes NVIDIA silicon and $0.12 comes back as annual revenue. That is an 8× capex-to-revenue multiplier: either the largest generational infrastructure bet in history, or the largest capex bubble.

STAGE 01 · CAPITAL · $480 B deployed per year into AI compute

BIG-TECH OPERATING CF · $330 B · Microsoft · Google · Meta · Amazon · Oracle. The four-and-a-half-firm club that earns enough on its existing business to self-fund a global GPU buildout. ~69% of all annual AI capital.
SOVEREIGN WEALTH · $45 B · PIF · MGX · GIC · Mubadala · ADIA
PRIVATE CREDIT + DEBT · $55 B · Blue Owl · Carlyle · Apollo · Blackstone · incl. GPU-backed asset-finance vehicles
AI-LAB EQUITY ROUNDS · $50 B · OpenAI · Anthropic · xAI · Mistral · plus secondaries & employee tender

STAGE 02 · SILICON · $220 B NVIDIA data-center revenue

NVIDIA GROSS MARGIN · $165 B · The single largest claim on the AI dollar. Captured by one company before the chip is even installed in a rack. Data-center gross margin ~75% on ~$220 B revenue.
SILICON COGS · $55 B · TSMC wafers · SK Hynix / Samsung / Micron HBM · ASE / Amkor packaging · substrate & test

STAGE 03 · REVENUE · $57 B actual money earned from compute

API revenue · $22 B · OpenAI · Anthropic · et al.
Consumer subs · $12 B · ChatGPT Plus · Claude Pro
Embedded prods · $8 B · Copilot · Cursor · GitHub
Enterprise · $15 B · per-seat & per-call contracts

WHERE THE OTHER 54¢ GOES

Of every $1 of capital flowing in, $0.46 becomes NVIDIA silicon. The other $0.54, about $260 B annually, buys land, power, networking, copper cabling, labor, and the debt service on all of it.
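Every headline ratio in this figure follows from the stage totals. The sketch below recomputes them from the chart's own dollar amounts, in $B; the variable names are illustrative and nothing here goes beyond the figure's numbers.

```python
# All figures in $B, copied from the chart; variable names are illustrative.
capital_sources = {
    "big-tech operating CF": 330,
    "sovereign wealth": 45,
    "private credit + debt": 55,
    "AI-lab equity rounds": 50,
}
compute_revenue = {
    "API": 22,
    "consumer subs": 12,
    "embedded products": 8,
    "enterprise": 15,
}
nvidia_dc_revenue = 220
nvidia_gross_margin_rate = 0.75

capital_in = sum(capital_sources.values())       # 480
revenue_back = sum(compute_revenue.values())     # 57

silicon_per_dollar = nvidia_dc_revenue / capital_in     # about 0.46
revenue_per_dollar = revenue_back / capital_in          # about 0.12
capex_to_revenue = capital_in / revenue_back            # about 8.4
nvidia_gross_margin = nvidia_dc_revenue * nvidia_gross_margin_rate   # 165
silicon_cogs = nvidia_dc_revenue - nvidia_gross_margin               # 55
non_silicon_spend = capital_in - nvidia_dc_revenue                   # 260
big_tech_share = capital_sources["big-tech operating CF"] / capital_in

print(capital_in, revenue_back, round(capex_to_revenue, 1), round(big_tech_share, 4))
```

Keeping the sources and revenue lines in dictionaries makes it easy to test how the multiplier moves if any single line is revised.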

09 · AI Lab → Model

How tokens inherit the cost of everything underneath them.

Infrastructure cost finally becomes model training, inference, and the price of serving tokens.

The lab is where the hardware supply chain turns into a user-facing service, with compute cost flowing into every token served.

Native unit

tokens / s · gross margin

Constraint

At the end of the supply chain, token economics are shaped by inference efficiency and compute cost.

Open chapter

FIG. 11 · Compute footprint of the frontier labs, EOY 2026 · GIGAWATTS

OpenAI · ~7 GW
Anthropic · ~6 GW
Google DeepMind · ~5 GW
Meta · ~4 GW
xAI · ~2 GW

By end of 2027, the top two labs are projected to cross 10 GW each. China’s leading labs are not building at this scale — yet.

§ 02 BOTTLENECKS

The bottleneck never sits still.

Every twelve to eighteen months, the binding constraint on AI compute moves to a different layer of the stack. Knowing where it will sit next is the difference between a great compute contract and a bad one.

Read each column as a year: packaging was yesterday’s constraint, memory is today’s, EUV and power are tomorrow’s.

EUV · 100 / yr · ASML cap
WAFER · 75 wph · N3
PACKAGING · CoWoS-L · TSMC cap
HBM · 12-hi · 2.5 TB/s
ACCELERATOR · Multi-die package
SCALE-UP · NVL72 · 72 GPU
DATA CENTER · 20 GW / yr · US
CLOUD · $480B → $57B · 8×

Legend · today’s binding artifact · the seven other layers

Heat map · rows: EUV Lithography, Wafer Fabs, Packaging (CoWoS), HBM Memory, GPU Die, Scale-up Rack, Data Center, Power · columns: ’23 to ’30 · intensity: main limiter, tightening, background

What drove each transition

CoWoS bottleneck.

TSMC’s chip-on-wafer-on-substrate line was capped at a few thousand units a month while NVIDIA H100 demand quintupled. Every shipment delayed.

Long context met memory.

KV caches read on every token, sparse MoE multiplying memory demand, smartphones bidding for the same DRAM wafers. SK Hynix tripled HBM prices.

ASML capacity binds.

At ~70–100 tools/year and 3.5 tools per gigawatt, the simple flow math supports only ~20–30 GW/yr of new AI capacity before stock, allocation, and node assumptions. The end-of-decade ceiling shows up here first.

Grid won’t connect you.

Interconnection queues stretched to seven years in PJM. Turbine deposits all booked. Behind-the-meter gas became the standard answer.

§ 03 OPEN HYPOTHESES

Eight open hypotheses, tracked through resolution.

Specific, dated, falsifiable claims about which constraints will bind, which players will surprise, and which conventional wisdoms break by 2030. Each carries a confidence label; the “non-consensus” tag marks claims where mainstream supply-chain analysis disagrees.

Hypothesis 01

high confidence

Memory stays the binding constraint through 2028.

DRAM fabs take two years to build and the memory makers only started building in 2025. Even with smartphone volumes halving, HBM demand outruns supply.

Hypothesis 02

uncertain

ASML flow math is tighter than the 200 GW headline.

At 70–100 EUV tools/year and 3.5 tools per gigawatt, simple annual flow is ~20–30 GW/year before installed-base, allocation, and node assumptions. Any larger ceiling needs an explicit stock model.
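One way to make the stock explicit is to let the installed base grow by each year's shipments and apply the 3.5 tools / GW ratio to the share of tools allocated to AI wafers. This is only a sketch of what such a model could look like. The starting installed base and the AI allocation share below are hypothetical placeholders, not figures from this guide, and reading the ratio as "tools in service per GW of yearly output" is itself an assumption.

```python
def ai_gw_per_year(installed_base: float, shipments_per_year: float,
                   ai_share: float, years: int,
                   tools_per_gw: float = 3.5) -> list[tuple[int, float]]:
    """Return (year index, GW/yr supportable) as the installed base grows.

    installed_base and ai_share are placeholders to be replaced with real data.
    """
    results = []
    base = installed_base
    for year in range(1, years + 1):
        base += shipments_per_year               # new tools join the stock
        results.append((year, base * ai_share / tools_per_gw))
    return results


# Flow-only case: no prior stock, every new tool serves AI. Matches the 20 GW/yr floor.
flow_only = ai_gw_per_year(installed_base=0, shipments_per_year=70, ai_share=1.0, years=1)

# Hypothetical stock case: 280 tools already installed, half allocated to AI.
stock_case = ai_gw_per_year(installed_base=280, shipments_per_year=70, ai_share=0.5, years=3)
print(flow_only, stock_case)
```

The gap between the two cases is the point of the hypothesis: the ceiling depends far more on the installed-base and allocation assumptions than on the annual shipment count.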

Hypothesis 03

non-consensus

Apple becomes a minority customer of TSMC by 2028.

Already squeezed off N3 majority. By A16 the first customer is AI, not iPhone. TSMC’s focus follows margin, and AI clears every bar.

Hypothesis 04

non-consensus

H100 prices are higher in 2027 than they were in 2024.

Models that run on Hoppers got smarter and cheaper to serve. The depreciation cycle is longer than five years, not shorter, so the bear thesis that Hopper-class compute is rapidly worth less does not hold.

Hypothesis 05

uncertain

China gets indigenous EUV working by 2030 — not in volume.

Working tools in labs by end of decade. Mass-production hell takes another 5–7 years. China’s real catch-up is in DUV and packaging.

Hypothesis 06

non-consensus

Space data centers are a 2035 problem, not a 2030 problem.

Free power saves 10–15% of TCO. Six months of deployment delay costs more. While chips are the bottleneck, Earth wins.

Hypothesis 07

uncertain

Robots offload thinking to the cloud, not to the chest cavity.

Long-horizon planning batches in data centers; only reflexes run on-board. Centralized intelligence drives decentralized motion.

Hypothesis 08

speculative

Fast AGI timelines favor the U.S. supply chain; long timelines favor China’s.

If revenue compounds fast enough that the next model is built on the last one’s gross margin, the West’s dispersed supply chain wins. If not, China’s vertically integrated one does.

§ 04 CHINA & TAIWAN

The single-point-of-failure problem.

Of leading-edge logic

≈90%

made on a single island

Roughly nine-tenths of the world’s leading-edge logic comes from Taiwan. The tools that make it use chips that are also made in Taiwan — a snake eating its own tail.

In the high-growth scenario where Taiwan remains available, annual AI compute buildout can be modeled in the hundreds of gigawatts. If something goes wrong on the island, the same scenario collapses to perhaps 10–20 GW of annual new capacity — the limit of what Intel and Samsung can produce. Existing capacity continues but the growth curve stops, and the supply chain takes a decade to rebuild without Taiwan’s know-how.

Huawei is interesting precisely because it is verticalized: fabs (SMIC), networking, software, talent, AI researchers, end-market — the whole stack in one country. If Huawei had TSMC access, the consensus argument runs, they would be NVIDIA’s biggest competitor. They don’t, and they’re still building anyway.

Counterfactual

Roughly 10×–20× less annual new capacity. Existing fleet keeps running; new growth stalls until Arizona, Korea, and Japan rebuild the missing capacity. Best estimates: a decade.

Two players, one loop. ASML ships EUV scanners down; TSMC supplies the leading-edge logic that goes back into them. ASML is the only supplier of EUV scanners. TSMC is the only fab that can absorb them at leading-edge volume. Either side stalls and the other does too.

§ 05 GEOGRAPHY

Where the world’s compute is actually made.

Critical campuses

The supply chain looks global on a map and concentrated on a list. Every critical layer above traces back to a handful of campuses in a handful of cities.

Veldhoven

ASML EUV lithography tools

Oberkochen

Carl Zeiss projection optics (18 mirrors per tool)

San Diego

Cymer (ASML) EUV source — the tin-droplet laser

Wilton

ASML reticle stages (9 G mechanical scanning)

Hsinchu

TSMC N3 / N2 fabs and CoWoS packaging

Icheon

SK Hynix HBM stacks (NVIDIA’s primary supplier)

Hwaseong

Samsung DRAM + HBM + Foundry

Hiroshima

Micron HBM stacks (Idaho HQ, JP fab)

Hillsboro

Intel 18A fabs (the US backstop bet)

Santa Clara

NVIDIA GPU + system design (no fabs)

Shenzhen + Shanghai

Huawei Ascend, SMIC fabs (DUV only)

Taoyuan

Victory Giant PCBs for the entire industry

§ 06 CAST OF CHARACTERS

Twelve companies, one supply chain.

Profiled

The companies you should know by name. Each plays a role no one else can, on a campus most readers couldn’t place on a map.

ASML

Builds the hardest machine humans make. Without ASML, the modern world stops.

TSMC

Ninety percent of leading-edge logic. NVIDIA’s, Apple’s, AMD’s, and Google’s chips all start here.

NVIDIA

Designs the accelerator everyone else buys. Doesn’t own a fab. Owns the customer.

SK Hynix

Was memory’s runner-up. Picked HBM early. Is now NVIDIA’s most important external supplier.

Huawei

The only vertical empire — fabs, networking, AI talent, end markets — all in one country. Sanctioned. Still climbing.

Google

Has had in-house silicon since 2015. Has the largest scale-up domain in the world. Just woke up to AGI.

Anthropic

$4 B → $6 B added in two months. Now compute-constrained on every dimension simultaneously.

OpenAI

Signed deals with every NeoCloud that would have them. Now the largest commercial compute buyer on Earth.

Carl Zeiss

~1,000 artisans polishing the eighteen multilayer mirrors that go into every EUV tool. Sub-nanometer accuracy, by hand.

Crusoe

From flare-gas Bitcoin to OpenAI’s biggest data-center partner. Pioneered behind-the-meter at gigawatt scale.

Microsoft

OpenAI’s landlord. Also the cloud half of “foundry”. Stuck navigating between the two largest customers of AI.

AWS

Trainium 3 ships this year. The only hyperscaler trying to design its own AI silicon at scale outside of Google.

§ 07 TRACE A GPU SUPPLY-CHAIN WALK

Where does your chip come from?

Follow a single accelerator across eight stops, twelve companies, and roughly fourteen weeks of physical motion.

01 · Worldwide · Silica sand · Quartz mines in Spruce Pine, NC, and elsewhere.
02 · Japan · Silicon ingot · Shin-Etsu and SUMCO grow 30 cm boules.
03 · Taiwan · N3 wafer · TSMC fab in Hsinchu · 70-mask process.
04 · Netherlands → TW · EUV pass · 20 EUV layers from ASML tools at TSMC.
05 · Taiwan · CoWoS · 4 dies bonded to interposer, then to substrate.
06 · Korea · HBM4 bonded · 8 stacks from SK Hynix Icheon. The big ticket.
07 · TW · US · Test & burn-in · ASE / Amkor verify every chip at full clock.
08 · TW → US · NVL72 rack · Wiwynn / Foxconn integrate 72 GPUs + NVLink switch.

≈14 weeks

sand → powered rack

8 stops

physically traversed

12 firms

contribute to one chip

modeled

unit cost · product-specific

3 countries

hold single points of failure
