AI Power Consumption: Rapidly Becoming Mission-Critical

Beth Kindig
9 min read · Jun 25, 2024


Big Tech is spending tens of billions quarterly on AI accelerators, which has led to an exponential increase in power consumption. Over the past few months, multiple forecasts and data points have revealed soaring data center electricity demand. The rise of generative AI and surging GPU shipments are causing data centers to scale from tens of thousands to 100,000-plus accelerators, shifting the emphasis to power as a mission-critical problem to solve.

Increasing Power Consumption Per Chip

As Nvidia, AMD, and soon Intel begin to roll out their next generation of AI accelerators, the focus is now shifting towards power consumption per chip, whereas the focus has been primarily on compute and memory. As each new generation boosts computing performance, it also consumes more power than its predecessor, meaning that as shipment volumes rise, so does total power demand.

Nvidia’s A100 has a max power consumption of 250W with PCIe and 400W with SXM (Server PCI Express Module), while the H100’s power consumption is up to 75% higher than the A100’s. With PCIe, the H100 consumes 300–350W, and with SXM, up to 700W. That 75% increase in GPU power consumption happened rapidly, within just two years, across a single generation of GPUs.

Looking at other GPUs on the market today, AMD’s MI250 accelerators draw 500W of power, up to 560W at peak, while the MI300X consumes 750W at peak, up to a 50% increase. Intel’s Gaudi 2 accelerator consumes 600W, and its successor, the Gaudi 3, consumes 900W, another 50% increase over the previous generation. Intel’s upcoming hybrid AI processor, codenamed Falcon Shores, is expected to consume a whopping 1,500W per chip, the highest on the market.

Nvidia’s upcoming Blackwell generation boosts power consumption even further: the B200 consumes up to 1,200W, and the GB200 (which combines two B200 GPUs and one Grace CPU) is expected to consume 2,700W. This represents up to a ~300% increase in power consumption across one generation of GPUs, with full AI systems increasing power consumption at an even higher rate. SXM allows GPUs to operate beyond PCIe bus restrictions, offering higher memory bandwidth, higher data throughput, and higher speeds for maximal HPC and AI performance, and thus drawing more power.

It’s important to note that each subsequent generation is likely to be more power-efficient than the last: the H100 reportedly boasts 3x better performance-per-watt than the A100, meaning it can deliver more TFLOPS per watt and complete more work for the same power consumption. However, GPUs are becoming more powerful in order to support trillion-parameter-plus large language models. The result is that AI requires more total power with each new generation of AI accelerators.
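The tension between better efficiency and higher total power can be made concrete with simple arithmetic. A rough sketch using the A100/H100 figures cited above (the 3x performance-per-watt number is a reported estimate, not a measured benchmark):

```python
# Rough arithmetic using figures cited above (reported, not benchmarked).
a100_sxm_watts = 400        # A100 SXM max power
h100_sxm_watts = 700        # H100 SXM max power
perf_per_watt_gain = 3.0    # H100 reportedly ~3x A100 performance-per-watt

power_increase = h100_sxm_watts / a100_sxm_watts         # 1.75x more power
work_per_gpu_gain = perf_per_watt_gain * power_increase  # ~5.25x more work

print(f"Power per GPU: {power_increase:.2f}x")    # 1.75x
print(f"Work per GPU:  {work_per_gpu_gain:.2f}x") # 5.25x
```

In other words, efficiency gains are being spent on throughput rather than on holding power flat, which is why fleet-level power demand keeps climbing even as performance-per-watt improves.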

Big Tech’s AI Ambitions Lead to Surging GPU Shipments

From Big Tech’s perspective, we’re still in the early stages of this AI capex cycle. Most recently, we covered how Big Tech is boosting capex by more than 35% YoY in 2024, likely to the $200 billion to $210 billion range, predominantly for AI infrastructure. The majority is flowing to GPU purchases and custom silicon to power AI training and model development and to meet elevated demand in the cloud.

2023 was a breakout year for Nvidia’s data center GPUs, with reports placing annual shipments at 3.76 million, an increase of more than 1.1 million units YoY. One report stated that at a peak of 700W and ~61% annual utilization, each GPU would draw 3.74 MWh per year; this means that Nvidia’s 3.76 million GPU shipments could consume as much as 14,384 GWh (14.38 TWh). A separate report estimated that with 3.5 million H100 shipments through 2023 and 2024, the H100 alone could see total power consumption of 13.1 TWh annually.

The 14.4 TWh is equivalent to the annual power needs of more than 1.3 million households in the US. This also does not include AMD, Intel, or any of Big Tech’s custom silicon, nor does it take into account existing GPUs deployed or upcoming Blackwell shipments in 2024 and 2025. As such, the total energy consumption is likely to be far higher by the end of the year as Nvidia’s Blackwell generation comes online in larger quantities.
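The cited estimate can be reproduced with back-of-envelope math. A sketch, assuming a 700W peak draw, ~61% average annual utilization, and roughly 10.5 MWh of average annual US household consumption (the household figure is our assumption for illustration):

```python
# Back-of-envelope energy math for Nvidia's 2023 data center GPU shipments.
peak_kw = 0.700          # H100 SXM peak draw
utilization = 0.61       # assumed average annual utilization
hours_per_year = 8760

mwh_per_gpu = peak_kw * hours_per_year * utilization / 1000  # ~3.74 MWh
gpus_shipped = 3.76e6
total_twh = mwh_per_gpu * gpus_shipped / 1e6                 # ~14.1 TWh

# Household equivalence, assuming ~10.5 MWh per US household per year.
households_millions = total_twh / 10.5                       # ~1.3M

print(f"{mwh_per_gpu:.2f} MWh per GPU")
print(f"{total_twh:.1f} TWh fleet-wide")
print(f"~{households_millions:.1f} million US households")
```

This lands slightly below the cited 14.38 TWh, since the report’s exact utilization assumption differs marginally, but the order of magnitude, and the household equivalence, hold up.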

To read more about Nvidia’s upcoming Blackwell architecture, reference our previous analysis: Nvidia Q1 Earnings Preview: Blackwell and the $200B Data Center. If you own AI stocks, or are looking to own AI stocks and want to learn more, we encourage you to attend our upcoming weekly webinar, held this Thursday at 4:30 pm EST. Learn more here.

A Path to Million GPU Scale

Nvidia and other industry executives have laid out a path for GPU clusters in data centers to scale from the tens of thousands of GPUs per cluster to the hundred-thousand-plus range, even up to the millions of GPUs by 2027 and beyond. We’re already seeing signs of strong demand for Nvidia’s Blackwell platform, but overall, the million-plus GPU data center target is still years away.

Oracle’s Chairman Larry Ellison sees this creating secular tailwinds for data center construction, due to both rising GPU demand and increased power requirements driving a shift to liquid cooling:

“This AI race is going to go on for a long time. It’s not a matter of getting ahead, just simply getting ahead in AI, but you also have to keep your model current. And that’s going to take larger and larger data centers. … The data centers we are building include the power plants and the transmission of the power directly into the data center and liquid cooling. And because these modern data centers are moving from air cooled to liquid cooled, and you have to engineer them from scratch. And that’s what we’ve been doing for some time. And that’s what we’ll continue to do.”

As the industry progresses toward million-GPU scale, future generations of AI accelerators will need to prioritize power consumption and efficiency while delivering increasing levels of compute. Data centers are expected to adopt liquid cooling technologies to meet the cooling requirements of these increasingly large GPU clusters.

For more information on investing in AI, check out our 1-hour interview “AI is the Best Opportunity of our Lifetime.”

AI Electricity Demand Forecast to Surge

As a result of booming demand for generative AI and for GPUs, AI’s electricity demand is forecast to surge, especially in the data center. We have a handful of different viewpoints and analyst projections that, while differing slightly in their timelines, all point to the same conclusion.

For example, Morgan Stanley estimates global data center power use for generative AI will triple this year, from ~15 TWh in 2023 to ~46 TWh in 2024. This coincides with the ramp of Nvidia’s Blackwell chips later in the year, fuller utilization of its deployed Hopper GPUs, increased shipments from AMD, and custom silicon ramps from Big Tech.

Morgan Stanley also projects generative AI power demand may exceed 2022’s data center power usage by 2027 if GPU utilization rates are high, at ~90% on average; however, their base case still calls for a nearly 5x increase in power demand over the next three years.

Generative AI Power Demand

Morgan Stanley calls for a nearly 5x increase in generative AI power demand over the next three years in their base case scenario. Source: I/O Fund

Wells Fargo is projecting AI power demand to surge 550% by 2026, from 8 TWh in 2024 to 52 TWh, before rising another 1,150% to 652 TWh by 2030. This is a remarkable 8,050% growth from their 2024 projected level. AI training is expected to drive the bulk of this demand, at 40 TWh in 2026 and 402 TWh by 2030, with inference’s power demand accelerating at the end of the decade. In this model, the 652 TWh projection is more than 16% of the current total electricity demand in the US.
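The percentage figures follow directly from the TWh projections. A quick sanity check (the TWh numbers are Wells Fargo’s projections as cited above):

```python
# Sanity-check the growth percentages implied by Wells Fargo's projections.
def pct_growth(start_twh: float, end_twh: float) -> float:
    """Percentage growth from start to end."""
    return (end_twh / start_twh - 1) * 100

print(pct_growth(8, 52))    # 2024 -> 2026: 550.0
print(pct_growth(52, 652))  # 2026 -> 2030: ~1,154 (article rounds to 1,150%)
print(pct_growth(8, 652))   # 2024 -> 2030: 8,050.0
```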

Generative AI Power Demand, AI Training and Inference

Source: I/O Fund

The Electric Power Research Institute forecasts that data centers may see their electricity consumption more than double by 2030, reaching 9% of total electricity demand in the US. The IEA is projecting global electricity demand from AI, data centers and crypto to rise to 800 TWh in 2026 in its base case scenario, a nearly 75% increase from 460 TWh in 2022. The agency’s high case scenario calls for demand to more than double to 1,050 TWh.

Global Electricity Demand from Data Centre, AI and Cryptocurrencies

Source: I/O Fund

Arm’s executives also see data center demand rising significantly: CEO Rene Haas said that without improvements in efficiency, “by the end of the decade, AI data centers could consume as much as 20% to 25% of U.S. power requirements. Today that’s probably 4% or less.” CMO Ami Badani reiterated Haas’ view that data centers could account for 25% of US power consumption by 2030, based on surging demand for AI chatbots and AI training.

How the Supply Chain is Addressing Power Requirements

Taiwan Semiconductor plays a crucial role in the supply chain here, as its most advanced nodes tout lower power consumption and increased performance, which is why AI accelerators will soon shift from being produced primarily on the 5nm node to 3nm and eventually 2nm.

Here’s what we said previously in our free newsletter about TSMC:

“At the foundry level, the 3nm process offers 15% better performance than the 5nm process when power level and transistors are equal. TSMC also states the 3nm process can lower power consumption by as much as 30%. The die sizes are also an estimated 42% smaller than the 5nm. …

N3E is the baseline for IP design with 18% increased performance and 34% power reduction, N3P has higher performance and lower power consumption, whereas the N3X will offer high-performance computing with very high performance but with up to 250% power leakage.

The 2nm will be the first node to use gate-all-around field-effect transistors (GAAFETs), which will increase chip density. The GAA nanosheet transistors have channels surrounded by gates on all sides to reduce leakage, yet will also uniquely widen the channels to provide a performance boost. There will be another option to narrow the channels to optimize power cost. The goal is to increase the performance-per-watt to enable higher levels of output and efficiency. The N2 node is expected to be faster while requiring less power with an increase of performance by 10%-15% and lower power consumption of 25%-30%.”
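The performance-per-watt implication of a node transition can be sketched from those ranges. Note that foundries typically quote the two improvements as alternatives (faster at the same power, or lower power at the same speed), so treating them separately is the safer reading; the sketch below uses TSMC’s quoted N2 ranges:

```python
# Illustrative node-transition math using TSMC's quoted N2 ranges.
# Foundries quote these as alternatives, not simultaneous gains:
speed_gain = (1.10, 1.15)   # 10-15% faster at the same power
power_cut = (0.25, 0.30)    # 25-30% lower power at the same speed

# Same workload at the same speed: perf-per-watt improves as 1/(1 - cut).
ppw_low = 1 / (1 - power_cut[0])    # ~1.33x
ppw_high = 1 / (1 - power_cut[1])   # ~1.43x

print(f"Perf-per-watt at iso-speed: {ppw_low:.2f}x to {ppw_high:.2f}x")
```

Power-constrained data center operators generally care most about the iso-speed case: running the same accelerator workload at 25-30% lower power directly frees capacity for more GPUs per facility.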

CEO C.C. Wei noted in Q1’s call that TSMC’s “customers are working with TSMC for the next node. Even for the next, next node, they have to move fast because, as I said, the power consumption has to be considered in the AI data center. So the energy-efficient is fairly important. So our 3-nanometer is much better than the 5-nanometer. And again, it will be improved in the 2-nanometer. So all I can say is all my customers are working on this kind of a trend from 4-nanometer to 3 to 2.”

The power problem is being addressed throughout the supply chain, from TSMC’s process technology to renewable energy power agreements for Big Tech’s data centers. It will likely require the industry to move in tandem, given the sheer pace of GPU upgrades from Nvidia, and soon AMD and possibly Intel.

We’re covering how another critical part of the supply chain is working to address power consumption this week for our premium members. Learn more here.

Conclusion

AI power demand is forecast to rise at a rapid rate. GPU demand is showing no signs of slowing as Big Tech continues to spend billions on AI infrastructure, with each GPU generation seeing higher peak power consumption. The industry is quickly taking steps to address this, and power consumption, or more specifically, power efficiency per chip, looks to be emerging as the third realm of competition.

We’ve covered the first two realms of competition, raw computing power and memory, extensively in previous analysis, including “Here’s Why Nvidia will Reach $10 Trillion in Market Cap.” We think it’s important to keep a keen eye on this space, as new winners will emerge as AI power consumption becomes mission-critical.

Visit us at io-fund.com and sign up for our free newsletter covering tech stocks weekly.

