A CFO’s guide to the expected pricing shift in AI and a short checklist to help you prepare
We’ve been living through a remarkable period of cognitive arbitrage. For the past few years the major AI engines have served up tokens at prices that don’t reflect the true cost of the hardware running underneath them. The tech giants have leaned on enormous balance sheets, venture capital and corporate debt to capture share. It has all the hallmarks of classic land-grab behaviour.
As we move through 2026, the sentiment in Australian boardrooms is quietly shifting from starry-eyed excitement to financial caution. The bills are landing, and they’re heavy. Leaders are noticing a rapid rise in AI costs. Tension is building between IT teams eager to innovate and finance teams staring at fluctuating operating expenses.
The question finance leaders should be asking isn’t whether AI is worth it. It’s who ends up paying for the infrastructure being built to deliver it, and whether the way you’re charged today will survive contact with the economics underneath.
CFOs need to know where the costs really sit, why they keep compounding, and how to be positioned if the pricing model shifts under them.
And the data backs up the mood. In Writer’s 2026 Enterprise AI report, 79% of organisations now report significant challenges in adopting AI, a double-digit jump on the prior year. Nearly half of executives (48%) describe their AI adoption as a “massive disappointment”, and only 29% say they’re seeing significant return on generative AI. The gap isn’t between AI working and not working. It’s between individual productivity gains and organisational return.
Gartner’s analysts have been blunt about why. Traditional cloud software costs are predictable. AI costs are dynamic, dictated in real time by how complex a prompt is, how many tokens a model produces, and how often customers hit AI-powered features. It’s an operating-budget nightmare. Gartner now predicts that more than 40% of agentic AI projects will be cancelled by the end of 2027, citing escalating costs, unclear business value and inadequate risk controls. Whilst cost is only one of three drivers, it’s the one finance leaders feel first.
The supplier side explains the pressure
To understand where pricing is heading, we can follow the money on the supply side.
Consensus estimates now put capital expenditure by the largest cloud and AI infrastructure providers at roughly $650 billion in 2026 alone, with Goldman Sachs projecting hyperscaler capex of around $1.15 trillion across 2025 to 2027. That’s more than double the prior three-year period.
Will that spend pay off? In a widely cited 2024 analysis, Sequoia’s David Cahn calculated that the AI ecosystem needed about $600 billion in annual revenue to justify the infrastructure being built, against actual AI revenue closer to $100 billion. On Cahn’s own method, which doubles Nvidia’s run-rate revenue to capture full data-centre cost, the 2026 number would be larger still. The gap hasn’t closed. If anything it looks as if it’s widened.
Even the market leader shows the strain. Sacra estimates OpenAI hit a $25 billion annualised revenue run rate in February 2026. Not bad. On the other side of the ledger, independent forecasters put its 2026 GAAP loss at $25 to $26 billion, well above the $14 billion non-GAAP figure that dominates headlines. Inference costs alone are projected to reach $14.1 billion this year.
Here’s the part I think is worth paying close attention to, because it’s where most commentary goes wrong. OpenAI runs at roughly a 33% gross margin. A positive gross margin means inference, on a blended basis, is priced above its incremental compute cost. They are not losing money on every prompt you run. The losses sit below that line, in training, research, stock compensation and the relentless build-out of capacity. The subsidy isn’t in the prompt. It’s in the $650 billion arms race. And as the next two sections show, that build-out isn’t a one-off cost that ends when the data centres switch on.
The bill that doesn’t stop: the replacement treadmill
The headline capex numbers create a comforting illusion. This is not a one-time investment after which the meter stops running, and there are two reasons why.
The first is depreciation, and it’s the live controversy right now. By 2023 the major hyperscalers had quietly extended the assumed useful life of their servers from three or four years to six, a change that collectively trimmed an estimated $18 billion a year from reported depreciation expense and flattered earnings accordingly. The problem is that Nvidia ships a new architecture roughly every 18 to 24 months, and its flagship parts run on a three-year cycle. Microsoft’s own annual filing concedes its computer equipment lasts somewhere between two and six years. Investor Michael Burry, who called the last financial crisis, has gone further, arguing the real economic life of these GPUs is closer to two or three years and that the industry is understating depreciation by something like $176 billion across 2026 to 2028. Reporting bears out the direction: high-end accelerators can lose more than half their resale value by year three.
What does that mean for the true cost of the build-out? Epoch AI’s component-level analysis of a one-gigawatt AI data centre is the clearest illustration. The up-front capex is about $38 billion, but annualised across each asset’s lifespan the real total cost of ownership is roughly $8.5 billion a year, of which servers alone account for around $5 billion, or 60%. Crucially, that assumes a five-year IT lifespan. Shorten it to three years, which is closer to what the critics argue, and the annual cost climbs to about $12 billion. The most expensive component in the entire stack is also the one that has to be torn out and replaced most often. The build-out is less a purchase than a subscription that can’t be cancelled.
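The lifespan sensitivity can be checked with straight-line arithmetic. The sketch below is a minimal illustration, not Epoch AI’s model. It assumes server capex of about $25 billion (inferred from roughly $5 billion a year over five years) and holds the remaining $3.5 billion a year constant. The function and variable names are hypothetical.

```python
# Straight-line sketch of how server lifespan changes annualised cost.
# Figures are in billions of USD and come from the text above; SERVER_CAPEX
# is an inference (about $5bn a year over five years), not a published number.

TOTAL_ANNUAL_AT_5Y = 8.5    # annualised total cost of ownership, five-year IT life
SERVER_ANNUAL_AT_5Y = 5.0   # server share of that annual figure
BASE_LIFESPAN_YEARS = 5

SERVER_CAPEX = SERVER_ANNUAL_AT_5Y * BASE_LIFESPAN_YEARS   # about 25
OTHER_ANNUAL = TOTAL_ANNUAL_AT_5Y - SERVER_ANNUAL_AT_5Y    # about 3.5, held constant


def annual_cost(server_lifespan_years: float) -> float:
    """Annualised cost if servers are written off over the given lifespan."""
    if server_lifespan_years <= 0:
        raise ValueError("lifespan must be positive")
    return OTHER_ANNUAL + SERVER_CAPEX / server_lifespan_years


if __name__ == "__main__":
    for years in (5, 4, 3, 2):
        print(f"{years}-year server life: about ${annual_cost(years):.1f}bn a year")
```

On these assumptions a three-year life gives roughly $11.8 billion a year, in line with the “about $12 billion” figure above. A two-year life, the low end of the critics’ range, would push it to about $16 billion.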
The second reason is operating cost. The same Epoch analysis puts annual operating expenditure for that one-gigawatt facility near $0.9 billion, covering power, cooling, maintenance and staff. Power and cooling dominate, and the load only rises as rack densities climb past 100kW and liquid cooling becomes mandatory. Industry rule-of-thumb work suggests cumulative operating expenditure can exceed a facility’s original capital cost within five to seven years. Switching the lights on is where the recurring bill starts, not where it ends.
For a provider carrying both of these, the temptation to recover the cost through pricing power, rather than through the cents-per-token they currently charge, is obvious.
And the costs regulators are about to add
If the replacement treadmill and the energy bill weren’t enough, there’s a third cost vector now forming, and it’s the one most likely to be passed straight through to customers: regulation and the environmental backlash driving it.
In the United States the shift has been rapid. Lawmakers across more than 30 states have introduced over 300 data-centre-related bills so far in 2026 alone. These span moratoriums, tariffs and energy policy, with at least 11 states weighing temporary construction bans while they study grid and water impacts. More than 230 environmental organisations have called for a national moratorium. The mechanisms under discussion are exactly the kind that raise the cost of compute: special tariffs for large loads, cost-shifting rules that force data centres rather than households to fund grid upgrades, and water fees. The proposed federal GRID Act’s off-grid power mandate alone is estimated to add between $500 million and $2 billion in upfront cost per hyperscale facility. The pressure isn’t hypothetical, either: PJM, the largest US grid operator, attributed part of a 76% wholesale price increase in the first quarter of 2026 to data-centre load.
Australia is on the same trajectory. It’s just earlier in the cycle. AEMO estimates Australian data centres consumed about 3.9 TWh in FY25 (roughly 2% of National Electricity Market demand) and forecasts that figure growing at around 25% a year to 12 TWh by FY30. Transgrid has warned that without new policy, the cost of upgrading the grid to serve power-hungry data centres could be passed on to existing household and business customers. In March 2026 the Australian Energy Market Commission released a draft rule proposing tougher technical standards for large data-centre connections to the grid, prompted in part by overseas incidents where dozens of facilities dropped off the grid simultaneously during a fault. The development pipeline is enormous: 44 facilities totalling 11.4 GW in New South Wales alone as at March 2026. The regulatory and ratepayer scrutiny is scaling with it.
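The AEMO growth figures can be sanity-checked with simple compounding. This is a rough sketch that assumes a constant 25% annual rate over the five years from FY25 to FY30. AEMO’s actual forecast path may not be smooth, and the function name is hypothetical.

```python
# Compound-growth check on the AEMO figures quoted above.
BASE_TWH = 3.9        # FY25 consumption, from the text
GROWTH_RATE = 0.25    # roughly 25% a year, from the text
YEARS = 5             # FY25 to FY30


def project(base: float, rate: float, years: int) -> float:
    """Compound a starting value forward at a constant annual rate."""
    return base * (1 + rate) ** years


if __name__ == "__main__":
    for n in range(YEARS + 1):
        print(f"FY{25 + n}: {project(BASE_TWH, GROWTH_RATE, n):.1f} TWh")
```

Five years of 25% growth roughly triples the starting figure, which lands at about 11.9 TWh and matches the 12 TWh forecast.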
None of this stays with the operators. Tariffs, levies, grid-connection costs and compliance overhead all flow into the cost base of the firms selling you tokens. It’s one more reason the people running these platforms will be looking hard at how they price.
Why all roads appear to be leading to value-based pricing
Put the three together: an infrastructure arms race that has to be re-bought every few years, an operating bill that compounds, and a regulatory cost layer just beginning to land. You can see why analysts increasingly suspect providers will shift from cost-based pricing (charging for tokens) to value-based pricing (charging for outcomes). If an AI agent saves your firm a million dollars in labour, the provider won’t be satisfied charging you two dollars for the tokens. They’ll want a share of the savings to help plug a multi-billion-dollar hole.
Nothing here is set in stone, and the smartest move isn’t panic. It’s positioning. Build resilient frameworks now to give you more control, before the pricing model shifts under you.
Regaining cost control
You don’t need to pull the plug on your automated workflows. You need to design your systems to treat models as commodities rather than permanent partners, and to keep cost visibility and optionality on your side of the table. Two practical examples show how.
Disclaimer: we chose these two examples because we have hands-on experience with them. Other options exist, and you should do your own research.
A spend gateway: Cloudflare’s AI Gateway
Instead of letting developers wire directly into an expensive external API, where a rogue piece of code can burn the budget over a weekend, route traffic through an abstraction layer. Cloudflare’s AI Gateway acts as a financial buffer. It tracks real-time input and output token costs and lets you set hard, dollar-based spend limits, and it can fall back to a cheaper or self-hosted model when an application hits its ceiling or for tasks that simply don’t require that level of grunt. Pair that with semantic caching, so you’re not paying twice for the same prompt, and you have a real defence against cost spikes.
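The gateway pattern described above can be sketched in a few lines. This is a hypothetical illustration of the idea, not Cloudflare’s API: the class, the model names and the prices are all made up. The cache is an exact match on normalised text, standing in for semantic caching, which would compare meaning rather than strings.

```python
# Hypothetical spend gateway: hard dollar cap on the primary model, a cheaper
# fallback, and a simple cache. Not Cloudflare's API; names and prices are
# invented for illustration.
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

# (model name, prompt) -> (response text, input tokens, output tokens)
ModelCall = Callable[[str, str], Tuple[str, int, int]]


@dataclass
class SpendGateway:
    primary: str
    fallback: str
    price_per_million: Dict[str, Tuple[float, float]]  # model -> (input, output) USD
    primary_limit_usd: float
    spend_usd: Dict[str, float] = field(default_factory=dict)
    cache: Dict[str, str] = field(default_factory=dict)

    def _pick_model(self) -> str:
        # Once primary spend reaches the cap, route everything to the fallback.
        # The check runs before each call, so the last primary call can overshoot.
        if self.spend_usd.get(self.primary, 0.0) >= self.primary_limit_usd:
            return self.fallback
        return self.primary

    def complete(self, prompt: str, call_model: ModelCall) -> str:
        key = " ".join(prompt.lower().split())  # exact match after normalising
        if key in self.cache:
            return self.cache[key]              # cache hit: no tokens billed
        model = self._pick_model()
        text, tokens_in, tokens_out = call_model(model, prompt)
        price_in, price_out = self.price_per_million[model]
        cost = (tokens_in * price_in + tokens_out * price_out) / 1_000_000
        self.spend_usd[model] = self.spend_usd.get(model, 0.0) + cost
        self.cache[key] = text
        return text


if __name__ == "__main__":
    def fake_call(model: str, prompt: str) -> Tuple[str, int, int]:
        # Stand-in for a real API client: every call uses 1,000 tokens each way.
        return f"[{model}] reply", 1_000, 1_000

    gateway = SpendGateway(
        primary="frontier-model",
        fallback="small-model",
        price_per_million={"frontier-model": (10.0, 30.0), "small-model": (0.5, 1.5)},
        primary_limit_usd=0.05,
    )
    for i in range(4):
        print(gateway.complete(f"Summarise invoice {i}", fake_call))
    print(gateway.complete("summarise  invoice 0", fake_call))  # served from cache
    print(gateway.spend_usd)
```

With these placeholder prices, the first two requests go to the primary model, the cap trips, and the rest are routed to the fallback. The repeated prompt is answered from the cache at no cost. A production gateway would also need per-team budgets, budget resets and cache expiry, none of which are shown here.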
This is where genuinely hybrid architecture earns its keep. Lenovo’s 2026 total-cost-of-ownership research (n.b. this is vendor research with a commercial interest in the conclusion) claims that for high-utilisation pipelines, owning the infrastructure can deliver up to an 18x cost advantage per million tokens against frontier model-as-a-service APIs, and a more conservative 8x against cloud IaaS. The down-routing logic above is how you capture that saving without abandoning the public cloud entirely.
Predictable per-seat licensing: Microsoft 365 E7
On the internal-operations side, for organisations heavily embedded in the Microsoft environment, Microsoft’s M365 E7 “Frontier Suite” is a useful study in budget predictability. At a flat USD 99 per user per month (annual billing), it bundles E5 security, Copilot and the new Agent 365 governance layer into a single, forecastable line item. You set up and manage agents almost as if they were employees, with similar visibility and controls. It’s also model-diverse by design, running Anthropic’s Claude alongside OpenAI’s models, so you’re not locked to one model vendor inside the suite.
You need to be clear-eyed about what E7 is and isn’t, though. It buys you per-seat predictability for daily office work, which is real and valuable. It does not give you the local-plus-cloud hybrid architecture described in the Cloudflare example, because it’s a cloud SaaS bundle. And it deepens your dependence on a single platform vendor, the very concentration risk the rest of this strategy is designed to reduce. Reports before launch even suggested E7 might carry consumption-based pricing before Microsoft confirmed it wouldn’t, a reminder that today’s fixed price is a commercial decision, not a law of nature. Use E7 for predictability with your eyes open to the lock-in it creates. Even so, for many organisations already embedded in the Microsoft ecosystem, it has to be on the evaluation list.
The CFO’s strategic AI checklist
Finance leaders need to evaluate AI investments for the world three years out, not just today. A few questions to ask before signing off the next deployment:
- Are we building on an abstraction layer? Make sure developers aren’t hard-coding a single vendor’s API into your stack. If a provider hikes rates or shifts to value-based pricing, you want to swap engines without rewriting your codebase.
- How are we managing token waste? Semantic caching through a gateway like Cloudflare stops you paying full price for identical or repetitive queries. Paying twice for the same prompt is the modern equivalent of leaving the lights on all weekend.
- What’s our protection against a pricing-model switch? Review contracts for any signal of a move from per-token cost to a percentage of “business value generated” or a success fee, and make sure you hold an exit or renegotiation clause.
- Can a Small Language Model do this job? Not every task needs a frontier model. Route simpler, high-volume work to smaller specialised models, ideally on infrastructure you control, to cut variable cost.
- Does our licensing match our usage density? Compare flat per-seat tiers like E7 against your current ad-hoc token spend. If staff are heavy users, a structured tier can save real money, provided you’ve priced in the lock-in.
- What’s the true switching cost if we leave? Before you embed a tool in daily operations, calculate the friction of moving off it. A tool that makes the team 50% faster today but carries dramatic price-hike risk tomorrow needs a migration plan attached.
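The licensing question in the checklist comes down to a break-even calculation. The sketch below is a rough illustration: the USD 99 seat price is from the text, while the blended token price is a placeholder to be replaced with your own figures. Because a bundle like E7 also covers security and governance, comparing its full price against token spend alone overstates the usage needed to justify it.

```python
# Break-even sketch: flat per-seat licence versus pay-per-token usage.
# SEAT_PRICE_USD comes from the text; the blended token price is a placeholder.

SEAT_PRICE_USD = 99.0               # per user per month
BLENDED_PRICE_PER_MILLION = 10.0    # hypothetical USD per million tokens


def monthly_token_cost(tokens_per_user: int, price_per_million: float) -> float:
    """Pay-as-you-go cost for one user in a month."""
    return tokens_per_user * price_per_million / 1_000_000


def break_even_tokens(seat_price: float, price_per_million: float) -> float:
    """Monthly tokens per user at which the seat price equals token spend."""
    if price_per_million <= 0:
        raise ValueError("price must be positive")
    return seat_price / price_per_million * 1_000_000


if __name__ == "__main__":
    threshold = break_even_tokens(SEAT_PRICE_USD, BLENDED_PRICE_PER_MILLION)
    print(f"Break-even: {threshold:,.0f} tokens per user per month")
    for tokens in (1_000_000, 5_000_000, 20_000_000):
        cost = monthly_token_cost(tokens, BLENDED_PRICE_PER_MILLION)
        cheaper = "seat" if cost > SEAT_PRICE_USD else "tokens"
        print(f"{tokens:>12,} tokens: ${cost:,.2f} pay-as-you-go -> {cheaper} cheaper")
```

At the placeholder price, a user would need to consume about 9.9 million tokens a month before the flat seat costs less than metered usage. Run it against your actual per-user token logs rather than an average, since usage density tends to vary widely between staff.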
So, who pays when the bills land? Today, the providers are still absorbing much of it to hold your business. That won’t last, because the underlying economics were never sustainable at the prices we’ve enjoyed. The firms that come through this well won’t be the ones that adopted fastest. They’ll be the ones that kept optionality, watched the total cost of ownership rather than the sticker price, and built the discipline to treat AI as a commodity input before their providers move the cost back onto the customer.