Token Pricing May Force a Reality Check on Enterprise AI Costs -- Redmondmag.com

GitHub Copilot's shift to usage-based pricing could signal a broader move away from unlimited AI access as providers and customers confront the economics of large language models.

By Joey D'Antoni
06/29/2026

In December, I wrote a column about the economics of AI, and in talking about how I thought Microsoft was subsidizing GitHub Copilot, I wrote, "If I had to pay the actual costs, I'd be willing to pay $500 a month (I'm being generous here)." Well, starting June 1, Microsoft introduced token-based pricing, meaning that the $15-100 monthly plans for GitHub Copilot just act as monthly credits. This pricing change means that users will now pay for their actual usage against the various AI APIs from Anthropic, OpenAI, Microsoft and others. While GitHub Copilot was one of the first services to adopt usage-based pricing, it most certainly won't be the last (a recent change to Claude Code now highlights actual token usage during sessions, a sure sign that token pricing is coming).

I said in December that I didn't see how this "all you can eat usage" model was sustainable, and I'd like to talk about why. In the Microsoft April 2026 earnings call, CFO Amy Hood said the following: "Gross margin percentage decreased year over year, driven by continued AI investment and increased GitHub Copilot usage, partially offset by ongoing efficiency gains in Azure."

Copilot usage consumed enough compute resources to materially impact Microsoft's margins -- enough for the CFO to call it out.

Tokens are the building blocks of text that LLMs process both as input and as output. Spaces, punctuation, and the amount of context you are using in your interactions with an LLM all contribute to token counts. Roughly speaking, one token is equivalent to four English characters, and one paragraph is roughly equivalent to 100 tokens. Most models are priced per 1 million tokens. In GitHub Copilot, each model has its own input and output multipliers. For example, GPT-5-mini (a much older model) has a multiplier of 0.33, which means each one million tokens will cost roughly $2.00. On the other hand, the popular Claude Opus models have a multiplier of 27 and will cost roughly $25.00 per million tokens. While I feel these numbers are fair and probably much closer to Microsoft's actual cost, it took me several reviews of the docs to identify these prices, and the reality is that the pricing is much more complex than these basic calculations suggest.
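To make the arithmetic above concrete, here is a minimal Python sketch. It assumes the four-characters-per-token rule of thumb rather than a real tokenizer, and it treats the rough per-million figures quoted above as a single blended price, ignoring the separate input and output rates that real billing uses. The function names and the model keys are hypothetical, chosen only for illustration.

```python
import math

# Rule of thumb from the column: about four English characters per token.
# This is an approximation, not what a real tokenizer would report.
CHARS_PER_TOKEN = 4

# Rough per-million-token prices quoted in the column (USD).
# Assumption: one blended price per model, with no input/output split.
PRICE_PER_MILLION_USD = {
    "gpt-5-mini": 2.00,
    "claude-opus": 25.00,
}


def estimate_tokens(text: str) -> int:
    """Estimate the token count of a string from its character length."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_cost_usd(tokens: int, price_per_million_usd: float) -> float:
    """Estimate the cost of a number of tokens at a per-million price."""
    return tokens * price_per_million_usd / 1_000_000


if __name__ == "__main__":
    paragraph = "x" * 400  # stand-in for a roughly 400-character paragraph
    tokens = estimate_tokens(paragraph)
    for model, price in PRICE_PER_MILLION_USD.items():
        cost = estimate_cost_usd(tokens, price)
        print(f"{model}: {tokens} tokens, about ${cost:.6f}")
```

A single paragraph costs a tiny fraction of a cent on either model. The bill comes from volume: large contexts and long agent sessions push the token count into the millions.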

In my early experiments with AI, I thought GitHub Copilot was a wonderful solution -- low cost, access to a variety of models for experimentation and cheaper than some of its competitors from OpenAI and Anthropic. However, token-based pricing has changed that equation somewhat dramatically. In my private discussions with other MVPs and on the GitHub Copilot subreddit, I see customers running out of their provisioned token allotment within a few days of the month starting. The consensus is that users are having to limit their usage to much narrower tasks than, say, "document this entire codebase," to manage token usage. The other complaint I've heard from many folks is that GitHub's native tooling for tracking usage is limited. There are some open-source solutions you can add to VS Code (there's an extension I haven't used called Copilot Insights, which MVP colleagues recommended); however, the native billing system seems to lack granularity and timeliness. While I expect everyone to move to a token-based pricing model in the next 12 months, GitHub has put itself on an island with these pricing changes.
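A rough calculation shows how a monthly credit can disappear within days. The Python sketch below is illustrative only: the daily token volume is a made-up figure, the price is the approximate Claude Opus number quoted earlier in this column, and the function name is hypothetical.

```python
import math


def days_until_credit_exhausted(
    monthly_credit_usd: float,
    daily_tokens: int,
    price_per_million_usd: float,
) -> float:
    """Return how many days a monthly credit lasts at a steady daily token rate."""
    daily_cost_usd = daily_tokens * price_per_million_usd / 1_000_000
    if daily_cost_usd == 0:
        return math.inf  # no usage means the credit is never consumed
    return monthly_credit_usd / daily_cost_usd


if __name__ == "__main__":
    # Hypothetical usage: 200,000 tokens a day on a model priced at
    # roughly $25 per million tokens, against a $15 monthly credit.
    days = days_until_credit_exhausted(15.00, 200_000, 25.00)
    print(f"Credit lasts about {days:.1f} days")
```

Under those assumptions the credit covers about three days of use, which is consistent with the reports of allotments running out early in the month.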

There is a lot of industry analysis on what companies spend on information technology (IT). IT spend varies somewhat dramatically by industry, but it's typically between 4 and 12 percent of revenue. This spend is typically lower in industries like manufacturing and higher in financial services. I've written a lot in this space about the economics of cloud computing, and I've also consulted with companies, very large and very small, on the subject. One of the few commonalities between a company like Exxon and a small non-profit is that they both hate variable-priced cloud services, especially when usage is unpredictable.

While early AI experiments were very much pie-in-the-sky "let's do anything to make AI work in our company," the economic reality of AI seems to be coming into play on both the AI provider and customer sides. No single study has quantified massive productivity gains at scale from AI solutions. That's not to say it's useless -- I use Claude for coding projects nearly every day, but it's a marginal gain rather than any sort of 10x performance multiplier. In my experience, even those marginal gains are mostly in IT and development organizations, with other parts of the business seeing much fuzzier results. Not having quantifiable results is fine for a limited pilot project, but as costs continue to increase, businesses may start moving that spend from AI solutions to other projects. Or they may focus on building their own smaller AI solutions in-house to meet specific needs.

There have been other steep costs in IT orgs -- one only needs to mention the dreaded phrase, "Oracle licensing audit," and IT managers of all types will run screaming. A better comparison is cloud computing, which is frequently cited as being "more expensive" than running your own hardware in your own datacenter. While the sheer cost of cloud can be higher than on-premises (especially for very large orgs), it also delivers value such as high availability, more robust security controls and levels of automation that most smaller orgs never had a chance of implementing without it. This makes for a clear value proposition, unlike AI, which just seems to burn money so that our developers can close more pull requests.

Token-based pricing is going to change how many organizations use LLMs. I suspect the days of "tokenmaxxing" and token leaderboards will soon end. I think the trend will be tighter model selection, self-hosted models, and overall token budgets, which will limit AI use. As Anthropic and OpenAI go public, there will be greater pressure on them to become profitable, leading to higher prices. Everyone I've talked to seems unhappy with the GitHub Copilot changes, but I think you will see that throughout the AI ecosystem in the next 12 months.

About the Author

Joseph D'Antoni is an Architect and SQL Server MVP with over two decades of experience working in both Fortune 500 and smaller firms. He holds a BS in Computer Information Systems from Louisiana Tech University and an MBA from North Carolina State University. He is a Microsoft Data Platform MVP and VMware vExpert. He is a frequent speaker at PASS Summit, Ignite, Code Camps, and SQL Saturday events around the world.
