AI's Billion-Dollar Problem: Why Today's Investments May Never Pay Off -- Redmondmag.com

As companies pour unprecedented money into AI, soaring compute costs, limited model differentiation and an unsustainable business model are raising questions about whether the industry's massive bets can ever deliver real returns.

By Joey D'Antoni
12/10/2025

Since the 2022 launch of ChatGPT, AI has been the singular focus of the larger technology industry and of many "regular" companies that have turned their IT investments into AI investments.

At Microsoft alone, in 2025, we saw a pledge to spend $80 billion on AI investments, which was one of the primary reasons for many of the layoffs we saw this year in Redmond and beyond. Gartner said in a September report that worldwide AI spending in 2025 will top $1.5 trillion (that wasn't a typo). While I feel as though that number may be an exaggeration, we have undoubtedly seen massive investments, along with speculation that the industry at large is a bubble, much like the early dotcom era (though this one is very different; more on that later).

As a technology professional myself, I rate myself somewhere in the middle of the AI belief scale. LLMs and other AI tools will remain part of our development toolbox forever. In my current project, I've used vector search to provide a more robust search experience and an LLM to summarize text from web-scraped screenshots. I've also used GitHub Copilot and other LLMs to help with coding and troubleshooting. Here's where I differ from the true believers. While those LLMs are helpful, the computational costs, environmental impact and potential societal impact are not necessarily worth the benefits. Even bigger than that, the investments that firms like Microsoft are making will likely not pay off in the end. In this column, and in subsequent ones, we'll look at each of these problems in detail. I want to focus first on the business model.

There are two major compute cost components for running an LLM: training and inference (taking a user's input, in the form of a prompt, and processing it using the language model's parameters to generate output such as text, code or translations). We'll dig into each of these independently. If you're using an LLM today and you aren't paying a lot of money for it, the investors of the company running your AI model are subsidizing your usage.

You've also probably heard the word "datacenter" more in the last six months than in the rest of your lifetime -- and you might wonder why LLMs need so much compute. Even during the buildup of the massive cloud datacenters that Microsoft and AWS rolled out, you didn't hear about this same need for power and datacenter space. Unlike a traditional software application, an LLM like ChatGPT must perform billions of tensor operations every time it generates a response to a user. Serving millions of user queries thus translates to a massive cumulative energy draw. It's essential to understand this -- even further advancements in LLM technology likely won't reduce the compute required for inference. There isn't a notion of a cache, where you could persist pre-calculated results, because every query is unique. There's a chart in this Financial Times article that shows that as OpenAI has added users, its inference costs have gone up exponentially.
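To put a rough scale on the per-response compute described above, here is a minimal sketch. It relies on a widely used rule of thumb that one forward pass costs about two floating-point operations per model parameter per generated token; it ignores prompt processing and attention over long contexts, so it understates real workloads. The model size, response length and query count are hypothetical placeholders chosen for illustration, not figures for ChatGPT or any other product.

```python
# Rough inference-compute estimate.
# ASSUMPTION: ~2 floating-point operations per parameter per generated token.
# ASSUMPTION: the model size, token count and query volume below are
# hypothetical placeholders, not measurements from any vendor.

FLOPS_PER_PARAM_PER_TOKEN = 2


def flops_per_response(num_params: float, output_tokens: int) -> float:
    """Approximate operations needed to generate one response."""
    return FLOPS_PER_PARAM_PER_TOKEN * num_params * output_tokens


def flops_for_queries(num_params: float, output_tokens: int, queries: int) -> float:
    """Approximate operations needed to serve a batch of similar queries."""
    return flops_per_response(num_params, output_tokens) * queries


ASSUMED_PARAMS = 70e9        # hypothetical 70-billion-parameter model
ASSUMED_TOKENS = 500         # hypothetical response length
ASSUMED_QUERIES = 1_000_000  # hypothetical query volume

per_response = flops_per_response(ASSUMED_PARAMS, ASSUMED_TOKENS)
total = flops_for_queries(ASSUMED_PARAMS, ASSUMED_TOKENS, ASSUMED_QUERIES)

print(f"Operations per response:   {per_response:.1e}")
print(f"Operations for all queries: {total:.1e}")
```

Because nothing is reused between queries in this model, the total grows in direct proportion to the number of queries, which is why serving volume drives hardware and energy demand.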

The training costs for commercial LLMs like Claude and ChatGPT are hard to pin down exactly. Reports range from hundreds of millions of dollars to as high as $2.5 billion (that number was from an HSBC report in June 2024). Each new release may require multiple training cycles, each with very high costs for the thousands or even millions of expensive GPU cores required. Beyond the sheer computational costs, there are also costs associated with preparing and acquiring data, as well as pre- and post-training tasks. To achieve any level of differentiation and, therefore, significant price premiums, extensive training is required.

Given all those training costs, there isn't much differentiation between models. Calm down, AI true believers -- the first question you get asked by any die-hard AI fan when you complain about something an LLM does is, "Did you use the latest model that came out approximately 3.43 minutes ago?" When you answer "no," the response is usually something to the effect of "pfft, amateur." Strong Linux, 1999 vibes. In testing I've done and in my usage, sure, Anthropic's Claude models are better at tasks related to writing code. Still, ChatGPT and even open models running under Ollama can do an OK job. For my vectorization and summarization tasks, I'm running my own small instance of Ollama and the cheapest AWS Nova model, respectively. In both cases, they meet the project's requirements. Could a newer model with more parameters deliver slightly better results? Possibly, but the difference isn't worth the effort of even testing.

This lack of differentiation between models is why I struggle to see any of the AI firms hitting it big. In the 1980s, when Oracle took a strong foothold in the relational database market, it had a clear product advantage. While SQL Server came out in 1989, Microsoft didn't get on an even playing field with Oracle until, at best, 2005. There's no clear leader in this AI race -- no single model does everything well. The worst business part of this equation is that if one model improves, the others need to spend more money on training to catch up and gain a brief advantage. The only moat the AI models have against each other is the cost of hardware required for training. Additionally, outside competitors with novel approaches, like DeepSeek, are always a threat.

Earlier, I compared the dotcom bubble to the current situation in the AI market. However, there are several significant differences. The artifacts of the dotcom bubble bursting included massive fiber-optic networks (which don't degrade rapidly over time), considerable evolution in edge networking equipment, and development skills, all of which helped power the actual growth of the internet and e-commerce. The only artifact of the AI bubble is the massive amount of energy spent on datacenters and GPUs that will likely be outdated within two to three years at best. We are also degrading the skills of our next generation of developers, but that's a story for another column.

To bring this back to the economic story, I think about this in a personal context. What would I pay for an LLM to help me with my job? I currently use GitHub Copilot (disclosure: I get GitHub Copilot via sponsorship from my Microsoft MVP award). Generally speaking, it enhances my development productivity by somewhere between 10 and 20 percent. You could argue with those numbers, but Copilot screws up a lot, and that's a lot of time wasted when I could have just coded something manually. The Pro SKU I use costs $10/month. I suspect that, at my current usage, Microsoft is losing a lot of money on me. That supposition is based on what similar usage would cost through a model API in Microsoft Foundry.

If I had to pay the actual costs, I'd be willing to pay $500/month (I'm being generous here). JetBrains says the number of professional developers worldwide is approximately 20 million. If every developer in the world were paying $500/month (surely it's easy to sell a $500/month service to every developer in the world, right?), that would be $120 billion a year. Sounds promising.
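The arithmetic behind that figure can be checked in a few lines. This is a back-of-the-envelope sketch using only the numbers quoted in this column (20 million developers, $500/month, and the $10/month Pro price); the 10 percent adoption case is a hypothetical added purely for comparison.

```python
# Revenue ceiling from the column's figures.
DEVELOPERS = 20_000_000      # JetBrains estimate quoted in the column
MONTHS_PER_YEAR = 12


def annual_revenue(users: int, monthly_price_usd: int) -> int:
    """Yearly subscription revenue for a given user count and monthly price."""
    return users * monthly_price_usd * MONTHS_PER_YEAR


# Every developer paying the author's hypothetical $500/month.
full_adoption = annual_revenue(DEVELOPERS, 500)

# Every developer paying today's $10/month Pro price.
current_price = annual_revenue(DEVELOPERS, 10)

# ASSUMPTION: a hypothetical 10 percent adoption rate at $500/month,
# included only to show how quickly the ceiling drops.
partial_adoption = annual_revenue(DEVELOPERS // 10, 500)

print(f"All developers at $500/month: ${full_adoption / 1e9:.1f} billion/year")
print(f"All developers at $10/month:  ${current_price / 1e9:.1f} billion/year")
print(f"10% of developers at $500:    ${partial_adoption / 1e9:.1f} billion/year")
```

Even the $120 billion best case assumes universal adoption at 50 times the current price.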

Remember, all the available evidence suggests that our inferencing costs will increase faster than our revenue, especially as you scale users. That means that to serve all those users, you need even more expensive compute infrastructure, and you're making less money per user. This math is the crux of the AI economics problem. The AI vendors could fix this by charging actual costs for the use of AI APIs and tokens; however, that probably wouldn't allow for 20 million customers.
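A toy model shows why that squeeze matters. It assumes revenue grows in direct proportion to users while total serving cost grows faster than that. The power-law cost curve and both of its parameters are invented for illustration only; they are not derived from any vendor's financials, and real cost curves may look quite different.

```python
# Toy model: linear revenue versus faster-than-linear serving cost.
# ASSUMPTION: cost = COST_COEFF * users ** COST_EXPONENT is a made-up curve;
# both parameters are hypothetical and chosen only to illustrate the shape.

ANNUAL_PRICE_USD = 500 * 12   # the column's $500/month, per year
COST_COEFF = 2.0              # hypothetical cost scale factor
COST_EXPONENT = 1.5           # hypothetical: greater than 1 means superlinear


def annual_margin(users: float) -> float:
    """Revenue minus serving cost for a given number of paying users."""
    revenue = ANNUAL_PRICE_USD * users
    cost = COST_COEFF * users ** COST_EXPONENT
    return revenue - cost


def break_even_users() -> float:
    """User count at which linear revenue equals superlinear cost."""
    return (ANNUAL_PRICE_USD / COST_COEFF) ** (1 / (COST_EXPONENT - 1))


for users in (1_000_000, 5_000_000, 20_000_000):
    margin = annual_margin(users)
    print(f"{users:>12,} users: margin per user ${margin / users:,.0f}")

print(f"Break-even at about {break_even_users():,.0f} users")
```

Under these made-up parameters, margin per user shrinks as the user base grows and turns negative past the break-even point, which is the pattern the paragraph above describes.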

About the Author

Joseph D'Antoni is an Architect and SQL Server MVP with over two decades of experience working in both Fortune 500 and smaller firms. He holds a BS in Computer Information Systems from Louisiana Tech University and an MBA from North Carolina State University. He is a Microsoft Data Platform MVP and VMware vExpert. He is a frequent speaker at PASS Summit, Ignite, Code Camps, and SQL Saturday events around the world.
