The trap of AI-assisted software development

Written by Eugene Naydenov | Jul 3, 2026 3:43:31 PM

AI coding tools promise a 30x increase in productivity. The questions follow on their own. If engineers are 30 times faster, why not cut the R&D budget? Or why not stop paying $100,000+ for the software a junior developer could build in a weekend?

This narrative is everywhere in vendor decks and product launches: software has never been faster or cheaper to build.

Many believe the savings translate directly into profit. Some have started planning to build everything in-house, predicting a “SaaS apocalypse” in which expensive third-party software becomes obsolete.

However, this is a costly miscalculation. The 30x promise measures one thing: how much code gets written per hour. The actual cost accumulates everywhere else: in testing, in maintenance, and across the full life of a system in production.

Coding effort: how much can AI replace

The thinking behind the “why not cut the R&D budget?” question falls apart once the production lifecycle is mapped out. Writing code has always been a small part of building a product. Let’s break it down.

Roughly 15% of the effort goes to discovery and design: aligning on product context, defining architecture, and reaching agreement. Implementation (reading existing code, writing new lines, and reviewing them) accounts for about 10%. Another 15% covers testing, QA (quality assurance), deployment, and early monitoring. The remaining 60% goes to operations and maintenance: bug fixes, security patches, dependency updates, incident response, and user support.

The marketed 30x gains land almost entirely inside that 10% implementation slice. Consider an experienced engineer who spends four hours thinking through a complicated issue and one hour writing the solution. If an AI tool cuts that one hour of writing to five minutes, the four hours of thinking, testing, and validating remain. Total time drops from five hours to four hours and five minutes. That is a saving of about 18%, or roughly a 1.2x speedup, not the 10x or 30x figures AI coding ads claim.
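The arithmetic above is an instance of Amdahl's law: when only one share of the work gets faster, the untouched share caps the overall gain. A minimal sketch, using only the figures from this article (the function and variable names are illustrative, not part of any tool):

```python
def overall_speedup(accelerated_share: float, local_speedup: float) -> float:
    """Amdahl's law: whole-job speedup when only one share of the work gets faster."""
    if not 0.0 <= accelerated_share <= 1.0:
        raise ValueError("accelerated_share must be between 0 and 1")
    if local_speedup <= 0:
        raise ValueError("local_speedup must be positive")
    # New total time as a fraction of the old: untouched work plus the shrunken slice.
    remaining_time = (1.0 - accelerated_share) + accelerated_share / local_speedup
    return 1.0 / remaining_time


def time_saved(accelerated_share: float, local_speedup: float) -> float:
    """Fraction of the original total time that disappears."""
    return 1.0 - 1.0 / overall_speedup(accelerated_share, local_speedup)


# The engineer example: 1 of 5 hours is writing, cut from 60 minutes to 5 (12x).
engineer = overall_speedup(accelerated_share=1 / 5, local_speedup=60 / 5)

# The lifecycle view: the 10% implementation slice at the marketed 30x.
lifecycle = overall_speedup(accelerated_share=0.10, local_speedup=30)

print(f"engineer task:  {engineer:.2f}x overall, {time_saved(1 / 5, 60 / 5):.1%} less time")
print(f"full lifecycle: {lifecycle:.2f}x overall, {time_saved(0.10, 30):.1%} less time")
```

Under these assumptions, the single task ends up about 1.22x faster, and a 30x gain confined to the 10% implementation slice makes the whole lifecycle only about 1.11x faster. Even an infinitely fast code generator could not save more than the 10% that implementation represents.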

AI tools do help across the lifecycle: summarizing specifications in discovery, drafting tests, and reading logs in maintenance. But the marketed gains sit in one small slice of implementation. Discovery, testing, and the 60% maintenance load still demand human work, and AI reduces them only marginally. Saving minutes on code generation does not reduce the effort of running a live, enterprise-grade system.

Those inflated benchmarks come from isolated, simple tasks. They look nothing like real product engineering.

You might think: “A speedup of around 20% is still worth having.” The question is what survives the rest of the lifecycle, and what it costs when the system in question sets prices or orders inventory for 1,000+ retail stores.

If a tool produces code that is more complex, poorly structured, or hard to read, it increases the maintenance load. More maintenance means more engineering hours, higher running costs, and a lower return on the software over its lifetime. The hour saved writing code is repaid many times over across the years that the code runs in production.

More code doesn’t mean more profit

Writing more code does not mean delivering more. Teams adopting AI coding tools generate far more of it, and almost none produce results any faster. The constraint shifts from how fast code is written to how fast it can be safely reviewed and shipped.

LinearB’s 2026 Software Engineering Benchmarks Report, drawn from 8.1 million pull requests across 4,800 engineering teams, shows where the time goes. AI-generated code submissions run 2.5x larger than human-written ones. Reviewers do not trust large blocks of code they did not author, so these submissions sit in the queue 4.6 times longer before review begins.

When review does happen, it moves twice as fast, but only 32.7% of that code clears it without rework, against 84.4% for human-written code. The organizational result matters most. By LinearB’s own reading, all that extra code is not translating into greater value delivered. The promised 10x or 30x never becomes a business outcome.
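A rough back-of-the-envelope check shows why the extra volume does not turn into extra value. The sketch below combines the two figures quoted above under a loud assumption that is ours, not LinearB's: that the rework-free rate applies evenly to the code inside a submission, so the code that clears review cleanly is simply size multiplied by that rate. The class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubmissionProfile:
    relative_size: float    # size relative to a human-written submission (human = 1.0)
    clean_pass_rate: float  # share that clears review without rework

    def clean_volume(self) -> float:
        # ASSUMPTION: the pass rate applies uniformly to the code in a submission.
        return self.relative_size * self.clean_pass_rate


# Figures quoted in this article from LinearB's 2026 report.
human = SubmissionProfile(relative_size=1.0, clean_pass_rate=0.844)
ai = SubmissionProfile(relative_size=2.5, clean_pass_rate=0.327)

ratio = ai.clean_volume() / human.clean_volume()
print(f"human: {human.clean_volume():.3f}  ai: {ai.clean_volume():.3f}  ratio: {ratio:.2f}")
```

Under that assumption, a 2.5x larger AI-generated submission yields slightly less rework-free code than a human-written one (a ratio of about 0.97), and that is before counting the longer wait in the review queue.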

Producing code at high volume means little when that code brings architectural drift, security gaps, and a growing review backlog. The speed gained at the keyboard is spent again on reverse-engineering work no one trusts.

The errors you find in the P&L

Across retail systems, pricing software is one of the few where mistakes become customer-facing immediately. A bug directly changes the price, and once an incorrect price reaches end customers, orders are placed before the issue is detected. By then, the financial impact is already in the P&L and the loyalty damage is already in the customer base. Neither can be reversed.

That outcome has a human cause. When the volume of generated code outpaces a team’s capacity to absorb it, verification stops. The output looks reasonable, a basic test passes, and the code is approved without anyone building an independent understanding of it.

A Wharton School study by Steven Shaw and Gideon Nave, which named this effect cognitive surrender, found that people lean on AI answers even when those answers are wrong, and grow more confident in them rather than less.

In software, that drift accumulates as comprehension debt. The people who maintain the codebase gradually stop understanding how it behaves.

When the team that maintains a pricing system can no longer explain how it works, the problems do not surface early. They surface at the worst moment, a live failure no one on the team can debug.
The test before every platform update goes live is simple: can someone on the team explain this code without AI?

Building pricing software responsibly

The pull to ship features fast is understandable in a competitive market. However, long-term stability cannot be traded for short-term wins.

Competera has been building pricing software since 2017. From the start, the Competera Pricing Platform has set retail prices in an environment where a single calculation error can cost a retailer seven figures. That bar for predictability and safety predates the current wave of generative AI.
When our engineering team adopted agentic coding tools, it carried the same bar into its own workflow. AI is treated as an assistant, not an autonomous authority. A human engineer validates every decision before any code is generated, and human review and full ownership of the result stay non-negotiable.

The trade-off is deliberate. Comprehension is valued over raw speed, even when that means an engineer works for days before changing a few lines. The downstream operations and maintenance described above do not disappear, and the generated volume is not allowed to mask the underlying architectural questions.

"This approach does not give us a mythical '30x' speedup, but it does give us a safe, predictable, and sustainable velocity boost without compromising the quality of our product," says Eugene Naydenov, Chief Technology Officer at Competera.

Build, buy, or generate?

The classic build-or-buy decision now has a third option: generate it with AI. Some leaders believe that generated software is cheap enough that buying enterprise software is no longer necessary, and they predict a "SaaS apocalypse" in which internal teams spin up custom clones of whatever they need.

This is a mirage. Enterprise SaaS platforms are not expensive because of the initial code it took to write them. They are expensive because of their underlying data models, system integrations, complex compliance frameworks, and the continuous operational burden they manage. More importantly, their true value lies in the highly refined, ergonomic user experiences and industry-standard workflows that have been polished over the years to match the mental models of the pricing managers who use them.

Building an in-house pricing platform means inheriting its long-term maintenance, security, and integration costs, not just the initial build.

The fundamentals of the decision have not changed: buy what is not your core, and build your core responsibly. When building is the right call, it is done with disciplined AI pair-programming, human engineers in control, clean architecture, and comprehension debt paid down the moment it appears.

A pragmatic pace beats a frantic one. Teams that keep ownership of what they build, and buy the rest, end up faster and safer than competitors chasing a 30x building speed that never existed.
