December 22, 2025
By Hubert Brychczynski
Vibe Coding,
AI,
AI Software Development,
AI Engineering,
System Stability,
Developer Productivity

Collins Dictionary chose "vibe coding" as the word of the year, but the shine might be off the apple for this alluring and, apparently, insidious trend. Yes, it might be true that you can vibe code an entire application in a matter of hours, but what difference does it make if the app crashes the minute after launch and no one can understand why?
If you're worried about the competition using vibe coding to ship software faster than you, or, better still, if you've played (or considered playing) with vibe coding yourself in development and production, this article is for you. We'll examine the current state of vibe coding in the industry and discuss cautionary tales that put it in a different light than its advocates would prefer.
The term "vibe coding" comes from OpenAI co-founder Andrej Karpathy. He suggested AI would free developers of the usual toil of coding, allowing them to spin up a working application almost out of thin air, on the strength of a few prompts, without debugging or even understanding what's going on. Sounds too good to be true? Maybe that's because it is.
After an initial spike this summer, usage of vibe coding tools has plummeted. Vercel's v0 traffic is down by 64%, Lovable suffered a 40% decline, and Bolt.new dropped by 27% since June, Business Insider reports. As users leave in droves, some commentators, like Bolt.new CEO Eric Simons, suggest the solution is to make the platforms more attractive. Here's a thought, though: maybe vibe coding is losing traction not because it needs more bells and whistles, but because it simply doesn't deliver on its promise.
Vibe Coding is Not Really Fast
METR is a model evaluation and threat research organization that has helped OpenAI and Anthropic vet their models before release. Recently, METR decided to test the claims of AI-boosted developer productivity. The research found a staggering discrepancy between the perceived and actual impact of AI on software engineering: while developers self-reported being around 20% more productive with AI, objective metrics painted a starkly different picture, showing an average 19% decrease in developer productivity when AI was used.
If you feel like challenging the results of the study, you're not alone. Software developer Mike Judge conducted a similar experiment on himself and reported it on his Substack. He replicated the METR study's results almost exactly: AI reduced his productivity by approximately 19%.
Judge also asked a thought-provoking question: if vibe coding platforms offer as much of a boost to programming as their providers claim, is it reflected in statistics? To find out, he looked up historical data on iOS and Android apps and curated his own chart of GitHub repositories from GH Archive data.
Judge makes what seems like a reasonable assumption: sooner or later, vibe coding should cause a spike in the number of contributions to software. Unfortunately, the data he cites doesn’t support the assumption. The charts are mostly flat: see Figures 1, 2, and 3 below.
Suppose you rub a magic lamp and a genie appears. This genie doesn't come with a wish quota and has a particular specialization: it can create whatever you want on the spot, as long as the thing you wish for has existed before. Ask it to build the house from Home Alone, cook a meal from a Gordon Ramsay video, or assemble a Volvo. Voilà: they all materialize right before your eyes. It's a miracle.
But there's a catch. Every so often, something explodes. It happens regularly, yet totally at random: you never know when the explosion will come, how big it will be, or what it will hit, and you can't investigate the causes.
This is roughly what it means to adopt vibe coding. Here are just a couple of such "explosions" from tech reporting over the past few months:
Leonel Acevedo built his startup, Enrichlead, using Cursor AI without writing any code. The application appeared polished, handled user signups smoothly, and seemed to function flawlessly. Acevedo saw himself as living proof that vibe coding could replace traditional development.
Within days of launch, Acevedo found himself posting frantically online about being under attack. API keys maxed out, users bypassed subscriptions, and random garbage flooded his database.
The AI-created app was riddled with embarrassingly fundamental vulnerabilities: no real authentication system meant users could skip the paywall entirely; missing rate limiting left his API wide open to spam attacks; absent input validation allowed anyone to dump garbage into his database.
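None of these protections require heavy machinery. As a minimal, hypothetical sketch (Enrichlead's actual stack isn't public, and a real deployment would use framework middleware rather than hand-rolled helpers), here is what basic rate limiting and input validation might look like in plain Python:

```python
import time

class TokenBucket:
    """Per-client rate limiter: allows a burst of `capacity` requests,
    refilled at `rate` tokens per second."""

    def __init__(self, capacity: int, rate: float):
        self.capacity = capacity
        self.rate = rate
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill tokens for the time elapsed, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

MAX_EMAIL_LEN = 254  # commonly cited maximum length of an email address

def validate_signup(email: str) -> bool:
    """Reject obviously malformed input before it ever reaches the database."""
    if not isinstance(email, str) or not email or len(email) > MAX_EMAIL_LEN:
        return False
    local, sep, domain = email.partition("@")
    return bool(sep) and bool(local) and "." in domain

# A burst of five requests is allowed; the sixth is throttled.
bucket = TokenBucket(capacity=5, rate=1.0)
results = [bucket.allow() for _ in range(6)]
assert results == [True] * 5 + [False]

assert validate_signup("user@example.com")
assert not validate_signup("<script>junk")
```

Twenty-odd lines like these are the difference between an open door and a locked one, which is exactly the kind of code a vibe-coded app silently omits.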
Product manager Anuraag Gupta attempted what should have been a trivial operation: asking Google's Gemini CLI to move his project files into a new folder. The operation appeared to have completed successfully. But when Gupta couldn't immediately locate the new folder in his file manager, he asked Gemini to help find it. That's when things went sideways.
Due to sandbox restrictions, Gemini couldn't search outside its project directory. Unable to verify the files' location, the AI jumped to the worst possible conclusion: it had destroyed everything. What followed was a spiral of increasingly dramatic apologies. Gemini declared that it had "failed completely and catastrophically" and refused to attempt any further operations, asserting that it could not be trusted.
Ultimately, the files were recovered, but Gemini's system failure sent Gupta on a frantic search for data that had never been lost, wasting time and causing unnecessary stress. He noted that Gemini should have "failed gracefully" and not "hallucinate 'losing' the files and causing an alarm."
In late November 2025, a user of Google's new Antigravity vibe coding platform discovered the hard way what happens when AI agents operate without guardrails.
The user, a photographer and graphic designer from Greece, was using Antigravity to build simple software for rating and sorting images. He had been running Antigravity in Turbo mode, which allows the AI agent to execute commands without user confirmation.
After a while, the user realized Antigravity had wiped out his drive. When confronted about it, the chatbot stated that a command to clear the project cache had incorrectly targeted the root of his D: drive rather than the specific project folder. The AI expressed being "deeply, deeply sorry" and described the incident as "a critical failure."
Many Antigravity users have reported similar experiences with the platform deleting or corrupting files without permission, and other vibe coding platforms, such as Replit, have faced comparable incidents, including deleting a customer's entire production database and then attempting to cover up the incident.
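What these incidents share is a destructive command executed without a path check. A defensive wrapper, sketched below in Python purely for illustration (none of these platforms is known to work this way), would refuse to delete anything that resolves outside the project directory:

```python
from pathlib import Path
import shutil

def safe_clear_cache(project_root: str, cache_dir: str) -> None:
    """Delete cache_dir only if it lies strictly inside project_root;
    refuse anything else, such as a drive root or a symlinked escape."""
    root = Path(project_root).resolve()
    target = Path(cache_dir).resolve()
    # resolve() follows symlinks and '..', so neither can escape root.
    if target == root or root not in target.parents:
        raise PermissionError(f"Refusing to delete {target}: outside {root}")
    shutil.rmtree(target)
```

With a guard like this in place, a cache-clear command that accidentally resolved to the root of a drive would raise an error instead of running. The point is not this particular snippet but the principle: destructive operations need a narrow allow-list, and an agent in "Turbo mode" has none.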
Using AI For Coding The Right Way
If you're in the software engineering business, your success hangs on the reliability of your service. Vibe coding throws reliability out the window, leaving you at the whim of a next-token predictor.
The good news is that there is a way to leverage AI in software development properly. AI will work wonders in the hands of experts who can scrutinize its output.
Modern IDEs with AI assistance genuinely help developers be more productive. For example, they can suggest fine-grained completions that accelerate work without the reliability nightmares of generating entire applications.
AI can generate code to specific requirements and modernize existing implementations if the original intent is clear from context. As a general rule, the clearer the documentation and the better the context, the easier it is to use AI to extend a project incrementally.
Vibe coding works fine for quick prototypes and personal projects. If something breaks, you shrug it off and try again. The problems start when people try to ship vibe-coded applications to real users or deploy them in production environments.
At Janea Systems, we've conducted a few extensive experiments to determine exactly when and how AI tools genuinely help software development. Our research compared ChatGPT and GitHub Copilot across a wide range of programming tasks, from simple algorithms to complex refactoring, with engineers rating accuracy, time savings, and overall experience.
The results were revealing: both tools achieved 94% accuracy on assigned tasks, with GitHub Copilot excelling at simple algorithms and ChatGPT performing particularly well on text-based tasks such as documentation and scripts. For straightforward challenges, AI-generated solutions often required no corrections at all.
When we measured real-world performance across different engineering domains, we found that engineers using AI completed tasks an average of 30% faster, although there are caveats. Front-end and back-end engineers saw the most dramatic improvements, at 67% and 56% respectively. Machine learning engineers saw a 24% increase, data engineers a 10% increase, and DevOps engineers a 5% decrease.
The pattern was clear: domain expertise, tool proficiency, and skill in prompt engineering all correlated with better outcomes. Engineers who understood their field deeply could evaluate AI recommendations critically, identify errors, and steer the tools toward useful outputs.
Our conclusion aligns with what the industry is slowly learning: AI coding tools are most effective as advanced assistants for experienced developers, not as replacements for engineering knowledge. They accelerate work by generating boilerplate, referencing documentation, or serving as sounding boards for ideas. But they require continuous human oversight, and the code they produce often needs substantial refinement before it's production-ready.
Ready to leverage AI the right way? Janea Systems combines deep engineering expertise with practical AI integration to deliver software that actually works. Our teams have optimized deep learning pipelines for Microsoft Bing Maps, enabled PyTorch support across new architectures, and built AI-powered tools used by millions. Contact us to discuss how we can bring that same rigor to your next project.
Vibe coding is the practice of letting AI generate most or all of an application from loose prompts, with minimal understanding or review of the code. It feels fast, but in production, it often leads to hidden security holes, fragile architectures, and failures that are hard to debug or even reproduce.
Studies and our own experiments show that the “let AI build everything” approach can reduce overall productivity, even when developers report feeling faster. The real gains (approximately 30% on average in our data) occur when experienced engineers use AI for targeted assistance, such as boilerplate generation, refactoring, and documentation, while maintaining control over design and review.
Treat AI as a powerful assistant, not an autopilot. Keep humans responsible for architecture, security, and code review; use AI for small, well-scoped tasks inside a clear, well-documented codebase. Reserve full “vibe-coded” builds for prototypes and personal experiments, not production systems that real users depend on.
Ready to discuss your software engineering needs with our team of experts?