Vibe Coding Will Kill Someone

June 10, 2026

In February 2025, Andrej Karpathy, a former director of AI at Tesla and a founding member of OpenAI, posted a description of how he now writes software. He called it vibe coding. The idea: describe what you want in natural language, let the AI generate the code, accept it without reading it too carefully, and move on. "I'm building a project or webapp," he wrote, "but it's not really coding."

For a personal project or a landing page, this is fine. Nobody dies if a side project has a race condition.

The problem is what happens when this mindset leaves the prototype and enters production. When it enters systems that process financial transactions, manage medical records, or control infrastructure. When the person who deployed the code doesn't understand what it does — and nobody reviewed it closely enough to find out.

We've seen this before. The disasters documented on this blog follow a pattern: someone made a decision about a system they didn't fully understand. Vibe coding doesn't create a new failure mode. It industrialises an existing one.


What Vibe Coding Actually Is

Karpathy's description was honest. He wasn't advocating for vibe coding in production systems — he was describing a personal workflow for experimentation. But the term landed in a community already primed to hear it as permission.

In 2025 and 2026, vibe coding became a movement. Developers — many of them new to the profession — began shipping production code they couldn't explain. Not because they were lazy, but because the tooling made it easy, the culture rewarded speed, and nobody told them why that was dangerous.

The code works in the demo. It passes a basic smoke test. It gets deployed.

Three weeks later, something breaks in a way the developer doesn't understand, because they never understood the system in the first place.


What Critical Systems Actually Require

I spent years working on payment infrastructure at SIBS, the organisation behind MBWay — a system that processes millions of transactions across Portugal. Before you touched a single line of code in that environment, you needed to understand the business domain deeply. Not superficially. Not "I read the documentation." You needed to know where you were operating, what the data represented, what a failure at that point in the pipeline would cost — in money, in trust, in regulatory consequences.

The codebase was complex in a way that commanded respect. Method signatures with generics that took real thought to follow. Classes designed with intention, not convenience. Every Optional, every Stream, every carefully placed lambda existed because someone had reasoned about the failure mode it was preventing. It was genuinely beautiful engineering — not in an academic sense, but in the sense that it had survived contact with reality and held.

You didn't deliver code in that environment without unit tests that actually tested something. Not coverage metrics — tests that would catch the edge cases that would surface at 3am on a Saturday when transaction volume spiked. Exception handling wasn't defensive boilerplate. It was the result of someone thinking carefully about every way the operation could fail and deciding, deliberately, what the system should do in each case.

AI does not write code like this. I have never seen it. I don't believe it can, not because of some fundamental limitation of the technology, but because writing code like this requires understanding the domain, the history of the system, the business rules that aren't written down anywhere, and the consequences of being wrong. The AI doesn't have that context. It has tokens.


Code Review Is Shared Responsibility

In critical systems, code review is not a formality. When you approve someone's code, you are taking responsibility for it. You are saying: I understand what this does, I believe it is correct, and I am willing to own the consequences if it isn't.

That changes how you review. You don't skim. You ask questions. You push back. You don't approve code you don't understand, because approving code you don't understand means the next incident is partly yours.

Vibe coding breaks this contract at the source. If the developer who wrote the code doesn't understand it, the reviewer is being asked to take responsibility for something nobody understands. The accountability chain doesn't just weaken — it disappears.

The Knight Capital failure in 2012 is the clearest example of what happens when this chain breaks. New code was rolled out to eight servers, but one of them was missed and kept running the old version, where a repurposed flag reactivated functionality that had been dead for years. According to the SEC's findings, no second person reviewed the deployment. Forty-five minutes later, roughly $440 million was gone. Nobody set out to cause that failure. Everyone was just moving fast and trusting that the system was understood. It wasn't.


The Pattern We Keep Repeating

The Therac-25 delivered massive radiation overdoses to six patients, killing at least three of them. A single developer, no independent review, software that had never been tested at the edge cases that mattered. The assumption that the system worked because it had worked before.

The Ariane 5 destroyed itself about forty seconds after launch. Reused code from the Ariane 4, a different flight profile, a 64-bit floating-point value converted into a 16-bit integer it no longer fit in, and an overflow that nobody caught because the assumption was that the old code was safe.

The Boeing 737 MAX killed 346 people. Engineers who didn't question a system they didn't fully understand. A sensor. A single point of failure. An assumption that the software would handle it.

The pattern is not malice. It's not incompetence in the simple sense. It's the assumption that the system is understood when it isn't. It's the gap between what the code appears to do and what it actually does under conditions nobody thought to test.

Vibe coding doesn't just risk reproducing this pattern. It makes the pattern the default workflow.
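The gap between what code appears to do and what it does under untested conditions fits in a few lines. The sketch below is loosely modelled on the Ariane 5 overflow, where a 64-bit floating-point value was converted to a 16-bit integer. It is not the flight software, which was written in Ada and failed with an unhandled exception rather than a silent wrap. The class name, method names, and values are all hypothetical, chosen only to show how a narrowing conversion can pass every test drawn from the old operating range and still be wrong outside it.

```java
// Hypothetical illustration only; nothing here comes from real flight software.
final class NarrowingDemo {

    // Unchecked narrowing: compiles cleanly and is correct for every value
    // inside the 16-bit range, which is all the "old system" ever produced.
    static short toInt16Unchecked(double value) {
        return (short) value; // out-of-range values are silently truncated
    }

    // Checked narrowing: the out-of-range case becomes an explicit decision
    // instead of an accident. Here the decision is to fail loudly.
    static short toInt16Checked(double value) {
        if (Double.isNaN(value)
                || value < Short.MIN_VALUE
                || value > Short.MAX_VALUE) {
            throw new ArithmeticException("value out of 16-bit range: " + value);
        }
        return (short) value;
    }

    public static void main(String[] args) {
        // Inside the old operating range both versions agree.
        System.out.println(toInt16Unchecked(1200.0));  // 1200
        System.out.println(toInt16Checked(1200.0));    // 1200

        // Outside it, the unchecked version returns a plausible-looking number.
        System.out.println(toInt16Unchecked(40000.0)); // -25536

        try {
            toInt16Checked(40000.0);
        } catch (ArithmeticException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Both methods pass a test suite built from the old operating range. Only a reviewer who asks what range the new system can produce will see the difference, and that question cannot be answered from the code alone.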


This Is Not Anti-AI

Using AI tools to write software is not the problem. I use them. Every serious developer uses them. The question is not whether you use AI — it's whether you understand what the AI produced.

There is a difference between using AI to accelerate work you understand and using AI to replace understanding you don't have. The first is a productivity tool. The second is a liability.

The dangerous version of vibe coding is not Karpathy experimenting on a weekend project. It's a developer who has never learned to reason about concurrency using AI to write multithreaded payment processing code, accepting the output because it compiled, and deploying it because nobody stopped them.

In a system that processes 1,400 financial events per second — the kind of system I've worked on — a race condition doesn't produce an error message. It produces silent data corruption. It produces balances that don't match. It produces regulatory incidents. It produces the kind of failure that takes weeks to diagnose and costs more than the entire team's annual salary to remediate.
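A minimal sketch of that failure mode, assuming a hypothetical account with a balance held in cents. None of these class or method names come from a real payment system. The unsafe version compiles, passes any single-threaded test, and never throws. It loses updates under load because `balanceCents += cents` is a read, an add, and a write, and two threads can interleave between those steps.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongConsumer;

// Hypothetical illustration of a lost-update race; not production code.
final class LostUpdateDemo {

    // Looks correct and works in a single-threaded demo.
    static final class UnsafeAccount {
        private long balanceCents;
        void credit(long cents) { balanceCents += cents; } // read-modify-write, not atomic
        long balance() { return balanceCents; }
    }

    // Same behaviour, but each credit is applied as one atomic operation.
    static final class SafeAccount {
        private final AtomicLong balanceCents = new AtomicLong();
        void credit(long cents) { balanceCents.addAndGet(cents); }
        long balance() { return balanceCents.get(); }
    }

    // Applies `creditsPerThread` one-cent credits from each of `threads` threads
    // and waits for all of them to finish.
    static void runCredits(int threads, int creditsPerThread, LongConsumer credit) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < creditsPerThread; i++) {
                    credit.accept(1);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    public static void main(String[] args) {
        UnsafeAccount unsafe = new UnsafeAccount();
        SafeAccount safe = new SafeAccount();

        runCredits(8, 100_000, unsafe::credit);
        runCredits(8, 100_000, safe::credit);

        System.out.println("expected: " + 8 * 100_000);
        // Usually lower than expected and different on every run, with no exception.
        System.out.println("unsafe:   " + unsafe.balance());
        System.out.println("safe:     " + safe.balance());
    }
}
```

The unsafe total varies from run to run and can occasionally come out right, which is exactly why a smoke test does not catch it. Telling the two classes apart requires knowing that the method will be called concurrently. That knowledge is about the system, not about the syntax.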

The AI that wrote the code won't be in the post-mortem. The developer who deployed it will be. The engineering manager who approved it will be. And when someone asks who understood what that code was doing before it went to production, nobody will have a good answer.


Who Carries the Risk

Software runs in hospitals. It runs in aircraft. It runs in the financial infrastructure that millions of people depend on without knowing it exists. It runs in systems that keep people alive.

The people whose lives depend on those systems are not in the room when the code is written. They don't know whether the developer who built the system understood it. They don't know whether the reviewer read it carefully. They are trusting, without knowing they're trusting, that someone in that chain took the responsibility seriously.

Vibe coding is a choice to opt out of that responsibility. To build something you don't understand and deploy it into a world where other people bear the consequences of your choices.

For a weekend project, that's a personal decision. For production systems that affect real people, it's a different kind of choice entirely.

The next major software failure caused by AI-generated code that nobody understood is not a hypothetical. It is already being written. Somewhere, right now, a developer is accepting output they didn't read, from a model that doesn't know the domain, in a system where the failure mode is measured in lives or money or both.

The only question is whether anyone in the review chain will stop it.

FAQ

What is vibe coding?

Vibe coding is a term coined by Andrej Karpathy in 2025 to describe AI-assisted development where the developer describes what they want in natural language, accepts the generated code without fully reading or understanding it, and moves on. It works for low-stakes projects. It becomes dangerous in production systems.

Is vibe coding always dangerous?

No. For personal projects, prototypes, and low-stakes tools, vibe coding is a legitimate productivity approach. The danger emerges when this mindset is applied to systems where failures have real consequences: financial infrastructure, healthcare, aviation, or any system where other people bear the cost of the code being wrong.

Why can't AI write robust code for critical systems?

Writing robust code for critical systems requires deep understanding of the business domain, the history of the system, edge cases that aren't documented, and the consequences of specific failure modes. AI models don't have this context. They generate statistically plausible code based on training data, which includes a significant amount of low-quality public code.

How should developers use AI tools responsibly?

Use AI tools to accelerate work you already understand, not to replace understanding you don't have. Review every line of AI-generated code as if you wrote it yourself, because in production, you own it. In critical systems, never approve code you don't understand.

Has AI-generated code already caused production incidents?

Yes. Multiple organisations have reported production incidents attributed to AI-generated code that introduced security vulnerabilities, race conditions, and logic errors that weren't caught in review. The full scale of the problem is not yet documented, partly because post-mortems rarely identify "AI-generated code" as the root cause. They identify the specific failure, not its origin.
