Vibe Coding Is Dead — And Your Startup Probably Shipped a Security Disaster

Andrej Karpathy — OpenAI co-founder and former Tesla AI lead — coined "vibe coding" in February 2025, describing a style of development where you "fully give into the vibes," accepting AI-generated code without necessarily understanding every line. Collins Dictionary named it Word of the Year for 2025. The tools exploded: Cursor, Replit, Bolt, Lovable, and Claude Code attracted billions in venture funding. GitHub reported that 46% of all new code committed today is AI-generated. In Y Combinator's Winter 2025 batch, 25% of startups had codebases that were 95% or more AI-generated. The vibe was immaculate.

Fourteen months later, the hangover has arrived. And Karpathy himself declared vibe coding obsolete — not because the tools don't work, but because the industry moved to something better and harder: agentic engineering, where developers orchestrate AI agents rather than blindly accept their output. The data explains why the shift was necessary.

Key Takeaway

Vibe coding — describing what you want and shipping whatever AI generates — is producing catastrophic security and reliability problems. The numbers: 40-62% of AI-generated code contains security flaws. Cross-site scripting protection fails 86% of the time. 35 new CVEs in March 2026 alone were directly caused by AI-generated code. Amazon had 4 critical service disruptions in one week from AI-coded deployments. The speed gains are real. The cost is security, maintainability, and technical debt that compounds invisibly until production explodes.

The Security Numbers Nobody Wants to Talk About

The data on AI-generated code security is unambiguous and alarming. Security firm Tenzai built 15 identical applications using five popular vibe coding tools — Claude Code, OpenAI Codex, Cursor, Replit, and Devin. The result: 69 vulnerabilities across those applications. Six were critical — meaning they could be exploited to gain unauthorized access, steal data, or take control of systems. This wasn't testing obscure edge cases; these were standard web applications built with standard prompts.

Broader studies confirm the pattern. Between 40% and 62% of AI-generated code contains security flaws, depending on the study and the tool. AI fails to protect against cross-site scripting (XSS) 86% of the time — one of the most basic and well-understood web vulnerabilities. AI-authored pull requests show 2.74 times higher security vulnerability rates than human-written code. In March 2026 alone, 35 new CVEs (Common Vulnerabilities and Exposures) were directly attributed to AI-generated code — up from 6 in January. The trend line is accelerating as more AI-generated code reaches production.
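To make the XSS figure concrete, here is a minimal sketch of the output escaping that generated code tends to leave out. The function names `render_comment_unsafe` and `render_comment_safe` are hypothetical and exist only for illustration. The sketch uses Python's standard `html.escape`; a real application would more likely rely on a template engine with auto-escaping turned on.

```python
import html


def render_comment_unsafe(comment: str) -> str:
    # Vulnerable pattern: user input is interpolated into HTML as-is,
    # so any markup inside the comment is interpreted by the browser.
    return f"<p>{comment}</p>"


def render_comment_safe(comment: str) -> str:
    # html.escape converts &, <, > and (by default) quotes into entities,
    # so the browser displays the text instead of executing it.
    return f"<p>{html.escape(comment)}</p>"
```

Passing a string such as `<script>alert(1)</script>` through both functions shows the difference: the unsafe version returns the tag intact, while the safe version returns it as inert text.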

The Amazon incident crystallized the problem for enterprise audiences. According to the Financial Times, an Amazon service outage in December was caused by an AI coding bot. The company subsequently experienced four critical incidents in a single week. An internal Amazon memo acknowledged that safeguards "aren't yet fully established" — a remarkable admission for one of the world's most sophisticated engineering organizations. Amazon now requires senior engineers to sign off on any AI-assisted code changes made by junior and mid-level engineers. The company that pioneered cloud computing at scale was forced to add human gatekeepers specifically because AI code couldn't be trusted.

Code quality metrics tell the same story from a different angle. Code churn — the rate at which code is written, committed, and then rewritten — is up 41%. Code duplication has increased fourfold. The careful refactoring that keeps codebases healthy over time has collapsed from 25% of changed lines in 2021 to under 10% by 2024. A January 2026 academic paper argued that vibe coding is "quietly killing open source" by reducing developer engagement with the maintainers who keep critical infrastructure running. When developers stop reading code because the AI generates it, they also stop contributing to the community projects that their code depends on.

Why Speed Without Understanding Creates Time Bombs

The fundamental problem with vibe coding isn't that AI generates bad code — it's that developers ship code they don't understand. When a human writes a vulnerability, the human understands the surrounding code well enough to find and fix the problem during debugging. When AI generates a vulnerability, the developer who prompted it often can't identify the problem because they never understood the code's logic in the first place. The bug becomes a black box inside a black box.

This creates compound technical debt. Each piece of AI-generated code that the developer doesn't fully understand adds another opaque layer to the system. When these layers interact — and they always do, eventually — the resulting bugs are extraordinarily difficult to diagnose because nobody on the team has a mental model of how the system actually works. They only know what they told the AI they wanted. The gap between intent and implementation grows silently until production fails in ways no one can explain.

The credit burn problem makes this worse. One analysis from app builder communities found that Lovable users burned 400 credits on bug-fixing alone. In other words, they spent significant resources fixing code the AI had generated incorrectly, used the same AI to attempt the fixes, and introduced new problems in the process. This cycle (generate, discover a bug, prompt the AI to fix it, introduce a new bug, repeat) is the dark side of AI-assisted development. Each round burns credits or compute time, and the codebase accumulates patches on top of patches that no human has reviewed holistically.

What Replaced Vibe Coding (And What Actually Works)

The industry bifurcated in early 2026 along a predictable line: experienced developers using AI tools saw genuine productivity gains of 10-30%, while inexperienced developers using the same tools produced more output with worse quality. The difference isn't the tool — it's whether the human understands what the AI generates.

Experienced engineers use AI coding tools as accelerators for well-understood patterns: CRUD operations, API integrations, data formatting, utility functions, boilerplate. They review the output, understand its implications, and catch security issues before committing. The AI saves time on implementation; the human provides judgment, architecture, and quality assurance. This is what Karpathy now calls "agentic engineering" — orchestrating AI agents rather than accepting their output uncritically. The 10-30% productivity improvement for properly governed AI coding is real and sustainable.

Non-developers who tried to build production software through pure prompting — the original vibe coding promise — hit maintenance walls within weeks. Reddit data from builder communities shows a "reverse migration" pattern: users who left no-code platforms for AI coding tools returned to visual builders after experiencing the maintenance burden of AI-generated code. The platforms that combine AI assistance with structured visual building are emerging as the pragmatic middle ground for non-developers.

For developers, the practical takeaway is clear: AI coding tools are transformative when paired with engineering judgment and disastrous when used as a substitute for it. The one AI skill that matters most applies here as much as anywhere: evaluating AI output and judging whether it is correct, secure, and appropriate for production. The free Prompt Optimizer helps write more specific coding prompts that produce better first-attempt output, reducing the iteration cycles that compound quality problems. For one-click optimization inside ChatGPT, Claude, and Gemini, TresPrompt brings it directly to your workflow.

Frequently Asked Questions

Is vibe coding always bad?

No, but it is risky for production systems. For prototyping, exploring ideas, and learning, describing what you want and seeing AI generate it is genuinely useful. The problem arises when prototype code ships to production without review, security testing, or any human understanding of its logic. Vibe coding as exploration is fine. Vibe coding as engineering is dangerous.

Is Claude Code part of the vibe coding problem?

Claude Code, like any AI coding tool, can be used responsibly or irresponsibly. What distinguishes Claude Code from pure vibe coding tools is its agentic workflow — it runs tests, analyzes errors, and iterates on solutions rather than just generating code once. But even Claude Code output should be reviewed by a developer who understands the codebase. The tool assists engineering; it doesn't replace it.

Should I stop using AI coding tools?

Absolutely not — the productivity gains are real for experienced developers. The correct response is governance, not abstinence. Review AI-generated code before committing. Run security scans on AI output. Understand the logic of what the AI generates, especially for authentication, authorization, and data handling. Use AI for the 80% of code that follows standard patterns, and write the critical 20% yourself.
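One way to turn "governance, not abstinence" into something enforceable is a small check that flags changed files in sensitive areas before they are committed. The sketch below is an assumption about how such a gate might look, not an established tool: the `SENSITIVE_KEYWORDS` list and the `needs_human_review` function are hypothetical, and the substring matching is deliberately crude.

```python
from pathlib import PurePosixPath

# Hypothetical keywords marking security-sensitive areas of a repository.
SENSITIVE_KEYWORDS = ("auth", "login", "permission", "payment", "user_data")


def needs_human_review(changed_files: list[str]) -> list[str]:
    """Return the changed files whose path suggests sensitive logic."""
    flagged = []
    for name in changed_files:
        # Compare each path component in lowercase against the keyword list.
        parts = [part.lower() for part in PurePosixPath(name).parts]
        if any(keyword in part for part in parts for keyword in SENSITIVE_KEYWORDS):
            flagged.append(name)
    return flagged
```

Because it matches substrings, a file like `author.py` would also be flagged. For a gate like this, false positives are usually the cheaper mistake.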

How do I make AI-generated code more secure?

Three practices: (1) Include security requirements in your prompts — "ensure input validation on all user-facing fields, use parameterized queries for database access, implement CSRF protection." Specific security instructions produce more secure code. (2) Run automated security scanners (Snyk, SonarQube, Semgrep) on all AI-generated code before committing. (3) Require human code review for any AI-generated code that touches authentication, authorization, payment processing, or personal data handling.
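The "parameterized queries" instruction from the first practice is easy to illustrate. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the `users` table and the function names are hypothetical and chosen only for the example.

```python
import sqlite3


def find_user_unsafe(conn: sqlite3.Connection, username: str) -> list[tuple]:
    # Vulnerable pattern: user input is concatenated into the SQL text,
    # so crafted input can change the meaning of the query.
    query = f"SELECT id, username FROM users WHERE username = '{username}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, username: str) -> list[tuple]:
    # Parameterized query: the value is passed separately from the SQL text,
    # so the database always treats it as data, never as SQL.
    query = "SELECT id, username FROM users WHERE username = ?"
    return conn.execute(query, (username,)).fetchall()


def make_demo_db() -> sqlite3.Connection:
    # Small in-memory database with two rows, for demonstration only.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
    conn.executemany("INSERT INTO users (username) VALUES (?)", [("alice",), ("bob",)])
    conn.commit()
    return conn
```

With the input `nobody' OR '1'='1`, the unsafe version returns every row in the table, while the safe version returns nothing because no user has that literal name.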

What's the difference between vibe coding and agentic engineering?

Vibe coding: describe what you want → accept whatever the AI generates → ship it. Agentic engineering: define the task → AI generates a solution → AI runs tests → AI identifies failures → AI iterates → human reviews the result → human approves or redirects. The difference is the feedback loop and human oversight. Agentic engineering uses AI as a collaborator; vibe coding uses AI as a replacement.
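The agentic loop described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not how any particular tool works: `generate`, `run_tests`, and `human_review` are hypothetical callables standing in for the AI agent, the test runner, and the reviewer, and `max_attempts` is an assumed cap on iterations.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LoopResult:
    code: str
    attempts: int
    tests_passed: bool
    approved: bool


def agentic_loop(
    task: str,
    generate: Callable[[str, str], str],   # hypothetical: (task, feedback) -> code
    run_tests: Callable[[str], str],       # hypothetical: code -> failure text, "" if passing
    human_review: Callable[[str], bool],   # hypothetical: code -> True if approved
    max_attempts: int = 3,
) -> LoopResult:
    feedback = ""
    code = ""
    for attempt in range(1, max_attempts + 1):
        # The agent generates (or revises) a solution using the last failures.
        code = generate(task, feedback)
        feedback = run_tests(code)
        if not feedback:
            # Tests pass: a human still decides whether the result ships.
            return LoopResult(code, attempt, True, human_review(code))
    # Budget exhausted without passing tests: nothing is approved.
    return LoopResult(code, max_attempts, False, False)
```

The cap on attempts is a design choice: it keeps the generate-and-fix cycle from running indefinitely and forces a human decision when the agent is stuck.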

Disclosure: Some links in this article are affiliate links. We only recommend tools we've personally tested and use regularly. See our full disclosure policy.