Vibe Coding, 3 months later

Pausing

After 220+ commits and around 32k lines of generated code, I paused my vibe coding experiment. Around the 25k LOC mark, making any modification became increasingly difficult. I tried refactoring the code several times to make it easier to modify, but with no luck; some sessions turned into abysmal regressions and others, after many hours, produced only very small improvements.

Domain

The project is a compiler for a programming language, so the domain is well known. It is also easy to test directly: I can run the generated compiler without having to launch a browser. Regressions, for instance, were easy to spot: a source file that used to compile no longer compiles, etc.
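A minimal sketch of that kind of regression check, assuming the compiler can be invoked as a command that exits with status 0 on success. The command `./mycompiler`, the `examples` folder, and the `.src` extension are hypothetical placeholders, not the project's real names.

```python
import subprocess
from pathlib import Path
from typing import Callable, Iterable, List


def run_compiler(compiler_cmd: List[str], source: Path) -> bool:
    """Return True if the compiler command exits with status 0 for this source.

    Assumption: the compiler signals failure through a non-zero exit code.
    """
    result = subprocess.run(
        compiler_cmd + [str(source)],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


def find_regressions(
    known_good: Iterable[Path],
    compiles: Callable[[Path], bool],
) -> List[Path]:
    """Return the sources that used to compile but no longer do."""
    return [src for src in known_good if not compiles(src)]


# Hypothetical usage: every file under examples/ compiled at some point.
# regressions = find_regressions(
#     sorted(Path("examples").glob("*.src")),
#     lambda src: run_compiler(["./mycompiler"], src),
# )
```

Passing the compile check in as a function keeps the bookkeeping separate from how the compiler is actually launched, so the same list of known-good files can be rerun after every session.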

Problems

In general, it went very well until the project reached a size where the tool (Cursor) could no longer keep everything in memory and/or, after several hours of running, started showing problems, even when the content was summarized.

While the domain is well known, the programming language itself is rather innovative (classes, functions, and threads are all represented by a block of code), so it is unlike most programming languages. This context had to be explained to the tool every time by referencing the docs.

The other aspect that made me stop is that, at some point, my own lack of knowledge of compiler theory kept me from specifying smaller tasks. Everything was thrown into the code generation file, and it was getting harder and harder for me to help the tool debug.

This is/was a side project, so I worked on it after work; late at night, I had very little energy to debug or understand what the tool was generating (in addition to my lack of knowledge). The sessions were not short (4–6 hrs). So, at least as of Q4 2025, the tool is not quite there yet.

Patterns

Regardless, I think using Software Engineering principles (unit testing, an iterative approach, understanding the domain, etc.) gives us experienced developers a huge advantage over non-engineers / junior developers. I think the tool has to mature to make better use of static analysis (e.g., index the source code to know where everything is) and let the “AI” higher-level understanding do the actual modification. More often than not, the tool spent a lot of time trying to figure out where some functionality had been added by using grep or other approaches; sometimes it got it right, sometimes it missed something. This is where working on a single codebase for a long while makes a human better: we get familiar with where things are and we can refactor/test more intuitively. I think this is also where we introduce bugs, to be honest. The tool, however, “forgets” within a couple of sessions; it was kind of weird to have a very successful session one day, only to have to walk it through the whole concept again the next day (or even later that night).
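To make the "index the source code" idea concrete, here is a rough sketch of a symbol index that records where each function and class is defined, so a lookup replaces a grep. It uses Python's standard `ast` module on Python sources purely as a stand-in; the compiler's actual implementation language and how Cursor works internally are not assumed here, and `build_symbol_index` is a hypothetical helper name.

```python
import ast
from collections import defaultdict
from typing import Dict, List, Tuple

Location = Tuple[str, int]  # (file name, 1-based line number)


def build_symbol_index(sources: Dict[str, str]) -> Dict[str, List[Location]]:
    """Map each function/class name to the places where it is defined.

    `sources` maps a file name to that file's text (Python, as a stand-in).
    """
    index = defaultdict(list)
    for filename, text in sources.items():
        tree = ast.parse(text, filename=filename)
        for node in ast.walk(tree):
            if isinstance(
                node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
            ):
                index[node.name].append((filename, node.lineno))
    return dict(index)


# Illustrative file contents, not taken from the real project.
example_sources = {
    "codegen.py": "class Emitter:\n    def emit(self):\n        pass\n",
    "parser.py": "def parse(tokens):\n    return tokens\n",
}
example_index = build_symbol_index(example_sources)
# example_index["emit"] is [("codegen.py", 2)]
```

An index like this can be rebuilt cheaply after every change, which is the kind of "where is everything" knowledge a human picks up by living in a codebase.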

TL;DR

Vibe coding works for small(ish) projects; I suspect up to ~10k LOC.
The developer still needs to fully understand what is going on to help the tool with better suggestions.
The tools need to get better at static code analysis so they don’t have to spend cycles trying to remember where things are.
It is frustrating to spend 5 hrs vibe coding one day just to start from (almost) scratch the next day: “Look at the @file_summary_you_made_yesterday.md to understand more of the background for this project…”