Vibe Coding in Practice

"Just learn to prompt"

I was very annoyed by the recommendation to "learn to prompt" in order to use AI. That is not how it was supposed to work! As a Software Engineer, I've been learning how to translate user requirements into working solutions that offer consistent results; that is, how to give a machine instructions (through a programming language, systems, platforms, etc.) that work every time. And that is already a struggle, because the environment and the user requirements are always changing.

Now, with the recommendation of "learning to prompt," things are slightly worse, because it's like programming in a natural language, except the computer returns different results each time! The appeal of programming in natural language has always been there, and by learning how to prompt it seems we are getting closer, except that now we have to "code" in an ambiguous language that returns ambiguous results.

Let's add to that a large codebase, or a complex task, and the computer can get into infinite loops. Also, because the AI always tries to complete the task, it sometimes claims the task has been completed, with a lot of celebratory emojis, only for you to find out that everything is hardcoded, empty, or just plain wrong. The computer has hallucinated.

I still haven't seen a case where, in order to complete the task, the AI deletes a test that might be failing, but I'm always afraid it will.

Giving it a fair try

Last month, I attended the AI4DEVS conference expecting to learn how this problem is solved, or to see if I was doing things wrong. There were many takeaways, but the one that struck me the most was Anton Arhipov's presentation "Junie Deep Dive", where he showed how to do this prompt engineering thing in practice. I took notes, put my skepticism in the drawer, and decided to give that approach a fair chance.

Having a clear expectation of the capabilities of AI tools is important. These are not actual AI systems in the strict sense; they don't have intelligence. They are LLMs, Large Language Models; that is, they are a representation of language. Understanding that they are just a tool is crucial because, just like with any other new tool, the human is still in control of the process and is responsible for the outcome. We are still learning to use them, they are changing all the time, and they will be different from the tools we are used to.

Prompt Driven Development aka Vibe Coding

  • Create a plan

The outcome of the first prompt is an `xyz_plan.md` file with the LLM's understanding of the task. Many tools are integrating this into their solutions already, but having it as a separate file has the benefit that you can read it, adjust it, save it, version it, etc. This is important because at some point the LLM will run out of context, and having the plan saved allows you to go back and continue.

  • Create a task list

Once the plan has been agreed on, and questions and assumptions have been clarified, the next prompt is to create a task list whose items can be marked as complete as things progress. Specifying the format is useful, something like:

"Create a tasks list `xyz_task.md` with numbered check boxes, e.g., `1. [ ] fix foo`"

Again, review the list, clarify any outstanding questions, and approve the file.

  • Ask it to execute the tasks one at a time

This is when you let the tool do the work; having the previous two files gives it more context about what is needed. At this point, it is useful to provide more information about the specific feature being implemented. Insist that the tool ask you questions if things are not clear.

  • Vibe!

Keep an eye on the tool, though. Detect when it's getting into loops and ask, "Do you have any questions?" Also, if you have an allow list, keep anything destructive off it, like delete commands or access to git. It is better to be there, ready to approve (or deny) a request, rather than let it delete your computer by "mistake." Other tools are fine, and you can add them as you go.
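
As a hypothetical illustration (this is not part of any particular tool, and the prefix set and function below are made up for the example), the idea is simply that history-touching or destructive commands always come back to you for approval:

```python
# Hypothetical illustration: which proposed shell commands should come back
# to you for approval instead of sitting on the allow list.
DENY_PREFIXES = {"rm", "mv", "dd", "git", "chmod", "chown"}

def needs_manual_approval(command: str) -> bool:
    """Return True if the command looks destructive and should be reviewed."""
    words = command.strip().split()
    return bool(words) and words[0] in DENY_PREFIXES

# `rm -rf build/` gets flagged; `ls -la` can stay on the allow list.
assert needs_manual_approval("rm -rf build/")
assert not needs_manual_approval("ls -la")
```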

  • Create a living `guidelines.md` file and keep it small but complete

As the codebase progresses and you get a feel for how this LLM works on this particular project, you might want to create a `guidelines.md` file to explain how to work with the codebase. This is similar to the `CONTRIBUTING.md` file many projects have, but it's kept focused on the development cycle with LLMs.

Of course, you can use an LLM to create this file, and you have to revisit it every time a new practice emerges. Also, LLMs love to be verbose, so you might need to ask it to trim the file; after all, you don't want to spend all your context space there.

Include references to other important documents, like the high-level design, and clarify what the definition of done is. For example: the acceptance criteria are met, all the tests pass, and new tests are written for new features. If there are regressions, call them out. If there are more than N regressions, ask the user (you) to decide what to do instead of trying to fix them and going back and forth.

  • Create additional tools that help you to identify problems or regressions

It is worth spending some time creating custom tools that can be run as part of the cycle, for instance, integration tests. This is slightly different from the code created as part of the normal cycle (of course, you can create these tools with an LLM too). The main goal of these auxiliary tools is to detect regressions, and their source lives outside of the regular codebase (perhaps just in a separate folder). The LLM can create and update normal tests, but this tool is not part of its workflow.

In the specific test project I'm working on, this tool runs the generated binary over a set of files and moves them from `test/passing` to `test/regressed` if they fail. That way I can tell, outside of the prompt session, what happened and why.
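
A minimal sketch of what such a helper could look like; the binary name `./my_tool`, and the assumption that it takes one file per run and signals failure with a non-zero exit code, are mine, since the original tool's interface isn't described here:

```python
#!/usr/bin/env python3
"""Sketch of a regression-detection helper, assuming the binary is invoked
as `./my_tool <file>` and a non-zero exit code means failure."""
import shutil
import subprocess
from pathlib import Path

PASSING = Path("test/passing")
REGRESSED = Path("test/regressed")
BINARY = "./my_tool"  # hypothetical name; use your own binary here

def run_regression_check() -> int:
    """Run the binary over every previously passing file; move failures aside."""
    REGRESSED.mkdir(parents=True, exist_ok=True)
    regressions = 0
    for sample in sorted(p for p in PASSING.iterdir() if p.is_file()):
        result = subprocess.run([BINARY, str(sample)], capture_output=True)
        if result.returncode != 0:
            # This file used to pass, so this is a regression; park it where
            # it can be inspected after the prompt session.
            shutil.move(str(sample), str(REGRESSED / sample.name))
            regressions += 1
    return regressions

if __name__ == "__main__":
    count = run_regression_check()
    print(f"{count} regression(s) moved to {REGRESSED}")
```

Keeping this script in its own folder, outside the LLM's normal edit loop, means the model can't quietly "fix" a regression by changing the checker itself.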

  • Iterate and ask questions, request the tool to ask you questions

Just like with any piece of software, there are things that are discovered during the implementation. There are things that couldn't have been known from static analysis and only emerged when the test suite was run, and on many occasions the prompt or the requirement is contradictory or ambiguous (just like in any other kind of software development). Ask the tool why it is making decision x or y when you see it going in the wrong direction. Ask it to ask you questions instead of trying to solve things by itself. Feel free to create subtasks and a sub-plan if the task seems too big to handle or has many phases.

  • Remove technical debt

LLMs like to be verbose. Often they create files that are never used, and tests get duplicated between sessions. An LLM prefers to add yet another `if` case to an already long chain of `if/else` statements. By having a few sessions dedicated to removing this duplication, you make the analysis for new tasks easier. There are fewer things to understand, and the code becomes easier to maintain.

  • Review and understand the code before committing

Even though this is not strictly vibe coding, if you use this approach in a project with other developers, you're still responsible for the code that is being submitted. This might sound cumbersome, but it pushes you to make smaller, incremental changes that can still be understood. In my test project, I haven't had the need to follow this strictly (it is a pet project after all), but it will definitely be crucial in a multi-developer codebase.

It is still Software Engineering after all

It turns out many of the practices we already follow to develop software still apply:

  • Create a high-level requirement or design and a high-level plan

  • Refine it into stories, tasks, and subtasks

  • Add tests and make them pass

  • Clarify questions

  • Iterate and adjust the plan, tasks, or scope

  • Reprioritize task execution order

  • Remove technical debt

You are still writing the software. Just as before, when instead of punching cards you started using a compiler to create the machine instructions for you, now, instead of writing the high-level code yourself, you're using natural language to have the tool write it for you.

Summary

LLMs are new tools; they don't have intelligence, and we are still in control and still responsible for the outcome.

We are still learning how to use these tools, but following and adapting these practices gives better results. All of this might change in a few months when new things are released; many tools are already building these guardrails into the tool itself.

The tool has a limited "memory," and it will stop understanding after a few hours of use. That is when starting a new session from the saved plan, task list, and progress saves you a lot of time.

Remember: keep positive language, use `.md` files iteratively, review them, adapt them, refine questions, keep an eye on the output, and don't allow destructive actions like delete, move, or commit.

I've been using this approach to implement a very large and complex project that I've been wanting to complete for years. It is still a work in progress, but the results so far have been great! I noticed, though, that as the code grows, new features become more time-consuming to add, but this process keeps showing good results.
