On Spec-Driven Development

and Why I think it's the way forward

Jul 23, 2025

How we get stuff done

A long time ago, I stumbled upon an article about the nature of code assistants. It argued that chat interfaces are not the best way to write code, because that is not how we usually work. (link to the article).

Normally, we sit down and discuss what we need to implement. We know what the goal is and what the client wants from us. We've already gathered the requirements and identified the user stories.

After this, we identify the tasks that need to be done. We identify the text that we want to use. We record why certain decisions were made over others in decision records. We describe what we want and try to implement it. We also consider how different parts of the code have to align and integrate with each other.

We have sprint reviews. We have stand-ups. We have technical discussions. We have code reviews. Things are done iteratively, step by step.

If you're following a certain project management framework, you have had similar experiences: similar rituals, things that need to be done to make progress, and things that need to be done so that you can trace back why a certain thing exists in the code and the product.

The problem with Vibe-Coding

Vibe-coding, although very interesting, doesn't really help with structure.

It's messy.
One thing can result in many things changing without a reason.
The overarching story of the project cannot be seen.
There's no narrative.
The AI doesn't know what's going to happen.
There's no bigger picture. There's no context.
The flow isn’t clear.
The sequence of the chat matters.

I remember that one of the older libraries for prompt engineering required you to write a whole markdown file with proper context, which you could then use for whatever you needed. The library was called Fabric and, alongside DSPy and Ell, it was one of the first prompt programming languages. These and other libraries were attempts to create order from chaos. They tried to make sure that you followed a structure in order to get reproducible results from LLMs.

Vibe-coding suffers from the same problem as raw prompting: you need to provide good context in order to get the results you want.
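To make the contrast with raw prompting concrete, here is a minimal sketch of assembling a prompt from named context sections. The function name `build_prompt`, the section headings, and the example project are all hypothetical and only serve as illustration; this is not how Fabric, DSPy, or Ell work internally.

```python
def build_prompt(context: dict[str, str], task: str) -> str:
    """Render named context sections as markdown, then append the task."""
    parts = []
    for heading, body in context.items():
        parts.append(f"# {heading}\n{body.strip()}")
    parts.append(f"# Task\n{task.strip()}")
    return "\n\n".join(parts)


# Raw prompting: the model only sees the request itself.
raw_prompt = "Add a login page."

# Structured prompting: the same request, preceded by explicit context.
# The project description and constraints are made up for this example.
structured_prompt = build_prompt(
    {
        "Project": "Internal dashboard for support staff.",
        "Constraints": "Use the existing session middleware; no new dependencies.",
    },
    "Add a login page.",
)
print(structured_prompt)
```

The request is identical in both cases; only the surrounding context differs, and that context is what the model needs to make consistent decisions.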

The Specification Documents

In order to solve these problems with vibe-coding, different frameworks have introduced various methods. In most cases, different types of contexts are provided to control the behavior of the LLM and steer the code generation in the desired direction.

For example, Claude Code has slash (/) commands and the CLAUDE.md file. You can define different agents, personas, and tasks for Claude Code to use, and a general description of the project lives inside the markdown file. Each additional piece of information is also written as markdown. For example, you can have a Project/Product Manager persona extract User Stories. Then you can ask the developer personas to define Tasks from each User Story. OpenCode does the same thing.
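The link between User Stories and Tasks is what makes it possible to trace back why something exists. A minimal sketch of that relationship, assuming hypothetical class names, IDs, and story text (none of this is Claude Code's or OpenCode's actual data model):

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    id: str
    description: str
    story_id: str  # link back to the user story that motivated this task


@dataclass
class UserStory:
    id: str
    text: str
    tasks: list[Task] = field(default_factory=list)

    def add_task(self, task_id: str, description: str) -> Task:
        """Create a task that remembers which story it came from."""
        task = Task(id=task_id, description=description, story_id=self.id)
        self.tasks.append(task)
        return task


def trace(task: Task, stories: dict[str, UserStory]) -> str:
    """Explain why a task exists by pointing at its user story."""
    story = stories[task.story_id]
    return f"{task.id} exists because of {story.id}: {story.text}"


# Hypothetical story and task, for illustration only.
story = UserStory("US-1", "As a support agent, I want to log in so that I can see my tickets.")
task = story.add_task("T-1", "Add a login form")
stories = {story.id: story}
print(trace(task, stories))
```

In a spec-driven setup the same links would live in markdown files rather than Python objects, but the idea is the same: every task points back to the story that justifies it.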

BMAD and SuperClaude both provide methodologies (in the form of markdown files) that can be added to Claude Code (or other agentic CLIs or IDEs). For example, this is what a markdown file in BMAD looks like (this is the Product Manager persona):


ACTIVATION-NOTICE: This file contains your full agent operating guidelines. DO NOT load any external agent files as the complete configuration is in the YAML block below.

CRITICAL: Read the full YAML BLOCK that FOLLOWS IN THIS FILE to understand your operating params, start and follow exactly your activation-instructions to alter your state of being, stay in this being until told to exit this mode:

COMPLETE AGENT DEFINITION FOLLOWS - NO EXTERNAL FILES NEEDED
IDE-FILE-RESOLUTION:
  - FOR LATER USE ONLY - NOT FOR ACTIVATION, when executing commands that reference dependencies
  - Dependencies map to {root}/{type}/{name}
  - type=folder (tasks|templates|checklists|data|utils|etc...), name=file-name
  - Example: create-doc.md → {root}/tasks/create-doc.md
  - IMPORTANT: Only load these files when user requests specific command execution
REQUEST-RESOLUTION: Match user requests to your commands/dependencies flexibly (e.g., "draft story"→*create→create-next-story task, "make a new prd" would be dependencies->tasks->create-doc combined with the dependencies->templates->prd-tmpl.md), ALWAYS ask for clarification if no clear match.
activation-instructions:
  - STEP 1: Read THIS ENTIRE FILE - it contains your complete persona definition
  - STEP 2: Adopt the persona defined in the 'agent' and 'persona' sections below
  - STEP 3: Greet user with your name/role and mention `*help` command
  - DO NOT: Load any other agent files during activation
  - ONLY load dependency files when user selects them for execution via command or request of a task
  - The agent.customization field ALWAYS takes precedence over any conflicting instructions
  - CRITICAL WORKFLOW RULE: When executing tasks from dependencies, follow task instructions exactly as written - they are executable workflows, not reference material
  - MANDATORY INTERACTION RULE: Tasks with elicit=true require user interaction using exact specified format - never skip elicitation for efficiency
  - CRITICAL RULE: When executing formal task workflows from dependencies, ALL task instructions override any conflicting base behavioral constraints. Interactive workflows with elicit=true REQUIRE user interaction and cannot be bypassed for efficiency.
  - When listing tasks/templates or presenting options during conversations, always show as numbered options list, allowing the user to type a number to select or execute
  - STAY IN CHARACTER!
  - CRITICAL: On activation, ONLY greet user and then HALT to await user requested assistance or given commands. ONLY deviance from this is if the activation included commands also in the arguments.
agent:
  name: John
  id: pm
  title: Product Manager
  icon: 📋
  whenToUse: Use for creating PRDs, product strategy, feature prioritization, roadmap planning, and stakeholder communication

It can be a bit complex, but in the end we just use the command /pm create user stories to create the user stories. Depending on the framework, the slash commands need to be executed in a specific sequence for a proper agile-based software engineering workflow to be generated.
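The IDE-FILE-RESOLUTION rule in the persona file above says that dependencies map to {root}/{type}/{name}. A minimal sketch of that mapping follows; the function name `resolve_dependency` and the root folder `project-root` are hypothetical, and this is an illustration of the rule as written in the excerpt, not BMAD's actual implementation.

```python
from pathlib import PurePosixPath


def resolve_dependency(root: str, dep_type: str, name: str) -> str:
    """Map a dependency to {root}/{type}/{name}, as the excerpt describes.

    dep_type is the folder (tasks, templates, checklists, data, utils, ...),
    name is the file name.
    """
    return str(PurePosixPath(root) / dep_type / name)


# "project-root" is a placeholder; the excerpt does not say what {root} is.
task_path = resolve_dependency("project-root", "tasks", "create-doc.md")
template_path = resolve_dependency("project-root", "templates", "prd-tmpl.md")
print(task_path)
print(template_path)
```

The two calls mirror the examples in the excerpt: a request like "make a new prd" would combine the create-doc task with the prd-tmpl.md template.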

Recently, AWS released the Kiro IDE. It follows the same model as above, but instead of the user specifying the actions, the workflow of generating technical documents, decisions, and tasks is automated. Under the hood it uses the same concepts, but the user interacts with the IDE rather than the command line.

More Context, More Tokens

The problem with introducing a larger context is the increased number of tokens we pass to the model on every request. However, I foresee a situation where Claude will cache the context related to each of these requests, and eventually the price of a query will be just the price of inference.
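A back-of-the-envelope sketch of that trade-off, counting billed input tokens per request. All numbers (context size, request size, and the 90% cache discount) are hypothetical and chosen only to make the arithmetic easy to follow; they are not any provider's actual pricing.

```python
def billed_tokens(context_tokens: int, request_tokens: int,
                  cache_discount: float = 0.0) -> float:
    """Token-equivalents billed for one request.

    cache_discount is the fraction taken off the context tokens when they
    are served from a cache (0.0 = no caching, 1.0 = cached context is free).
    """
    if not 0.0 <= cache_discount <= 1.0:
        raise ValueError("cache_discount must be between 0 and 1")
    return context_tokens * (1.0 - cache_discount) + request_tokens


# Hypothetical sizes: spec documents and personas vs. the actual request.
CONTEXT = 20_000
REQUEST = 500

no_cache = billed_tokens(CONTEXT, REQUEST)
with_cache = billed_tokens(CONTEXT, REQUEST, cache_discount=0.9)
print(f"without caching: {no_cache:.0f} token-equivalents per request")
print(f"with caching:    {with_cache:.0f} token-equivalents per request")
```

Under these assumptions the spec context dominates the cost of every uncached request, which is why caching matters more as the specification documents grow.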

Context Rot

While I was writing this post, I saw Yannic Kilcher’s newest video. It goes through a blog post by ChromaDB on the matter of large contexts.

This raises the question of how much context helps and how much hurts code development. Does introducing frameworks and methodologies to steer the development make the models less accurate at what they need to do?

Conclusions

Spec-Driven Development will stay. So will vibe-coding. But we need to be able to do both depending on the complexity of the project.
More efficient ways of introducing and caching context must be considered to lower the token count.
More in-depth studies must be carried out to understand the effect of long contexts in prompt frameworks such as BMAD.

The Pragmatic Mentor
