Judgment in Architectural Choice

By now the major parts of an AI coding system are all on the table: how the model works, how an agent executes, how MCP and Skill extend its capabilities, and what memory, context engineering, and knowledge injection each solve. The practical question comes next: once you know all the pieces and face a real scenario, how do you actually choose between them, configure them, and assemble something that runs?

The problem in many teams is not "we don't know about the new things." It is that they are too quick to use all the new things. A task that one agent and a few built-in tools could have handled cleanly ends up wrapped in multi-agent orchestration, RAG, MCP, and a spec system all at once. The architecture looks impressive, the cost is alarming, and the results are not necessarily any better. What is actually scarce is not tools but judgment: knowing when to add a layer, and knowing when removing one is the better engineering decision.

This part is about that capability — the move from "knowing the concepts" to "being able to choose." The first two chapters tackle the decision side: when a plain conversation is enough, when you need an agent, when MCP or Skill or spec-driven workflows are worth the cost, and when adding nothing is the better engineering call. The next two chapters push from "what to choose" toward "how to assemble it" and "how to keep it from drifting": one lays out the end-to-end blueprint from a user request to delivered code, putting the pieces we discussed separately back into a single system view; the other turns to security and alignment, and explains why — once the system starts touching real codebases, real documents, and real production environments — the trust boundary becomes part of the architecture itself.

The keyword for this part is not components. It is judgment. That means not adding a new piece every time you hear about one, but being able to tell necessary complexity apart from conceptual excitement; not just optimizing one prompt or one tool in isolation, but being able to look at an entire AI coding system and decide whether it should be designed this way at all, and where its real edges are.