The Candidate Lifecycle: When AI Models Your Domain, Who Confirms It?
This post marks a shift.
Posts 1 through 5 gave away a methodology — Signal-Driven Development, the gap report, the three-pass convergence process. That was the community gift. Use it, fork it, adapt it.
What follows is different. Over the next four posts, I'm formalizing patterns that emerged from practicing SDD rigorously with an AI collaborator across production domains. These aren't theoretical observations. They're problems I hit, named, and solved — problems that every team incorporating AI into domain modeling will encounter.
Nobody in the DDD community is publishing on these patterns, because nobody else has built a system that does AI-mediated domain modeling at this depth. That's not a boast. It's an observation about where the field is. Evans is writing about integrating AI components into deterministic systems. DDD Europe 2026 has workshops on LLM-assisted strategic design. The community is exploring the intersection. I've been living in it for months.
The first pattern is the Candidate Lifecycle. It answers a question that traditional DDD never needed to ask.
The Implicit Consensus Problem
In traditional DDD, knowledge crunching is collaborative. A group of people — developers, domain experts, architects — sit in a room, put stickies on a wall, argue about language, and iterate toward a model. The model that survives the room is the model the team has agreed on.
The agreement is implicit. Nobody votes on whether OrderPlaced is the right event name. Nobody signs off on the aggregate boundary. The team converged through conversation, and the model reflects that convergence. When the developer writes code that mirrors the model, they're implementing a shared understanding — imperfect, informal, but collectively owned.
This works because the feedback loop is human. When a domain expert says "that's not how we think about it," the model changes. When a developer says "I can't implement this boundary cleanly," the team revisits. The model is continuously validated by the people who created it.
Now replace the room with an AI.
You feed a product requirements document into a language model. The model produces a domain specification — bounded contexts, aggregates, events, commands, invariants. The specification is structurally plausible. The names sound right. The boundaries look reasonable. The event flows make sense.
But who agreed to this model?
The AI doesn't have domain expertise. It has statistical pattern completion trained on millions of documents that include DDD examples, software architecture discussions, and domain modeling content. When it names an aggregate OrderFulfillment and places it in a Logistics bounded context, it's not making a domain judgment. It's producing a statistically likely output given the input and its training distribution.
The model might be right. It might be excellent. But the mechanism that produced it is fundamentally different from the mechanism that produces a model in a collaborative workshop. There was no implicit consensus. There was no domain expert pushback. There was no developer saying "this boundary feels wrong." There was a prompt, a completion, and a plausible-looking output.
This is the implicit consensus problem: traditional DDD's trust model breaks when AI generates the domain model. The trust was always embedded in the process — we trust the model because we built it together. When AI builds it, that embedded trust doesn't exist.
The Candidate Lifecycle
The Candidate Lifecycle is a design pattern for AI-mediated domain modeling. It establishes an explicit trust boundary between AI-generated output and confirmed domain knowledge.
The core principle: nothing an AI produces is domain knowledge until an architect explicitly confirms it.
Every domain artifact that an AI generates — every aggregate, every event, every bounded context boundary, every invariant — enters the system as a candidate. A candidate is a proposal with provenance. It carries what was proposed, why it was proposed (the AI's reasoning), what alternatives were considered, and what produced it (which model, which prompt strategy, which input).
A candidate is not part of the domain model. It's a proposal to the domain model. The domain model only changes when the architect does one of three things, sketched in code below:
Confirm: The candidate is correct. The AI's classification matches the architect's domain understanding. The aggregate should exist, the boundary is right, the event name is appropriate. The candidate becomes confirmed domain knowledge.
Override: The candidate is partially correct but misclassified. The AI identified a real domain concept but categorized it wrong — it proposed an invariant where the architect recognizes a policy, or it placed a service in the wrong bounded context. The architect corrects the classification while preserving the underlying insight. The override is recorded with rationale.
Reject: The candidate is wrong. The AI hallucinated a concept, misinterpreted the input, or produced something that contradicts the architect's domain understanding. The rejection is recorded with rationale — because the rejection is itself domain knowledge. It documents what the domain is not.
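To make the lifecycle concrete, here's a minimal sketch of a candidate record in TypeScript. Every name and field is illustrative, not a prescribed schema; the point is that provenance and status travel with every proposal, and that pending is the only state the AI can produce.

```typescript
// Illustrative sketch only: field names are assumptions, not a prescribed schema.

type ConceptKind =
  | "boundedContext"
  | "aggregate"
  | "event"
  | "command"
  | "invariant"
  | "policy";

// Provenance: what produced this proposal, and why.
interface Provenance {
  modelVersion: string;   // e.g. "gpt-4o, March 2026"
  promptStrategy: string; // which prompting approach produced it
  inputRef: string;       // pointer to the source input (PRD section, etc.)
  reasoning: string;      // the AI's stated rationale
  alternatives: string[]; // alternatives the AI considered
}

// Only an explicit architect decision moves a candidate out of "pending".
type CandidateStatus =
  | { kind: "pending" }
  | { kind: "confirmed"; by: string; on: Date }
  | { kind: "overridden"; by: string; on: Date; correctedKind: ConceptKind; rationale: string }
  | { kind: "rejected"; by: string; on: Date; rationale: string };

// A candidate is a proposal *to* the domain model, never part of it.
interface Candidate {
  id: string;
  name: string; // e.g. "OrderFulfillment"
  proposedKind: ConceptKind;
  context?: string; // proposed bounded context, if any
  provenance: Provenance;
  status: CandidateStatus;
}
```

The discriminated union does one useful thing: it makes it impossible to record an override or rejection without a rationale, which is the pattern's whole point.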
The confirmation, override, and rejection are the new design surface. In traditional DDD, the design surface is the whiteboard — where you draw boundaries and name things. In AI-mediated DDD, the design surface is the confirmation boundary — where you decide which AI proposals become trusted domain knowledge and which don't.
Why Provenance Matters
Every candidate must carry provenance — metadata about what produced it and why. This isn't an implementation detail. It's a domain requirement.
Provenance answers the question that every architect will eventually ask: "Why does this aggregate exist?" In traditional DDD, the answer is "we decided in the March workshop." In AI-mediated DDD, the answer must be traceable: this aggregate was proposed by this model version, using this prompt strategy, in response to this input, with this reasoning, and it was confirmed by this architect on this date.
Provenance serves three purposes.
Auditability: When a domain model is the foundation for a production system, someone will eventually need to reconstruct why a particular decision was made. Provenance provides the full chain — from the input document, through the AI's interpretation, to the candidate proposal, through the architect's confirmation. Every link in the chain is recorded.
Reproducibility: When you update your AI model or change your prompt strategy, you need to understand what changed. Provenance lets you ask: "Which candidates in this specification were produced by the previous model version? Have any of them been invalidated by the updated model's output?" Without provenance, model upgrades are blind — you can't tell which parts of your domain model might be affected.
Trust calibration: Over time, provenance data reveals patterns about AI reliability. Which types of domain concepts does the model classify well? Where does it consistently struggle? Provenance transforms individual confirmation decisions into aggregate insight about the AI's modeling capability. This is how the trust boundary becomes data-driven rather than faith-based.
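As a hedged illustration of that last point, the sketch below (reusing the `Candidate` types from the earlier snippet) rolls individual decisions up into per-kind counts. Slicing by proposed kind is an assumption about how you'd want to view the data; the mechanism is just aggregation over recorded decisions.

```typescript
// Sketch: turn individual confirmation decisions into aggregate reliability data.
// Reuses Candidate and ConceptKind from the earlier sketch.
interface KindStats {
  confirmed: number;
  overridden: number;
  rejected: number;
}

function trustByKind(candidates: Candidate[]): Map<ConceptKind, KindStats> {
  const stats = new Map<ConceptKind, KindStats>();
  for (const c of candidates) {
    if (c.status.kind === "pending") continue; // undecided candidates carry no signal yet
    const s = stats.get(c.proposedKind) ?? { confirmed: 0, overridden: 0, rejected: 0 };
    if (c.status.kind === "confirmed") s.confirmed++;
    else if (c.status.kind === "overridden") s.overridden++;
    else s.rejected++;
    stats.set(c.proposedKind, s);
  }
  return stats;
}
```

A consistently high override rate on, say, invariants is exactly the pattern that makes the boundary data-driven rather than faith-based.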
The Evans Connection
At Explore DDD 2024, Evans framed an LLM trained on a ubiquitous language as, in effect, a bounded context. It has its own model of the domain, shaped by its training data and fine-tuning. It speaks a language that overlaps with but isn't identical to the domain expert's language.
This is a powerful framing. And the Candidate Lifecycle is the answer to the question it raises: how does knowledge from the AI's bounded context become trusted domain knowledge in the architect's model?
In traditional DDD context mapping, we have patterns for this. When two bounded contexts need to share knowledge, we use patterns like Published Language, Anti-Corruption Layer, Customer-Supplier, or Conformist. Each pattern defines who owns the translation, who controls the contract, and how mismatches are handled.
The AI's "bounded context" needs the same treatment. The AI produces output in its own model. That output must cross a trust boundary before it enters the architect's domain model. The Candidate Lifecycle is the translation mechanism — it's the Anti-Corruption Layer between the AI's statistical model and the architect's domain model.
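A minimal sketch of that Anti-Corruption Layer, under the same assumed types as before: raw model output is untrusted until it validates, and the only status the boundary can assign is pending.

```typescript
// Sketch of the trust boundary: raw AI output never enters the domain model directly.
// It is validated and wrapped as a pending candidate, provenance attached.
const KNOWN_KINDS: readonly ConceptKind[] = [
  "boundedContext", "aggregate", "event", "command", "invariant", "policy",
];

function toCandidate(
  raw: { name?: string; kind?: string; context?: string; reasoning?: string },
  source: { modelVersion: string; promptStrategy: string; inputRef: string },
): Candidate {
  if (!raw.name || !raw.kind || !KNOWN_KINDS.includes(raw.kind as ConceptKind)) {
    // Output that doesn't translate cleanly is stopped at the boundary.
    throw new Error(`Untranslatable proposal: ${JSON.stringify(raw)}`);
  }
  return {
    id: crypto.randomUUID(), // assumes a modern runtime (Node 19+ or browser)
    name: raw.name,
    proposedKind: raw.kind as ConceptKind,
    context: raw.context,
    provenance: {
      ...source,
      reasoning: raw.reasoning ?? "(none provided)",
      alternatives: [],
    },
    status: { kind: "pending" }, // the only status the boundary may assign
  };
}
```

Anything the boundary can't translate fails loudly instead of leaking into the model, which is the Anti-Corruption Layer's job.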
Evans is now writing explicitly about this pattern — drawing Anti-Corruption Layers between deterministic application code and probabilistic LLM output. We arrived at the same architectural conclusion independently. The AI's output must be constrained, translated, and explicitly accepted before it enters the deterministic system. The Candidate Lifecycle formalizes the acceptance mechanism for domain modeling specifically.
What This Changes About DDD Practice
The Candidate Lifecycle has implications that go beyond the obvious.
The architect's role changes. In traditional DDD, the architect is a creator — they model the domain through collaborative discovery. In AI-mediated DDD, the architect becomes a curator and a judge. The AI generates candidates at a speed and volume that no human modeler could match. The architect's job is to evaluate, confirm, override, reject, and document. The creative act shifts from "invent the model" to "validate the model and improve it."
This isn't a lesser role. It's a more rigorous one. The architect who evaluates fifty AI-proposed candidates and confirms thirty, overrides twelve, and rejects eight has made fifty explicit domain decisions — each documented with rationale. The architect who draws a model on a whiteboard has made the same decisions implicitly, with no record of what was considered and rejected.
Rejection becomes a first-class artifact. In traditional DDD, rejected ideas are lost — they exist only in the memory of the people who were in the room. In the Candidate Lifecycle, every rejection is recorded with rationale. "This was proposed as an aggregate, but it has no invariants and no independent lifecycle — it's a value object" is domain knowledge. It documents what the domain is not, which constrains future modeling decisions and prevents the same mistake from being proposed again.
The speed of iteration changes. A three-pass convergence that might take weeks with a human team can happen in hours with AI generating candidates and an architect curating them. But the curation can't be automated — that's the whole point. The AI proposes, the architect decides. The speed gain is in generation, not in judgment.
Provenance creates institutional memory. When the architect who confirmed a set of candidates leaves the team, the provenance chain remains. The next architect can reconstruct not just what was decided, but why — including the AI's reasoning, the alternatives that were considered, and the rationale for each confirmation and rejection. This is better institutional memory than most teams have ever had for their domain models, because it was captured at decision time rather than reconstructed after the fact.
The Pattern in Practice
If you're incorporating AI into your domain modeling process today — whether through ChatGPT, Claude, a fine-tuned model, or a purpose-built system — here's how to apply the Candidate Lifecycle manually:
Mark everything the AI produces as provisional. Don't copy AI-generated domain concepts directly into your specification. Create a separate "candidates" section. Each candidate gets an ID, the AI's proposed classification (aggregate, event, policy, etc.), the AI's reasoning if available, and a status: pending, confirmed, overridden, or rejected.
Review candidates in bounded context order. Start with context boundaries. Then aggregates within each context. Then events and commands within each aggregate. Rejection cascades downward: confirming a bounded context doesn't confirm its aggregates, but rejecting a bounded context rejects everything inside it (see the sketch after these steps).
Document every override and rejection. The override rationale ("this was proposed as an invariant but it's actually a policy — it reacts to events rather than constraining state") is more valuable than the confirmation rationale. Overrides and rejections are where your domain understanding diverges from the AI's pattern matching. They're the signal.
Track which model version produced which candidates. When you update your AI model or change your prompting approach, you need to know which parts of your domain specification were produced under the previous configuration. Provenance doesn't need to be sophisticated — "GPT-4o, March 2026, prompt v2" is sufficient for manual tracking.
Run the gap report after confirmation. The gap report (Post 5) evaluates the confirmed specification, not the raw AI output. Gaps found post-confirmation are real gaps in the architect's curated model — not noise from unreviewed AI proposals.
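If you mechanize any of this, the two steps that benefit most are the cascade and the post-confirmation filter. Here's a sketch under the same assumed types, plus a hypothetical `parent` field linking each candidate to the id of its enclosing concept:

```typescript
// Sketch: rejection cascades downward; confirmation deliberately does not.
// Assumes a hypothetical `parent` field naming the enclosing candidate's id.
type Hierarchical = Candidate & { parent?: string };

function rejectWithCascade(all: Hierarchical[], id: string, by: string, rationale: string): void {
  const target = all.find((c) => c.id === id);
  if (!target) return;
  target.status = { kind: "rejected", by, on: new Date(), rationale };
  for (const child of all.filter((c) => c.parent === id)) {
    // Children inherit a rationale that points back at the enclosing decision.
    rejectWithCascade(all, child.id, by,
      `Enclosing ${target.proposedKind} "${target.name}" was rejected: ${rationale}`);
  }
}

// Step five: the gap report runs over confirmed knowledge only, never raw proposals.
function confirmedModel(all: Hierarchical[]): Hierarchical[] {
  return all.filter((c) => c.status.kind === "confirmed");
}
```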
What Comes Next
The Candidate Lifecycle establishes the trust boundary. But it assumes the candidates are at least structurally valid — that an aggregate is an aggregate, that a policy is a policy, that the AI's classification is correct even if the domain judgment is wrong.
What happens when the classification itself is wrong? When the AI proposes something that passes every structural check, looks correct in every gap report, and produces a domain model that appears complete — but the behavioral semantics are fundamentally broken?
That's the Classification Gap. Post 7.
This is Post 6 of a series on DDD, AI, and the methodology that emerged from practicing both rigorously. Post 5 delivered the gap report deep dive and the SDD repository. The series continues with the Classification Gap in Post 7.