kylejtobin/tca

Type Construction Architecture

Python 3.12+ Pydantic v2 License: MIT Type Checked: basedpyright

Pydantic is a programming language. Python is its runtime.

A Pydantic model is not a schema. It is a machine with a four-layer construction pipeline that fires every time data enters it. If the object exists, every constraint declared in its type was satisfied. If construction fails, no object exists. There is no third outcome.
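A minimal sketch of that guarantee, using a hypothetical `Port` model (not part of this repository): either construction returns a proven, frozen object, or it raises and no object exists.

```python
from pydantic import BaseModel, ConfigDict, Field, ValidationError

class Port(BaseModel):
    model_config = ConfigDict(frozen=True)   # sealed at construction time
    number: int = Field(ge=1, le=65535)      # the constraint lives in the type

ok = Port.model_validate({"number": 8080})   # object exists => constraint held

try:
    Port.model_validate({"number": 70000})   # constraint violated => no object
    constructed = True
except ValidationError:
    constructed = False                      # the only other outcome
```

There is no state in which `ok` exists with an out-of-range `number`, and no state in which the failed call produced anything to inspect.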

Pydantic-as-compute gives you the structural power of algebraic type systems — discriminated unions, product types, newtypes, total construction, compositional reasoning — expressed in Python's vocabulary instead of FP notation. The rigor is the same. The notation is natural language and type annotations, not a symbolic calculus. Any developer can read it. Any neural consumer that works in language can participate in it.

Type Construction Architecture is the discipline of writing programs in these construction semantics. Define the types. Compose proven models as fields. Let projections derive further truth. Let declared dispatch, staged lifting, and model_validate execute the graph. Construction is proof. Derivation extends proof. The program is the construction graph — not the procedural glue around it.


Why TCA Exists

Most software hides the program in a service layer. Raw data arrives, service code interprets it, helper functions map it, branching code classifies it, and passive domain objects carry the results. TCA inverts that arrangement. The program moves into the domain types, and the surrounding layers thin out.

What disappears when the program moves into the types:

| Conventional artifact | Why it disappears |
| --- | --- |
| Mapper classes, DTO converters, and adapter layers | Foreign schema mirroring and foreign-to-domain lifting turn translation into staged construction |
| if/elif chains that classify inputs | Declared dispatch routes structurally during construction |
| Service methods that compute from model fields | Composition and projection let models own and derive semantics directly |
| Intermediate dictionaries and uncertain states | Frozen construction replaces partial translation artifacts with proven objects |

The domain types are not passive. They carry the construction logic. They own classification, derivation, and boundary translation. Services shrink to almost nothing because the models already did the work. The application interior runs on rails of constructed certainty.


The Mental Model

Construction is proof. A model_validate call fires the full pipeline: translation, interception, coercion, integrity. If the object comes back, it satisfies every constraint its type declares. No separate validation step.

Frozen snapshots. Every TCA model is frozen. It captures one instant — the state of the world at construction time, proven and sealed. A frozen model never goes stale because it never claims to be current.

Derivation belongs on the machine. If a computation depends only on a model's own proven fields, it belongs on that model as a projection — @computed_field, @cached_property, or @property.
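A sketch of a projection, with a hypothetical `Rectangle` model: the derivation lives on the model that owns the proven fields, as a bare `@computed_field` (which Pydantic v2 converts to a property).

```python
from pydantic import BaseModel, ConfigDict, computed_field

class Rectangle(BaseModel):
    model_config = ConfigDict(frozen=True)
    width: float
    height: float

    @computed_field
    def area(self) -> float:
        # depends only on this model's own proven fields
        return self.width * self.height

rect = Rectangle(width=3.0, height=4.0)
```

Because it is a `@computed_field`, `area` also appears in `rect.model_dump()`, so the projection travels with the model's serialized form.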

Construction drives further construction. A projection that calls model_validate extends the proof graph. This construction-derivation loop is the evaluation model of a TCA program:

```mermaid
flowchart LR
    C["Construct"] --> D["Derive"]
    D --> C2["Construct"] --> D2["Derive"]
    D2 --> T(("Terminal"))

    classDef step fill:#dbeafe,color:#1e3a8a,stroke:#3b82f6
    classDef done fill:#172554,color:#bfdbfe,stroke:#1e3a8a

    class C,D,C2,D2 step
    class T done
```

The loop is lazy, deterministic, and compositional.
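One turn of the loop can be sketched with hypothetical models (`Celsius`, `Fahrenheit`, `Reading` are illustrative, not from this repository): a `@cached_property` on a frozen model calls `model_validate` to construct the next proven object, lazily and exactly once.

```python
from functools import cached_property
from pydantic import BaseModel, ConfigDict

class Celsius(BaseModel):
    model_config = ConfigDict(frozen=True)
    degrees: float

class Fahrenheit(BaseModel):
    model_config = ConfigDict(frozen=True)
    degrees: float

class Reading(BaseModel):
    model_config = ConfigDict(frozen=True)
    celsius: Celsius

    @cached_property
    def fahrenheit(self) -> Fahrenheit:
        # derivation constructs the next proven object: the loop in one step
        return Fahrenheit.model_validate(
            {"degrees": self.celsius.degrees * 9 / 5 + 32}
        )

reading = Reading.model_validate({"celsius": {"degrees": 100.0}})
```

`functools.cached_property` writes to the instance `__dict__` directly, so it coexists with `frozen=True`: the derivation is computed lazily on first access and never recomputed.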

Procedure has a proper place. Some boundaries resist pure construction — live transport edges, positional data structures, and untyped external surfaces. At those boundaries, a small piece of procedure catches the junk and normalizes it into owned truth. These seams must be irreducible, contained, and terminal. See docs/irreducible-seams.md.


Construction Across Processes

Inside a process, the construct→derive loop is the program. A composed model's @cached_property constructs the next proven object. But the loop doesn't stop at the process boundary.

A domain event is a frozen model. It is a proven fact about something that happened, sealed at construction time. When a service publishes that event onto an event bus, it is transmitting a proven object. When another service receives those bytes and calls model_validate_json, the same guarantee fires — construction succeeds and the consumer holds a proven fact, or construction fails and they know. The proof transfers.

This means TCA and event-driven architecture are the same idea at different scales. Inside a process, construction produces facts and derivation produces further facts. Across processes, services produce facts and other services consume and construct from them. The construct→derive loop becomes the event graph. The transport — whatever it is — makes the boundary between those scales disappear.

What follows from this:

  • Services don't call each other. They publish proven events and subscribe to proven events. There is no HTTP contract to version, no REST schema to maintain, no client library to generate. The subject namespace on the bus is the contract.
  • Service mesh becomes pointless. Istio and Linkerd exist to manage services calling services — retries, circuit breaking, mTLS, traffic shaping. If services don't call each other, there is nothing to mesh.
  • The only REST is at the edge. External consumers (browsers, mobile) still need an HTTP surface. Internally, the API between services is event subjects carrying typed events.

Without event-driven transport, TCA's construction graph terminates at the process boundary — and you are back to the request/response architecture that TCA's internal design already rejected. The example application in this repo (under app/, with compose.yml, justfile, and a sidecar in nats/) realizes this pattern with NATS JetStream; the paradigm does not prescribe a specific transport.


Program Topology

A TCA program has gravitational structure. The densest layer — the scalars — sits at the bottom. Everything above composes from below. Nothing below depends on what is above.

```
├── main.py                  # composition root
├── config.py                # typed settings
├── api/
│   └── catalog.py           # route: imports contracts from domain
├── service/
│   └── catalog.py           # transport shim: binds transport to active model
└── domain/
    └── catalog/
        ├── type.py          # scalars: dependency root, imports nothing
        ├── value.py         # value objects: composes scalars
        ├── product.py       # frozen model: a proven domain concept
        ├── catalog.py       # active model: single convergence point
        └── api.py           # contracts: domain-owned boundary types
```

Each layer sees only downward. Every file in domain/ is named for a domain concept — never for a technology pattern, never for a dumping ground. Open the domain directory and read the domain. The file listing is the vocabulary.

See docs/program-topology.md for file roles, cross-context composition rules, and the full dependency graph.


Making LLMs Write TCA

LLMs fail at TCA by default. Their training data is overwhelmingly procedural Python — services, mappers, dict-builders, if/elif routers. Ask an LLM to write TCA code and it will articulate the principles perfectly in conversation, then generate the opposite in code. It writes validators instead of narrowed types. Services instead of model derivations. Mapper classes instead of model_validate. It reaches for the most common pattern from training, not the correct architectural shape.

This repository includes a Claude Code scaffold that solves this problem. Instead of instructing the model to "think in TCA" (which doesn't survive contact with code generation), the scaffold enforces TCA structurally — blocking wrong shapes before they're written and loading correct shapes before generation starts.

The Problem: Training Gravity

An LLM asked to build a composed model will default to six model_validator methods, then rationalize each one as "cross-field." It does this because validators are the obvious Pydantic tool in training data. It won't consider narrowed scalar types with Field(gt=2.0) that carry the proof structurally — because that pattern barely exists in its training corpus.
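The narrowed-scalar shape the paragraph describes looks like this (the `Lens`/`aperture` names are hypothetical, chosen only to illustrate a `gt=2.0` bound): the constraint moves out of a validator method and into the type annotation itself.

```python
from typing import Annotated
from pydantic import BaseModel, ConfigDict, Field

# Narrowed scalar: the proof is carried structurally, not by a validator.
FastAperture = Annotated[float, Field(gt=2.0)]

class Lens(BaseModel):
    model_config = ConfigDict(frozen=True)
    aperture: FastAperture   # any constructed Lens proves aperture > 2.0

lens = Lens(aperture=2.8)    # constructing it is the proof
```

A `Lens(aperture=1.4)` call never produces an object; there is no code path on which the invariant must be re-checked.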

The same applies everywhere TCA diverges from typical Python:

| What you ask for | What the LLM generates | What TCA requires |
| --- | --- | --- |
| Domain logic | Service class with methods | Frozen model with @cached_property derivations |
| Gate checking | model_validator on one field | Narrowed scalar with Field() — type IS proof |
| Data transformation | Mapper class between models | model_validate(source, from_attributes=True) |
| Input classification | if/elif chain on a string | Discriminated union with Field(discriminator=...) |
| Shared computation | utils.py helper functions | @computed_field on the owning model |
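The discriminated-union row can be sketched as follows (the `Card`/`Wire` payment models are hypothetical): instead of an if/elif chain inspecting a string, the union's discriminator routes structurally during construction.

```python
from typing import Annotated, Literal, Union
from pydantic import BaseModel, ConfigDict, Field, TypeAdapter

class Card(BaseModel):
    model_config = ConfigDict(frozen=True)
    method: Literal["card"]
    last4: str

class Wire(BaseModel):
    model_config = ConfigDict(frozen=True)
    method: Literal["wire"]
    iban: str

# Declared dispatch: the discriminator field selects the variant.
Payment = Annotated[Union[Card, Wire], Field(discriminator="method")]

payment = TypeAdapter(Payment).validate_python(
    {"method": "card", "last4": "4242"}
)
```

The classifying branch never appears in application code: construction itself decides which variant the data is, and an unknown `method` value fails construction outright.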

Instructions alone don't fix this. The LLM reads the instruction, agrees, and then writes procedural code anyway — because the instruction occupies one paragraph of context while training data occupies billions of tokens. The fix has to be structural.

The Solution: Three Layers of Constraint

Layer 1 — Deterministic enforcement. An AST-based Python script runs as a post-edit hook on every file write — after the file is on disk, where the full source can be parsed (decorator stacks, method bodies, derivation internals aren't visible in a partial Edit diff). It walks the AST via match/case dispatch — Python's ast module is a sum type, and pattern matching is the right dispatch primitive for it. It mechanically detects known-wrong patterns: json.loads() + model_validate, @computed_field + @property, private methods in domain models, technology-named files in domain/, import direction violations, try/except on frozen models, void -> None methods, @staticmethod/@classmethod on models, multi-value Literal[string] (use StrEnum), 3+ parallel tuple fields, mutables inside @cached_property/@computed_field. No LLM judgment. AST match. Exit 2 blocks. The agent gets stderr feedback naming the exact invariant, class, method, and line that failed.

Layer 2 — LLM enforcement. A prompt hook fires before each edit; an agent hook fires after. They adjudicate against three gate rubrics — Type Integrity, Construction Carries Meaning, and Program Shape. The pre-edit prompt fast-fails on structural patterns visible in the diff (filenames, bare-primitive field types, dict params, mapper/adapter class names). The post-edit agent (Sonnet) reads the rubrics and classifies the full file against allowed/disallowed evidence shapes. The two LLM hooks bracket the deterministic check: structural patterns up front, body-level structure deferred to grep, semantic gating last.

Layer 3 — Intervention skills. Three skills fire before code is written, preventing the wrong design from forming:

| Skill | What it prevents | When it fires |
| --- | --- | --- |
| /proof-design | Reaching for validators before considering narrowed types | Before writing any model that proves invariants |
| /shape-match | Generating procedural Python instead of the correct TCA shape | Before writing any domain file |
| /construction-voice | Procedural language infecting docs and plans, producing procedural code | Before writing any instruction or plan |

The layers work in concert. The skills prevent wrong designs. The deterministic hook catches mechanical violations instantly. The LLM hooks catch everything else. Nothing ships without passing all three.

What the Pipeline Looks Like

```
User prompt
  → Agent works, attempts an edit
    → PreToolUse prompt: LLM fast-fails on structural patterns visible in the diff
    → [edit executes if pre-edit passes]
    → PostToolUse smell.py: deterministic AST match on the full file (instant, no LLM)
    → PostToolUse agent: Sonnet adjudicates against three gate rubrics
```

The pre-edit LLM catches what's visible in the diff. The post-edit script catches body-level structure the LLM can't reliably infer from a partial edit. The post-edit agent does final gate adjudication on the full file. Three layers, each catching what the others can't.

Gate Rubrics

Three gates evaluate every edit against independent questions. Each gate has allowed evidence, disallowed evidence, approved mechanisms (legitimate exceptions), and escalation triggers. The full rubric is in .claude/rules/gate-rubrics.md.

| Gate | Question |
| --- | --- |
| Type Integrity | Is every type well-formed — scalars own values, models are frozen, unions are discriminated, constraints are declarative? |
| Construction Carries Meaning | Does model construction, composition, and derivation do the work — not services, adapters, or coordinator scripts? |
| Program Shape | Does code live where it belongs — domain types in the domain, services as thin transport shims, types flowing from domain toward the edge? |

Path-Scoped Rules

Six rule files inject layer-specific constraints when editing files at that layer — what each file IS, what it CONTAINS, what it MUST NOT contain. Rules exist for type.py, value.py, domain/, api/, service/, and main.py.

Building Your Own Scaffold

The entire scaffold — gates, rubrics, invariants, hooks, and skills — was generated by the /bounded-adjudication skill. It walks through a structured worksheet: structural invariants, axes of judgment, evidence shapes, approved mechanisms, genuine ambiguities, and authority topology. The worksheet is the proof artifact for the scaffold's design decisions.

To adapt this scaffold to another project: copy CLAUDE.md and the .claude/ directory, rewrite the project identity, and run /bounded-adjudication to generate domain-specific evidence shapes. See CLAUDE.md for the full protocol.


What's In This Repo

Theory

Patterns, Topology, and Example

Development Scaffold


Read Next

I want the why. Start with docs/manifesto.md.

I want to understand the Pydantic engine. Read docs/pydantic-machinery.md — how each mechanism is load-bearing, not a convenience wrapper.

I want the build patterns. Read docs/build-patterns.md — 13 before/after pairs showing how construction replaces procedure.

I want the program topology. Read docs/program-topology.md — where each file belongs and why.

I want the code. Read tca/building_block.py — one file showing many patterns working together in a recursive type classifier.

I want to constrain an LLM to write TCA. Read the Making LLMs Write TCA section above, then CLAUDE.md for the cognitive mode instructions, then .claude/rules/gate-rubrics.md for the evidence vocabulary.

I care about LLM semantics. Read docs/semantic-index-types.md, then the companion project Semantic Index Types.


Requirements

  • Python 3.12+
  • Pydantic 2.12+

License

MIT

About

Construction replaces computation. A Pydantic architecture where the types are the program.
