diff --git a/docs/dev/proposals/929-adapter-lifecycle.md b/docs/dev/proposals/929-adapter-lifecycle.md
new file mode 100644
index 000000000..9aea3021c
--- /dev/null
+++ b/docs/dev/proposals/929-adapter-lifecycle.md
@@ -0,0 +1,538 @@
+# Adapter Lifecycle — Design Proposal
+
+> **Addresses:** [Epic #929 — Fix Intrinsic Adapter Lifecycle & Consistency in Mellea](https://github.com/generative-computing/mellea/issues/929). Read the epic first if you haven't; it catalogues the specific threads this proposal tries to resolve coherently rather than individually.
+>
+> **Status:** proposal. Design docs produced during implementation will live under `docs/dev/`.
+>
+> **Structure:** **Part I** covers the problem, goals, terminology, end state, and the decisions that gate decomposition. **Part II** contains supporting detail — read after Part I is agreed, not before.
+>
+> **Terminology:** **"Adapter"** is the backend artefact: the weights loaded by a backend. The user-facing layer — helpers, input/output parsing, the AST component — is referred to as **`AdapterBasedComponent`** throughout this document. This is a placeholder name: IBM is retiring "Intrinsic" but has not yet confirmed the replacement; Mellea will adopt whatever name is settled upstream.
+>
+> **Related issues and prior work:** see the appendix at the end of this document for a linked index with annotations.
+
+---
+
+# Part I — Summary for agreement
+
+## Proposal at a glance
+
+**What changes:** three separate adapter classes (`IntrinsicAdapter`, `EmbeddedIntrinsicAdapter`, `CustomIntrinsicAdapter`) collapse into one `Adapter`:
+
+```
+Adapter = identity + io_contract + weights_binding
+```
+
+The `weights_binding` is pluggable — `LocalFileBinding`, `EmbeddedBinding`, or `ServerMediatedBinding` — each exposing the same four verbs (`prepare`, `activate`, `deactivate`, `release`). The backend calls these uniformly; it does not branch on adapter type.
+
+**What stays the same:** all high-level helpers (`check_answerability`, `requirement_check`, etc.) keep their current signatures. Deprecated classes are shimmed for one release.
+
+**Five decisions gate decomposition:**
+
+| # | Question | Status |
+| --- | --- | --- |
+| Q1 | Does the `Adapter = identity + io_contract + weights` shape hold? | **Resolved** (Jake): shape holds |
+| Q2 | Lifecycle default | **Resolved** (Jake): session-scoped loading; per-call auto-activate/deactivate |
+| Q3 | Reality C (server-mediated): design slot or leave empty? | **Resolved**: design slot, leave empty (Paul; vLLM blocked) |
+| Q4 | Deprecation window for old classes | **Resolved**: 1 minor release ≈ 4–6 weeks; longer if user impact warrants (Paul, Jake) |
+| Q5 | Terminology: name for the user-facing layer | **Resolved** (Jake): two-layer split adopted; user-facing layer named `AdapterBasedComponent` as placeholder pending IBM's final decision |
+
+Detail on each in §5.
+
+---
+
+## 1. The problem we are solving
+
+Mellea intrinsics — `check_answerability`, `requirement_check`, `find_citations`, the Guardian helpers — let users add specialised capabilities to a base model. Under the hood each one is an **adapter**: a small artefact that specialises the base model for that one task.
+
+Three sources of friction have accumulated:
+
+1. **Three different kinds of adapter share one class hierarchy.** Local PEFT adapters (weights on disk), Granite Switch "embedded" adapters (weights baked into the base model), and the yet-to-return OpenAI-compatible adapters (weights served behind an API) all try to live under one base class. The code branches on backend identity (`if backend._uses_embedded_adapters:`) to route between them.
+2. **Adapter lifecycle is not modelled.** `call_intrinsic` constructs an `IntrinsicAdapter` as a side effect of invoking one, which triggers an unconditional weight download even when no download is needed. The user sees a misleading download error; the real error is masked. There is no concept of "prepare," "activate," "deactivate" as distinct steps.
+3. **Small, visible follow-on issues cluster around these two roots** — a five-place model-options hierarchy with a silent-overwrite bug; JSON output keys hardcoded in helpers (`result_json["answerability"]`) that break when an adapter ships a new output schema; the `"requirement-check"` string duplicated across four files; a `CustomIntrinsicAdapter` whose constructor monkey-patches the global catalog with a self-confessed "temporary hack."
+
+Every thread in #929 is a symptom of not having separated the kinds of adapter and their lifecycles cleanly. This is not a theoretical concern: **seven fix-up commits have been merged in the adapter area in recent history** (full list in the appendix), alongside the `obtain_lora`-always-called masked error and the hardcoded `"requirement-check"` strings flagged by #929 point 7 / PR #1008 — the picture is of a subsystem that receives repeated small-scope fixes rather than a stable abstraction.
+
+## 2. What we are trying to achieve
+
+Four outcomes, in order of importance. Detail on each lives in Part II; this list is the ask.
+
+1. **One adapter model, one code path.** Reasonable from the outside, unified from the inside — no more `if backend._uses_embedded_adapters:` branches.
+2. **Safe evolution.** Model-option precedence is documented and enforced. Adapter weights are versioned by HF commit SHA — Mellea can pin to a specific revision for stability or track latest for newest weights (refresh policy in §17 Q5). Output schemas are stable in the common case (new weights, same schema); the rare breaking schema change is handled by pinning the HF revision and by helpers raising `AdapterSchemaMismatchError` when parse cannot yield the helper's declared output contract (Jake req 4). Forward-compatible additions (e.g. an extra optional field) do not trigger the error — only contract-breaking deltas do. Helpers like `check_answerability` see a normalised result regardless of underlying churn. Output-schema versioning beyond this is tracked in [#1111](https://github.com/generative-computing/mellea/issues/1111) (§17 Q4).
+3. **First-class customer adapters.** Customers can ship their own against the same API as first-party ones — today it requires patching the catalog or subclassing a self-confessed "temporary hack" ([#424](https://github.com/generative-computing/mellea/issues/424)).
+4. **Observable and parity-respecting.** Every lifecycle phase is a distinct span; high-level helpers (`check_answerability` etc.) keep their shape; manual adapter construction becomes simpler, not harder.
+
+## 3. Key terms (brief)
+
+Only the few terms needed to read Part I:
+
+- **AdapterBasedComponent** *(placeholder name)* — the user-facing capability: helper functions like `check_answerability`, the AST component, and input/output parsing. Implemented by an adapter. IBM is retiring "Intrinsic" and the replacement name is not yet confirmed; this document uses `AdapterBasedComponent` until that decision lands.
+- **Adapter** — the backend artefact: the weights loaded by a backend (LoRA / aLoRA / embedded), with its identity and I/O contract.
+- **Base model** — the general-purpose LLM everything runs on top of (e.g. `ibm-granite/granite-4.1-3b`).
+- **LoRA / aLoRA** — the two PEFT technologies adapters use. Both are supported.
+- **Reality A / B / C** — shorthand introduced in §4 for the three "where the weights live" stories.
+
+Full glossary (identity, I/O contract, weights binding, role, qualified name, catalog, io.yaml) is in §7 — needed only when you descend into the detail.
+
+## 4. Rough end result
+
+An **Adapter** is a small object composed of three parts:
+
+```
+Adapter
+├── identity — name, adapter type (lora/alora), optional role
+├── io_contract — parsed io.yaml: prompt building, output parsing, model options
+└── weights — one of three pluggable bindings (LocalFile, Embedded, ServerMediated)
+```
+
+**Sane defaults:** when an adapter's weights come from a HuggingFace repo, the `io_contract` defaults to the `io.yaml` in that same repo. Callers rarely pass `io_contract=` explicitly. Identity, I/O contract, and weights are tightly coupled by design; the defaults treat them as a unit.
+
+The **weights binding** is where the three realities live. It exposes a single verb set — `prepare`, `activate`, `deactivate`, `release` — that every backend calls uniformly. What each verb does per reality lives in §9; the high-level picture is all three realities converging on one shared `io_contract`:
+
+```mermaid
+flowchart LR
+ subgraph A["Reality A — Local PEFT"]
+ direction TB
+ A1["HF repo"] -->|"download"| A2["local cache"]
+ A2 -->|"PEFT load"| A3["base model<br/>+ LoRA"]
+ end
+ subgraph B["Reality B — Embedded (Granite Switch)"]
+ direction TB
+ B1["base model<br/>(weights baked in)"] -->|"render controls<br/>in chat template"| B2["base model<br/>(activated)"]
+ end
+ subgraph C["Reality C — Server-mediated"]
+ direction TB
+ C1["remote server<br/>(holds weights)"] -->|"adapter_id<br/>in API request"| C2["base model<br/>(remote)"]
+ end
+```
+
+Adapter invocation becomes one flow, with no branches on backend type. From this shape, the seven threads of #929 resolve cleanly. The simplified invocation pseudocode, the per-binding verb semantics, the lifecycle sequence diagram, and the thread-by-thread mapping are in Part II (§9 and §12).
+
+**What users see:** high-level helpers (`check_answerability` etc.) keep their current shape, with the `model_options=` and `documents=` additions that fold in here from #1003 (PR #1028 was closed 2026-05-15 in favour of this epic). Manual adapter construction collapses from four classes to one, with the binding as the pluggable part. Custom intrinsics no longer require monkey-patching the catalog. Detail in Part II §13.
+
+**What cross-cutting concerns look like:** observability (spans + a schema-drift metric), docs rewrite (`intrinsics_and_adapters.md` is 39 lines describing classes this renames), and a test-parity commitment travel **with** the refactor, not after it. Detail in Part II §14–§15.
+
+### 4.1 Backend scope
+
+Of Mellea's five backends (`LocalHFBackend`, `OpenAIBackend`, `OllamaBackend`, `WatsonxBackend`, `LiteLLMBackend`), **the two primary adapter backends are `LocalHFBackend` and `OpenAIBackend`** — those are what this design targets. The remaining three are out of scope for adapter support because the underlying providers do not support the mechanisms Mellea's adapters need today. The `WeightsBinding` abstraction does not preclude adding them later. Full backend × reality matrix with per-backend reasoning is in §10.
+
+## 5. Decisions needed now
+
+These gate decomposition; everything else can live in sub-issues once these are agreed.
+
+1. **Does the end-state shape (§4) hold?** Three realities, `Adapter = identity + io_contract + weights`, role-based lookup for rerouting. **Resolved (Jake):** shape holds. In most cases identity / io_contract / weights will be colocated, but allowing divergence is the point — it enables separation of how weights are fetched from the adapter's functional definition.
+2. **Adapter lifecycle — session-scoped loading, per-call activate/deactivate.** **Resolved (Jake):** two-level lifecycle. Weight loading (`prepare`) is session-scoped — the adapter is loaded once and held until explicit `release()` at session teardown. Activation/deactivation (`activate`/`deactivate`) is call-scoped — auto-wrapped around each generation. This matches the §9.3 sequence diagram. The multi-tenancy concern is reduced because `LocalHFBackend` is primarily a single-user/local backend (see §10). Request-scoped lifecycle (including `prepare`/`release` per call) remains an opt-in for deployments that need per-call isolation.
+3. **Reality C target shape — design slot, leave empty.** **Resolved (Paul):** the aLoRA-on-vLLM path ([#27](https://github.com/generative-computing/mellea/issues/27)) is currently blocked — vLLM has declined to upstream aLoRA support (see §8.3 for history). The `ServerMediatedBinding` slot is designed so the interface is clean if the upstream situation ever changes, but the implementation stays empty and we don't invest in stubs.
+4. **Deprecation window — at least 1 minor release; longer if user impact warrants.** **Resolved (Paul, Jake):** Paul confirms 1 minor release ≈ 4–6 weeks is sufficient, extendable if needed; Jake notes the final length depends on how many users are impacted. **Sub-question (open):** can this ship without breakage at all? IBM is retiring "Intrinsic" (see Q5), so `IntrinsicAdapter` cannot stay as a permanent re-export — it will eventually need to go. The question is whether the deprecation shim for `IntrinsicAdapter` → `AdapterBasedComponent` (placeholder) can be deferred until the upstream name is confirmed, effectively separating the structural refactor from the naming change.
+5. **Terminology — two-layer split, with `AdapterBasedComponent` as placeholder.** **Resolved (Jake):** the conceptual split is agreed. "Adapter" is the backend artefact (weights loaded by a backend). The user-facing layer — helper functions, input/output parsing, the AST component — is a distinct abstraction. IBM is retiring the "Intrinsic" name but has not yet confirmed its replacement; until that decision lands, Mellea will use **`AdapterBasedComponent`** as the working placeholder name throughout the codebase and docs.
+
+ Three implementation sub-questions follow from this:
+ - **Q5a. Prose rename** — shift docs, error messages, and help text away from "Intrinsic" to `AdapterBasedComponent` (or the final IBM name once known). Zero breakage.
+ - **Q5b. Module rename** — rename `mellea.stdlib.components.intrinsic` → `mellea.stdlib.components.adapter_based_component` (placeholder path), with the old path re-exported for one release. Breaking for submodule importers.
+ - **Q5c. AST class rename** — rename `Intrinsic` AST component → `AdapterBasedComponent`, with `Intrinsic` as a deprecation alias for one release.
+
+> **Implementation note, not a reviewer question:** intrinsic-level observability (§14) should coordinate with the in-flight [#1035](https://github.com/generative-computing/mellea/issues/1035) / [PR #1036](https://github.com/generative-computing/mellea/pull/1036) work so content capture uses the same `MELLEA_TRACE_CONTENT` flag and doesn't get designed twice. Flagged here for awareness; sequenced during implementation.
+
+## 6. Impact and blast radius
+
+Scope of this refactor in concrete terms so reviewers can weigh the cost.
+
+### API surface
+
+- **Unchanged** — every high-level helper (`check_answerability` etc.) keeps its signature. `m.instruct`, `m.validate`, `m.chat` unaffected. The `model_options=` and `documents=` additions from [#1003](https://github.com/generative-computing/mellea/issues/1003) (PR #1028 closed 2026-05-15 in favour of this epic) ship as part of Phase 1.
+- **Deprecated but shimmed for one release** — `IntrinsicAdapter`, `EmbeddedIntrinsicAdapter`, `CustomIntrinsicAdapter` public classes. Direct users get `DeprecationWarning` pointing to the new constructor. *(Applies if Q5 settles on rename or new-API-alongside; under Jake's split, the old types stay as re-exports indefinitely with no deprecation needed.)*
+- **Optional, was mandatory** — the adapter catalogue. Callers no longer have to register custom adapters in `catalog.py` before use; the catalogue stays as a convenience resolver for first-party names, not a precondition.
+- **Possibly moved/renamed** — depends on §5 Q5 (terminology rename scope).
+
+### User-archetype impact
+
+| Audience | Impact |
+| --- | --- |
+| Helper user (`check_answerability`-style calls) | None beyond the `model_options=` / `documents=` additions from [#1003](https://github.com/generative-computing/mellea/issues/1003) and clearer error messages. |
+| Advanced user constructing adapters directly | One release of deprecation warnings, then adopt the new `Adapter(name=…, weights=…)` constructor. |
+| Customer writing their own adapter | First-class path; no more `CustomIntrinsicAdapter` monkey-patching; no forced catalogue upload. Resolves [#424](https://github.com/generative-computing/mellea/issues/424). |
+| Backend author | `AdapterMixin` verb set narrows to the natural operations each backend can perform; existing implementations update or use shim methods. |
+| Operator / SRE | New spans and metrics per §14; easier diagnosis of adapter failures and cost attribution. |
+
+### Code reach
+
+Files and modules touched, approximate: `mellea/backends/adapters/{adapter,catalog,__init__}.py`, `mellea/backends/{huggingface,openai}.py`, `mellea/stdlib/components/intrinsic/*`, `mellea/formatters/granite/intrinsics/*`, `mellea/stdlib/requirements/requirement.py`, `docs/examples/intrinsics/*`, `docs/dev/{intrinsics_and_adapters,requirement_aLoRA_rerouting}.md`. Larger than a typical feature PR; phased per §16 so individual PRs stay reviewable.
+
+### Release planning
+
+- **Target release (minor, exact number TBD)**: §5 agreement plus Phases 0–2 of the migration (new `Adapter` / `WeightsBinding` / `IOContract` types, call-site adoption, backend narrowing, deprecation shims for old classes, unified model-option precedence, observability per §14, tests per §15).
+- **Follow-on minor release**: [#1018](https://github.com/generative-computing/mellea/issues/1018) (embedded adapters on `LocalHFBackend`), Phase 4 shim removal.
+- **Deferred until upstream moves**: Reality C / [#27](https://github.com/generative-computing/mellea/issues/27).
+
+### Blocking and unblocking
+
+- **Blocks** [#1018](https://github.com/generative-computing/mellea/issues/1018) (explicitly stated in its issue body).
+- **Substantially addresses** [#423](https://github.com/generative-computing/mellea/issues/423) (adapter code undocumented and over-specialised), [#424](https://github.com/generative-computing/mellea/issues/424) (cannot use intrinsics without uploading), all seven threads of [#929](https://github.com/generative-computing/mellea/issues/929).
+- **Coordinates with** [PR #1036](https://github.com/generative-computing/mellea/pull/1036) on content-capture semantics.
+- **Blocked by** upstream vLLM position on aLoRA ([#27](https://github.com/generative-computing/mellea/issues/27)) — and only for Reality C. Parts I–II of this design are not gated on upstream.
+
+### Performance
+
+- **Likely neutral or improved.** Session-scoped lifecycle is the proposed default (matches current `LocalHFBackend` behaviour); no additional load/unload cost per call. Unified parsing avoids the double-parse that the current output-normalisation sometimes does.
+- **Regression watch**: §5 Q2 settled on session-scoped loading, with request-scoped lifecycle as an opt-in. For deployments that opt in, per-call PEFT load/unload becomes a visible cost. Measure before adoption.
+
+### Risk
+
+- **Biggest unknown**: whether the unified `resolve_model_options` handles every combination currently in use. Mitigation: keep the five-layer precedence explicit, add per-adapter override documentation, and assert resolved values in tests.
+- **Second biggest**: handling breaking schema changes from upstream. Two layers: pinning (avoid the risk), and helpers raising `AdapterSchemaMismatchError` when parse cannot yield the helper's declared output contract (Jake req 4, loud safety net). Forward-compatible additions do not trigger the error. Worked example: the [#1008](https://github.com/generative-computing/mellea/pull/1008) `requirement-check` change would have surfaced as `AdapterSchemaMismatchError` on the first call after the schema change, rather than silently returning `False`. (Output-schema versioning is tracked separately in [#1111](https://github.com/generative-computing/mellea/issues/1111) — §17 Q4.)
+- **Mitigated by**: per-phase test-parity commitment (nothing merges if existing tests regress); observability introduced alongside the refactor so production regressions surface as dashboard signals rather than silent behavioural drift.
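+
+The "loud safety net" in the second risk can be sketched as below. This is a minimal illustration under stated assumptions, not the proposed implementation: `AdapterSchemaMismatchError` is the name used in this proposal, while `REQUIRED_KEYS` and `parse_answerability` are hypothetical, and the `"answerability"` key is borrowed from the hardcoded example in §1.
+
+```python
+import json
+
+
+class AdapterSchemaMismatchError(ValueError):
+    """Raised when adapter output cannot satisfy the helper's declared contract."""
+
+
+# Hypothetical contract: the keys this helper needs and their expected types.
+REQUIRED_KEYS = {"answerability": str}
+
+
+def parse_answerability(raw: str) -> str:
+    """Parse raw adapter output, failing loudly on contract-breaking changes."""
+    try:
+        payload = json.loads(raw)
+    except json.JSONDecodeError as exc:
+        raise AdapterSchemaMismatchError(f"output is not valid JSON: {exc}") from exc
+    if not isinstance(payload, dict):
+        raise AdapterSchemaMismatchError("output is not a JSON object")
+    for key, expected_type in REQUIRED_KEYS.items():
+        if key not in payload:
+            raise AdapterSchemaMismatchError(f"missing required key {key!r}")
+        if not isinstance(payload[key], expected_type):
+            raise AdapterSchemaMismatchError(f"key {key!r} has an unexpected type")
+    # Keys outside the contract are forward-compatible additions and are ignored.
+    return payload["answerability"]
+```
+
+An extra optional field passes through untouched, while a renamed or missing key raises on the first call instead of silently yielding a default.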
+
+---
+
+# Part II — Supporting detail
+
+> For deeper review once Part I is agreed. Part II expands the definitions and the design so that reviewers can pressure-test the specifics. Sections are roughly ordered from "what exactly are we talking about" (terminology, realities, end-state detail) through "why this shape is right" (current tangle, thread mapping) to "what it looks like in practice" (user-facing, observability, docs/tests, migration, open questions).
+
+## 7. Terminology (full glossary)
+
+Names matter because they appear in user-facing error messages, docs, and telemetry attributes. The short list for quick reading is in Part I §3; this is the complete reference.
+
+| Term | Meaning |
+| --- | --- |
+| **Base model** | The general-purpose LLM (e.g. `ibm-granite/granite-4.1-3b`) that everything runs on top of. |
+| **AdapterBasedComponent** *(placeholder)* | The user-facing capability: helper functions (`check_answerability`, `requirement_check`, the Guardian helpers), the AST component, input/output parsing. Backed by an adapter. IBM is retiring "Intrinsic" and has not yet confirmed the replacement name; `AdapterBasedComponent` is used throughout this document as a placeholder (see Part I §5). |
+| **Adapter** | The backend artefact: the weights loaded by a backend (LoRA / aLoRA / embedded), with its identity, I/O contract, and weights binding. The user-facing **AdapterBasedComponent** (today's `Intrinsic`) wraps an adapter to provide helpers and parsing. In the redesign, the class hierarchy collapses from four (`IntrinsicAdapter` / `EmbeddedIntrinsicAdapter` / `CustomIntrinsicAdapter` + abstract base) to one `Adapter` + a pluggable binding. |
+| **Identity** | The part of an adapter that says *what it is*: name (e.g. `answerability`), adapter type (`lora` / `alora`), and optional role. |
+| **I/O contract** | The parsed `io.yaml` — prompt template, output parser, model-option defaults. Always present, same shape regardless of reality. *Name under discussion: Jacob prefers `io_config`; `io_contract` is used throughout this proposal but is not final.* |
+| **Weights binding** | The part of an adapter that says *how its weights are made available*. Three subclasses, one per reality. Exposes `prepare`, `activate`, `deactivate`, `release`. |
+| **Reality A / B / C** | Shorthand for the three "where the weights live" stories: A = local PEFT file, B = shipped with the base model (Granite Switch), C = server-mediated (future OpenAI/vLLM). |
+| **LoRA / aLoRA** | Two PEFT technologies. LoRA weights always participate; aLoRA only participates after an activation token is seen. A single adapter ships as one or the other (some intrinsics as either); both are supported across all three realities (including embedded — Granite Switch has LoRA and aLoRA adapters in the same repo, `technology` field on each). |
+| **Role** | A *semantic* label on an adapter distinct from its name — e.g. `requirement_check`, `context_attribution`. Used by callers (the `Requirement` rerouting path) to find "the adapter that plays this role" without hardcoding a name string. |
+| **Qualified name** | Today's disambiguator: `_`. In the redesign, derived on demand from `identity` rather than stored as a field. |
+| **Catalog** | The registry of known adapters at `mellea/backends/adapters/catalog.py`. Becomes optional and advisory rather than mandatory and monkey-patched. |
+| **`io.yaml`** | The YAML file that declares an adapter's input template, output schema, and generation parameters. Lives in the adapter's HuggingFace repo. |
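+
+The **Identity** and **Role** rows can be illustrated with a small sketch of role-based lookup. The `Identity` dataclass and `find_by_role` function are hypothetical names chosen for illustration; the field set follows the glossary (name, adapter type, optional role), and the real lookup API is not defined by this proposal.
+
+```python
+from dataclasses import dataclass
+from typing import Iterable, Optional
+
+
+@dataclass(frozen=True)
+class Identity:
+    """What an adapter is: name, adapter type, and an optional semantic role."""
+
+    name: str
+    adapter_type: str  # "lora" or "alora"
+    role: Optional[str] = None
+
+
+def find_by_role(identities: Iterable[Identity], role: str) -> Identity:
+    """Return the single adapter identity that plays the given role."""
+    matches = [ident for ident in identities if ident.role == role]
+    if not matches:
+        raise LookupError(f"no adapter registered for role {role!r}")
+    if len(matches) > 1:
+        names = ", ".join(ident.name for ident in matches)
+        raise LookupError(f"role {role!r} is ambiguous: {names}")
+    return matches[0]
+```
+
+A caller such as the `Requirement` rerouting path would then ask for the role rather than hardcoding the `"requirement-check"` name string.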
+
+## 8. Three realities of "where the weights live"
+
+### 8.1 Reality A — Local PEFT adapter (today's `IntrinsicAdapter`)
+
+- Weights are a distinct file Mellea downloads from HuggingFace into the local cache.
+- At call time, the backend uses the PEFT library to plug those weights into the base model.
+- After the call, the backend can unplug them.
+- **Physical weights, runtime activation, downloadable lifecycle.**
+
+### 8.2 Reality B — Embedded adapter (today's `EmbeddedIntrinsicAdapter`, used by Granite Switch)
+
+- Adapter weights **ship in the same HuggingFace repo as the base model**. They come down with the base-model snapshot and are not fetched separately — confirmed by the fact that `EmbeddedIntrinsicAdapter.from_hub` downloads only `adapter_index.json` + `io_configs/**`, not weight files. The phrase "baked into the base model" is a useful shorthand but imprecise: the weights are still distinct PEFT modules, just co-located and pre-loaded by the inference runtime.
+- **Both LoRA and aLoRA are supported.** `adapter_index.json` lists each embedded adapter with a `technology` field (`"lora"` or `"alora"`). The chat template uses that field to place the `controls` JSON at the correct position — beginning of sequence for LoRA, before the generation prompt for aLoRA — so the right adapter is active for the right span of tokens. Granite Switch therefore genuinely carries both technologies; it is not a LoRA-only reality.
+- On the client side, only `io.yaml` is needed to format inputs and parse outputs.
+- **Pre-installed weights, prompt-level activation, no separate download lifecycle.**
+
+### 8.3 Reality C — Server-mediated adapter (partial gap today)
+
+The OpenAI-compatible backend **already supports adapters** — but only embedded ones (Granite Switch via Reality B, added in [PR #881](https://github.com/generative-computing/mellea/pull/881)). What's missing is *non-embedded* server-side adapters.
+
+**The history (corrected):** Mellea previously ran aLoRA adapters through the OpenAI backend against a **custom vLLM build** that carried an aLoRA patch. The upstream vLLM project declined to merge that patch (confirmed in [PR #543](https://github.com/generative-computing/mellea/pull/543)'s review: "the vLLM aLoRA PR will not [be] accepted, so the alora/intrinsics code for openai is now all dead code"), so PR #543 removed the dead path. Upstream vLLM has therefore **never carried** aLoRA support — the right framing is "declined upstream," not "dropped."
+
+**Current status (confirmed by Paul):** The aLoRA-on-vLLM path is blocked. vLLM has declined the upstream aLoRA patch, and there is no known path to change this. [Issue #27](https://github.com/generative-computing/mellea/issues/27) remains open to track any change in upstream position, but it is not a near-term delivery target. The design slot in this proposal exists as an interface commitment — if the upstream situation ever changes, here is the clean implementation path — not as an active work item.
+
+**Scope of this reality:** whatever the eventual technology path, the design slot is the same. Two sub-cases the binding must accommodate when the path becomes viable:
+
+- **C1 — Client-pulled, server-activated**: weights exist as a file client-side (or somewhere pullable), but activation happens on a remote inference server which loads them and exposes them via a LoRA ID or per-request model alias. This is the vLLM-shaped path, paced by #27 being unblocked.
+- **C2 — Provider-hosted**: weights live entirely on the provider's infrastructure. The client only ever passes an identifier. Applies to commercial fine-tunes behind OpenAI, Azure, etc. Not currently a known target in Mellea.
+
+Both share: **no local weight loading, API-parameter activation, `io.yaml` still required client-side.** The first concrete `ServerMediatedBinding` subclass sets the idiom for the API shape.
+
+**Intent summary for OpenAI-compatible support:** keep and extend. Embedded support stays. The design leaves a clean slot for C1 to be populated when #27 is unblocked upstream; C2 is noted for completeness but not a near-term target.
+
+## 9. End-state design detail
+
+### 9.1 Simplified invocation
+
+Adapter invocation collapses to a single flow with no branching on backend type:
+
+```
+adapter = backend.resolve_adapter(name)
+with backend.adapter_scope(adapter):
+ raw = backend.generate(adapter.io_contract.build_prompt(...))
+return adapter.io_contract.parse(raw)
+```
+
+Every verb that varies per reality lives inside `adapter_scope` (see §9.3); the outer flow is the same whether the adapter is a local PEFT file, an embedded Granite Switch adapter, or a server-mediated one.
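+
+One way `adapter_scope` could wrap the verbs is as a context manager. The sketch below is an assumption about shape, not the final API: `RecordingBinding` is a hypothetical stand-in that only records which verbs were called, and `prepare` is made idempotent to reflect the session-scoped loading agreed in §5 Q2.
+
+```python
+from contextlib import contextmanager
+
+
+class RecordingBinding:
+    """Hypothetical stand-in binding that records the verbs it receives."""
+
+    def __init__(self):
+        self.calls = []
+        self._prepared = False
+
+    def prepare(self):
+        # Session-scoped: do the expensive work once, then become a no-op.
+        if not self._prepared:
+            self.calls.append("prepare")
+            self._prepared = True
+
+    def activate(self):
+        self.calls.append("activate")
+
+    def deactivate(self):
+        self.calls.append("deactivate")
+
+
+@contextmanager
+def adapter_scope(binding):
+    """Activate a binding for one generation and always deactivate afterwards."""
+    binding.prepare()
+    binding.activate()
+    try:
+        yield binding
+    finally:
+        # Runs even if generation raises, so the base model is never left
+        # with a stale adapter active.
+        binding.deactivate()
+```
+
+The `try`/`finally` is the point of the design choice: deactivation is guaranteed per call, while `release()` stays outside the scope and is left to session teardown.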
+
+> **Boundary constraint:** `io_contract.build_prompt()` and `io_contract.parse()` must delegate to `granite-common` / `granite-formatters` for all `io.yaml` handling and parsing. The `IOContract` class in Mellea wraps these libraries; it does not re-implement their logic. (Jacob's requirement — keep `io.yaml` parsing in the granite-common / granite-formatters boundary.) `build_prompt()` returns a `Component`-compatible prompt object — not a raw string — consistent with the rest of Mellea's prompt pipeline.
+
+### 9.2 Weights binding verbs per reality
+
+Each concrete binding implements the four-verb set from Part I §4. The column meanings do not change between realities — only what happens inside the verb does.
+
+| Binding | `prepare` | `activate` | `deactivate` |
+| --- | --- | --- | --- |
+| `LocalFileBinding` (Reality A) | Download from repo → cache path | PEFT `load_adapter` | PEFT `unload_adapter` |
+| `EmbeddedBinding` (Reality B) | No-op (weights shipped with base model) | Render `controls` field into chat template | Drop the `controls` field |
+| `ServerMediatedBinding` (Reality C) | No-op (or push weights, depending on sub-case) | Set adapter identifier on API request | Unset identifier |
+
+`release()` is implemented per-binding as needed (cache eviction for LocalFile; no-op for the others).
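+
+The table above can be read as one structural interface with per-reality implementations. The following sketch assumes a `typing.Protocol` for the verb set; the class names come from this proposal, but the method bodies are placeholders that return descriptive strings instead of performing downloads or PEFT calls, and `trace_call` is a hypothetical helper.
+
+```python
+from typing import List, Protocol
+
+
+class WeightsBinding(Protocol):
+    """The four-verb set every binding exposes."""
+
+    def prepare(self) -> str: ...
+    def activate(self) -> str: ...
+    def deactivate(self) -> str: ...
+    def release(self) -> str: ...
+
+
+class LocalFileBinding:
+    """Reality A placeholder: strings stand in for real download and PEFT work."""
+
+    def prepare(self) -> str:
+        return "download from repo to cache path"
+
+    def activate(self) -> str:
+        return "PEFT load_adapter"
+
+    def deactivate(self) -> str:
+        return "PEFT unload_adapter"
+
+    def release(self) -> str:
+        return "evict cache"
+
+
+class EmbeddedBinding:
+    """Reality B placeholder: weights ship with the base model."""
+
+    def prepare(self) -> str:
+        return "no-op"
+
+    def activate(self) -> str:
+        return "render controls into chat template"
+
+    def deactivate(self) -> str:
+        return "drop controls"
+
+    def release(self) -> str:
+        return "no-op"
+
+
+def trace_call(binding: WeightsBinding) -> List[str]:
+    """Backend-side flow: the same verbs in the same order, no type checks."""
+    return [binding.prepare(), binding.activate(), binding.deactivate()]
+```
+
+`trace_call` never inspects the binding's type, which is the property the backend relies on to drop the `_uses_embedded_adapters` branch.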
+
+> **Which class knows an adapter doesn't need PEFT activation? The binding does — not the backend.** `EmbeddedBinding.activate()` renders `controls` JSON into the chat template; `LocalFileBinding.activate()` calls PEFT `load_adapter`. The backend calls `binding.activate()` uniformly and has no conditional on binding type. This is the mechanism that eliminates the `if getattr(backend, "_uses_embedded_adapters", False):` branch (§11). When embedded-adapter support is later added to `LocalHFBackend` ([#1018](https://github.com/generative-computing/mellea/issues/1018)), the backend does not need to learn about embedding — it calls the same verbs, and `EmbeddedBinding` handles the difference. The backend only needs the verb interface. (Addressing Jacob's review question on backend consumption.)
+
+> **Weight updates:** weights are versioned by HF commit SHA. `prepare()` resolves the configured revision (`main` by default, or a pinned SHA) and refreshes the local cache when upstream has moved. Refresh policy and the long-running-process exception are open (§17 Q5).
+
+### 9.3 Lifecycle sequence
+
+The lifecycle inside `adapter_scope` is the same for every binding — only the verbs do reality-specific work:
+
+```mermaid
+sequenceDiagram
+ participant C as Caller
+ participant B as Backend
+ participant A as Adapter
+ participant W as WeightsBinding
+ participant M as Base Model
+
+ C->>B: check_answerability(...)
+ B->>A: resolve_adapter(name)
+
+ rect rgb(245, 245, 245)
+ Note over B,W: adapter_scope(adapter)
+ B->>W: prepare()
+ W-->>M: download / no-op
+ B->>W: activate()
+ W-->>M: load / render controls / set adapter_id
+ B->>A: io_contract.build_prompt(...)
+ B->>M: generate(prompt)
+ M-->>B: raw output
+ B->>A: io_contract.parse(raw)
+ A-->>B: normalised result
+ B->>W: deactivate()
+ W-->>M: unload / drop controls / unset
+ end
+
+ B-->>C: score
+```
+
+## 10. Backend × reality matrix
+
+Mellea currently exposes five backends. Adapter support varies — and is not a goal for every backend.
+
+| Backend | Reality A (Local PEFT) | Reality B (Embedded) | Reality C (Server-mediated) | Notes |
+| ------------------- | :--------------------: | :------------------: | :-------------------------: | --- |
+| `LocalHFBackend` | ✅ today | ⏳🔽 [#1018](https://github.com/generative-computing/mellea/issues/1018) | — | Primary local backend; only one with aLoRA support today. Primarily used for individual/local deployments rather than multi-tenant environments (Paul). |
+| `OpenAIBackend` | — | ✅ today ([#881](https://github.com/generative-computing/mellea/pull/881)) | ⏳🔼 [#27](https://github.com/generative-computing/mellea/issues/27) | OpenAI-compatible endpoint, including vLLM servers. |
+| `OllamaBackend` | — | — | — | Ollama's LoRA/PEFT story is GGUF-based and immature; not a current target. |
+| `WatsonxBackend` | — | — | — | Would require watsonx-side adapter support; no current plan. |
+| `LiteLLMBackend` | — | — | — | Multi-provider shim; adapter support would depend on the underlying provider and is not a coherent single-backend target. Could opportunistically inherit C2 if any wrapped provider exposes fine-tuned identifiers. |
+
+Legend: ✅ supported today; ⏳🔽 planned, blocked by this proposal (downstream); ⏳🔼 planned, blocked by an upstream dependency outside Mellea; — not applicable or not planned.
+
+**What this says about intent:**
+
+- The two **primary adapter backends are `LocalHFBackend` and `OpenAIBackend`.** The refactor targets these first.
+- Granite Switch (embedded) is the newest addition but is **not** "the premier option": local PEFT via `LocalHFBackend` remains the development/on-prem path and is the only reality that ships with both LoRA and aLoRA today.
+- The remaining three backends (`OllamaBackend`, `WatsonxBackend`, `LiteLLMBackend`) are **out of scope for adapter support under this design**. The `WeightsBinding` abstraction does not preclude adding them later, but no issue currently tracks the intent and the underlying providers do not support the mechanisms Mellea's adapters need.
+- The design keeps every ✅ cell working, adds clean paths for the ⏳ cells without ad-hoc branching, and leaves empty cells empty rather than stubbing them speculatively.
+
+## 11. Why the current code is tangled (concrete example)
+
+Part I §1 listed the symptoms; this section names the *structural* cause. The single piece of code that most clearly shows it is the branch in `_util.call_intrinsic`:
+
+```python
+if getattr(backend, "_uses_embedded_adapters", False):
+    adapters = EmbeddedIntrinsicAdapter.from_source(...)
+else:
+    intrinsic_adapter = IntrinsicAdapter(...)  # Reality A path
+```
+
+This is a **backend-keyed dispatch** where the branching key (`_uses_embedded_adapters`) is a property of the backend rather than of the adapter. Every new reality forces a new branch, and the `else` path is not a generic fallback — it is the Reality A path, so it unconditionally calls `obtain_lora` whether or not the adapter needs downloading. The three symptoms in §1 (misleading download errors, rigid output parsing, hardcoded role strings) are *all* consequences of this same shape: "the adapter doesn't know what kind it is, so the call site guesses." The new design flips this: the **binding** says what kind it is, and the backend simply executes its verbs.
+
+
+## 12. Full #929 thread mapping
+
+| Thread | Resolution |
+| --- | --- |
+| 1a. Loading/unloading divergence | One `WeightsBinding` verb set; control flow identical across realities. |
+| 1b. `obtain_lora` always-called bug | Only `LocalFileBinding.prepare` calls `obtain_lora`; others no-op. |
+| 1c. Backend- + adapter-type-specific abstraction | `WeightsBinding` is the adapter-type axis; `AdapterMixin` verbs are the backend axis. |
+| 2a. Intrinsic rewriters overwrite options | `Adapter.resolve_model_options()` replaces the five-place merge with one documented stack. |
+| 2b/2c. Model-option hierarchy | Five layers enforced in `resolve_model_options` (base model → adapter config → `io.yaml` defaults → `io.yaml` per-intrinsic → caller). |
+| 3. Naming consistency | Three-axis identity (`name`, `adapter_type`, `revision`) plus explicit `role`. |
+| 4a. `call_intrinsic` assumes one output schema | `io_contract.parse()` validates the output shape and raises `AdapterSchemaMismatchError` when parse cannot yield the declared contract (Jake req 4); forward-compatible additions do not trigger the error. Helpers see a normalised shape. |
+| 4b. Per-adapter vs standard schema | `io_contract.parse()` is per-adapter; helpers define the normalised post-parse shape. |
+| 4c. Versioning | HF commit SHA is the version (every push = new revision; pin via `revision="..."` for stability). Breaking schema changes (rare) handled by pinning and by helpers raising `AdapterSchemaMismatchError` when parse cannot yield the declared contract (Jake req 4). Output-schema versioning beyond this is tracked separately in [#1111](https://github.com/generative-computing/mellea/issues/1111) (§17 Q4). |
+| 5. OpenAI backend support | Ships as one or two `ServerMediatedBinding` subclasses. |
+| 6. Catalog cleanup | Catalog becomes optional resolver (`LocalFileBinding.from_catalog(name)`). Custom adapters bypass it; no monkey-patching. Duplicate `requirement_check` / `requirement-check` entries collapse into one entry; the v1 → v2 output-schema change (PR #1008) is handled by Jake req 4 (helper raises when parse cannot yield the declared contract); pinning the prior HF revision is the avoidance path. |
+| 7. Hardcoded `requirement-check` refs | Callers look up by **role**, not name. |
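
The five-layer hierarchy in row 2b/2c can be sketched as a single ordered merge. This is an illustration only: it assumes that a later layer overrides an earlier one (so the caller wins), it is written as a free function rather than the `Adapter.resolve_model_options()` method, and the option keys and values are invented for the example.

```python
from typing import Any, Dict, Mapping, Optional

ModelOptions = Mapping[str, Any]


def resolve_model_options(
    base_model: ModelOptions,
    adapter_config: ModelOptions,
    io_defaults: ModelOptions,
    io_per_intrinsic: ModelOptions,
    caller: Optional[ModelOptions] = None,
) -> Dict[str, Any]:
    """Merge the five layers in order; a later layer overrides an earlier one."""
    resolved: Dict[str, Any] = {}
    for layer in (base_model, adapter_config, io_defaults, io_per_intrinsic, caller or {}):
        resolved.update(layer)
    return resolved


# Invented example values: each layer overrides or extends the ones before it.
options = resolve_model_options(
    base_model={"temperature": 0.7, "max_new_tokens": 256},
    adapter_config={"max_new_tokens": 64},
    io_defaults={"temperature": 0.0},
    io_per_intrinsic={"seed": 7},
    caller={"temperature": 0.1},
)
# options == {"temperature": 0.1, "max_new_tokens": 64, "seed": 7}
```

Keeping the merge in one place means the precedence order is readable from a single tuple instead of being reconstructed from several call sites.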
+
+## 13. What users see — detailed
+
+**High-level helpers** keep their signatures. The `model_options=` parameter (and `documents=` keyword on `factuality_detection` / `factuality_correction`) is added in Phase 1, folding in #1003 (PR #1028 closed in favour of this epic):
+
+```python
+score = check_answerability(question, documents, context, backend)
+score = check_answerability(question, documents, context, backend,
+                            model_options={"temperature": 0.1})
+```
+
+**Validation on parse.** Helpers declare their expected output shape; `io_contract.parse()` validates against it and raises `AdapterSchemaMismatchError` when the parse cannot yield the helper's declared output contract — with `name`, observed keys, and expected keys in the message. Forward-compatible additions (an extra optional field the parser ignores) do not trigger the error; contract-breaking deltas (missing required field, type change on a depended-on key) do. Schema drift is loud, not silent. (Jake req 4.)
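
A minimal sketch of that validation rule, assuming the declared contract is a mapping from required key to expected type. `parse_and_validate` is a hypothetical name, and the error class is defined locally only so the example runs on its own. The two payloads are the `requirement-check` shapes quoted in §14.1.

```python
import json
from typing import Any, Dict, Mapping


class AdapterSchemaMismatchError(Exception):
    """Raised when parsed output cannot satisfy the helper's declared contract."""


def parse_and_validate(name: str, raw: str, required: Mapping[str, type]) -> Dict[str, Any]:
    """Parse raw JSON and keep only the required keys, or raise on a contract break."""
    parsed = json.loads(raw)
    if not isinstance(parsed, dict):
        raise AdapterSchemaMismatchError(f"{name}: expected a JSON object")
    # A key breaks the contract if it is missing or has the wrong type.
    broken = [key for key, kind in required.items() if not isinstance(parsed.get(key), kind)]
    if broken:
        raise AdapterSchemaMismatchError(
            f"{name}: expected keys {sorted(required)}, observed keys {sorted(parsed)}"
        )
    # Extra keys are ignored, so forward-compatible additions pass.
    return {key: parsed[key] for key in required}


contract = {"requirement_likelihood": float}

# Forward-compatible addition: the extra "note" field does not trigger the error.
ok = parse_and_validate(
    "requirement-check", '{"requirement_likelihood": 0.9, "note": "extra"}', contract
)

# Contract-breaking delta: the required key is gone, so the call raises.
try:
    parse_and_validate("requirement-check", '{"requirement_check": {"score": 0.9}}', contract)
    drift_detected = False
except AdapterSchemaMismatchError:
    drift_detected = True
```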
+
+**Manual adapter construction** collapses from four classes (`IntrinsicAdapter`, `EmbeddedIntrinsicAdapter`, `CustomIntrinsicAdapter`, abstract base) to one `Adapter` + a binding:
+
+```python
+# Adapter for the answerability intrinsic (auto-loaded from catalogue; pinned revision):
+adapter = Adapter(name="answerability",
+                  weights=LocalFileBinding.from_catalog("answerability"))
+# Catalogue entry includes a pinned HF commit SHA (Jake req 5).
+# Pass revision="main" to LocalFileBinding directly to override and track latest.
+
+# Adapter for a custom intrinsic — io.yaml auto-loaded from the same HF repo:
+adapter = Adapter(name="my-thing",
+                  weights=LocalFileBinding(source="myuser/my-adapter",
+                                           base_model_name="granite-4.1-3b"))
+# To override io_contract with a local file:
+# adapter = Adapter(..., io_contract=IOContract.from_yaml("./io.yaml"))
+
+# Adapter for the Granite Switch embedded variant:
+adapter = Adapter(name="answerability",
+                  weights=EmbeddedBinding.from_base_model(backend))
+```
+
+**Backend authors** keep `AdapterMixin` as the backend surface, but it exposes only the verbs a backend naturally has: `load_peft_adapter`, `unload_peft_adapter`, `render_controls`, `set_request_adapter`. Bindings call into these verbs. Adding a new reality = adding a new verb + new binding.
+
+## 14. Observability
+
+### 14.1 Why adapters need bespoke observability
+
+Adapter calls hide the complexity that matters most when something goes wrong (weight fetching, activation side-effects, schema contracts). Without per-phase instrumentation, four failure modes are hard or impossible to diagnose — and Mellea has already hit the first two in production:
+
+1. **Masked errors.** The `obtain_lora`-always-called bug (#929 point 1b) showed users a misleading download error while the real cause (adapter-type mismatch) stayed invisible. A span at the `prepare` boundary recording the exception would have surfaced the actual cause on first run.
+2. **Silent schema drift.** When PR #1008 changed `requirement-check` output from `{"requirement_likelihood": 0.9}` to `{"requirement_check": {"score": 0.9}}`, `requirement_check_to_bool` silently returned `False` for every call until someone noticed. Under Jake req 4 (helpers raise when parse cannot yield the declared contract), this would have surfaced as `AdapterSchemaMismatchError` on the first call after the schema change — the caller gets a named error instead of a silently wrong value. The `parse_failures` counter labelled by `(name, revision)` is the dashboard signal; the exception is the runtime signal.
+3. **Latency attribution.** "`check_answerability` is slow" is unanswerable today — download, PEFT load, generation, and JSON parse collapse into one backend span. Phase-level spans make the culprit obvious in any trace viewer.
+4. **Alerting and cost attribution.** OTel `ERROR` status on failed download/activation makes generic dashboards and alerts work. Token counts labelled by adapter answer "which capability is 30% of our spend?" Both are impossible today.
+
+Adding instrumentation now costs one span attribute per verb. Retrofitting after the refactor means re-editing every binding. And during a refactor this wide, the fastest way to spot a regression in a specific reality is a dashboard, not a bug report.
+
+### 14.2 Spans and metrics
+
+**Spans** — each `adapter_scope` wraps a child span tree rooted at `intrinsic.call`:
+
+```mermaid
+graph TD
+ root["intrinsic.call<br/>name, revision, role,<br/>binding_type, adapter_type"]
+ root --> prep["intrinsic.prepare<br/>LocalFile: download ms"]
+ root --> act["intrinsic.activate<br/>peft_name / controls / api_id"]
+ root --> gen["intrinsic.generate<br/>(regular backend span:<br/>tokens, latency)"]
+ root --> par["intrinsic.parse<br/>revision, parse_ok, raw_len"]
+ root --> deact["intrinsic.deactivate"]
+```
+
+Standard attributes: `intrinsic.name`, `intrinsic.revision`, `intrinsic.role`, `intrinsic.adapter_type`, `intrinsic.binding_type`, `intrinsic.source`, `intrinsic.target`. Errors set OTel `ERROR` status (aligns with #1035 gap 4).
+
+**Metrics** — an `IntrinsicMetricsPlugin` alongside the existing Token / Latency / Error plugins:
+- `mellea.intrinsic.invocations` — counter labelled by name, revision, binding type, adapter type, outcome.
+- `mellea.intrinsic.phase_duration_ms` — histogram labelled by name, phase.
+- `mellea.intrinsic.parse_failures` — counter labelled by name, revision. This is the **schema-drift detector**: a climbing counter against a specific `(name, revision)` pair means an upstream adapter pushed a breaking schema change at a new HF revision that the local parser doesn't yet handle. Each increment matches an `AdapterSchemaMismatchError` raised at the call site (Jake req 4).
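
The labelling scheme for the last two metrics can be illustrated with plain standard-library containers. This sketch deliberately does not use the OpenTelemetry API or the real plugin interface. `phase`, `record_parse_failure`, and the revision string `abc123` are placeholders invented for the example.

```python
import time
from collections import Counter, defaultdict
from contextlib import contextmanager
from typing import DefaultDict, Iterator, List, Tuple

# Stand-ins for the histogram and counter; keys mirror the labels listed above.
phase_duration_ms: DefaultDict[Tuple[str, str], List[float]] = defaultdict(list)
parse_failures: Counter = Counter()


@contextmanager
def phase(name: str, phase_name: str) -> Iterator[None]:
    """Time one lifecycle phase and record it under (name, phase)."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        phase_duration_ms[(name, phase_name)].append(elapsed_ms)


def record_parse_failure(name: str, revision: str) -> None:
    """Increment the schema-drift counter for one (name, revision) pair."""
    parse_failures[(name, revision)] += 1


with phase("answerability", "prepare"):
    pass  # download or no-op would happen here
with phase("answerability", "parse"):
    record_parse_failure("answerability", "abc123")  # placeholder revision
```

Keying the failure counter by `(name, revision)` is what lets a dashboard tie a climbing count to one specific upstream push.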
+
+**Content capture** — gated behind PR #1036's `MELLEA_TRACE_CONTENT` flag. Intrinsics emit `intrinsic.input.kwargs` (structured dict), `intrinsic.output.raw` (raw JSON string), and `intrinsic.output.parsed` (normalised shape) as span events. Different shape from chat `gen_ai.*.message` events because intrinsics have different semantics.
+
+## 15. Docs, tests, tutorials
+
+First-class deliverables, not afterthoughts.
+
+**Docs** — rewrite (not edit) `docs/dev/intrinsics_and_adapters.md` (39 lines describing classes that get renamed). Update `docs/dev/requirement_aLoRA_rerouting.md` to describe role-based lookup instead of hardcoded strings. User-facing `docs/docs/advanced/intrinsics.md` and the examples under `docs/examples/intrinsics/` are affected by the breaking API changes. Add a new dev doc for adapter observability. Update AGENTS.md §13 once normalised post-parse shapes are stable.
+
+**Tests** — existing adapter tests stay green per phase. New tests cover: each binding × each verb (unit); integration matrix `{HF, OpenAI} × {applicable bindings} × {lora, alora where applicable} × {every existing adapter}`; per-version parse round-trips (with `requirement-check` v1 / v2 as the worked case); concurrency window correctness; span/metric emission assertions.
+
+**Qualitative effectiveness suite (optional, per-adapter).** The tests above verify plumbing. They do *not* answer "does the answerability adapter actually judge answerability correctly?" A per-adapter qualitative suite (`@pytest.mark.qualitative`, opt-in, kept out of the fast loop) takes a small canonical dataset per adapter and asserts an accuracy floor on its outputs. Without this, a refactor can pass every structural test while silently degrading the behaviour users care about.
+
+The implementation approach for this suite is intentionally left open — start simple and file a separate proposal if a more structured approach (e.g. `TestBasedEval`, `BenchDrift`) is warranted.
+
+**Tutorials** — three worth writing alongside the refactor:
+- "Adding a custom intrinsic in 20 lines" — replaces the `CustomIntrinsicAdapter` monkey-patch story.
+- "Handling a breaking schema change without breaking users" — worked example using `requirement-check` v1 → v2; covers HF revision pinning and `AdapterSchemaMismatchError` (Jake req 4).
+- "Reading intrinsic telemetry" — short dashboard-building guide.
+
+**Release notes** distinguish three cases: no change for users of the high-level helpers; deprecated-but-shimmed classes for code that constructs adapters directly; shims removed at Phase 4 (see below).
+
+## 16. Migration (rough shape only)
+
+Detail deferred until Part I §5 decisions are agreed, but the intended phasing is:
+
+1. **Phase 0 — parallel types.** Introduce the new types (`Adapter`, `WeightsBinding`, `IOContract`, plus a user-facing `Intrinsic` class if Q5 is settled on Jake's split) alongside existing classes. Catalogue entries gain pinned HF revision SHAs (Jake req 5; §17 Q6). No call-site changes, tests unchanged.
+2. **Phase 1 — callers move.** `_util.call_intrinsic`, requirement rerouting, and each helper switch to new types. Helpers gain output validation raising `AdapterSchemaMismatchError` when parse cannot yield the declared contract (Jake req 4). #1003 helper signature work folds in here: `model_options=` on all top-level helpers; `documents=` keyword-only on `factuality_detection` / `factuality_correction`. Auto-context document discovery for `documents=None` lifted from PR #1028 (§17 Q3); mechanism refined per intrinsics-team guidance — helpers read documents from ordinary conversation context, not from a `_docs`-specific scan path. Old classes become deprecation shims.
+3. **Phase 2 — backends move.** `AdapterMixin` narrows to the new verb set. Bindings implement `prepare` / `activate` / `deactivate` / `release` per reality; `LocalFileBinding.prepare` resolves the configured HF revision (§17 Q5 weight-refresh policy). Backends drop per-call `_simplify_and_merge` in favour of `resolve_model_options`.
+4. **Phase 3 — Reality C ships.** `ServerMediatedBinding` subclass(es) written; OpenAI backend drops `_uses_embedded_adapters` hard-code.
+5. **Phase 4 — shim removal.** After one minor release with deprecation warnings. *(Skipped if Q5 settles on Jake's split — re-exports stay.)*
+
+Observability and docs deliverables attach to the phase that first exercises them.
+
+## 17. Open questions and implementation positions
+
+Items marked **[Open]** need decision; **[Position]** is the proposal's working answer (reviewers can push back); **[Resolved]** has explicit reviewer agreement. Decisions that gate decomposition are in Part I §5; this section is for implementation-level questions and positions.
+
+1. **Naming `WeightsBinding`** [Position]. Used throughout the doc; alternatives `ResourceStrategy` / `AdapterProvider` were considered. `WeightsBinding` is concrete (says what it binds) and unambiguous in error messages.
+2. **Role vs name** [Position]. `role` is a free-form string with an advisory known-roles registry (e.g., `mellea.backends.adapters.roles.KNOWN_ROLES`). Backends warn on unknown roles but accept any string. Pure enum was considered but rejected — it would lock role names at library-release time.
+3. **Rewind interaction (formerly PR #1028).** Two parts:
+ - **Where rewind logic lives** [Resolved] (Jake on PR #1080): the rewind in `_resolve_question` / `_resolve_response` stays in the helpers. Phase 1 can revisit moving it to `io_contract.build_prompt` if cleaner separation is wanted; not gating.
+ - **Document discovery when `documents=None`** [Resolved] (Jake on PR #1080, 2026-05-20). When `documents=None` is passed to `factuality_detection` / `factuality_correction`, the helper auto-discovers user-supplied documents from the conversation context rather than requiring an explicit `documents=` argument. The auto-discovery direction was the contribution of PR #1028; intrinsics-team guidance refines the *mechanism*: documents flow through ordinary conversation context (no `_docs`-scanning fallback path needed). Phase 1 implements helpers that read whatever documents are present in the context they receive; populating that context is the caller's responsibility (explicit `documents=`, prior `Message`s, retrieval, …). PR #1028's specific `_resolve_response` `_docs`-scanning code is shelved.
+4. **Output-schema versioning** [Resolved — Defer to [#1111](https://github.com/generative-computing/mellea/issues/1111)]. This refactor assumes `io.yaml` does **not** carry a `schema_version` field, and Mellea does not introduce one. Forward-compatibility is preserved: helpers only raise `AdapterSchemaMismatchError` when parse cannot yield the declared contract, not on benign additions. Versioning is tracked separately in #1111; promote to in-progress when the trigger conditions documented there are hit.
+5. **Weight-refresh policy** [Position]. Adapter weights are versioned by HF commit SHA. `prepare()` re-resolves the upstream revision at session start; long-running processes (sessions spanning a release) opt into an explicit `refresh()` API. Default cadence matches the session-scoped lifecycle (Part I §5 Q2 Resolved).
+6. **Version pinning for auto-loaded adapters** [Position]. When an adapter is auto-loaded from the catalogue (caller didn't specify a revision), Mellea pins to the catalogue entry's recorded SHA. `revision="main"` is an explicit opt-in to track latest. Pinning gives reproducibility; explicit tracking gives latest weights at the cost of behaviour drift between runs. (Jake req 5; coupled to Q5 weight-refresh policy.)
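
Positions 2 and 6 are small enough to sketch directly. The code below is an illustration under assumptions: the contents of `KNOWN_ROLES` are invented, and `check_role` and `resolve_revision` are hypothetical helper names, not proposed API.

```python
import warnings
from typing import Optional

# Invented registry contents; the proposal only says the registry is advisory.
KNOWN_ROLES = frozenset({"requirement-check", "answerability"})


def check_role(role: str) -> str:
    """Accept any role string, but warn when it is not in the advisory registry."""
    if role not in KNOWN_ROLES:
        warnings.warn(f"unknown adapter role {role!r}; accepting it anyway", stacklevel=2)
    return role


def resolve_revision(requested: Optional[str], catalogue_sha: str) -> str:
    """Pin to the catalogue SHA unless the caller names a revision explicitly."""
    return catalogue_sha if requested is None else requested


pinned = resolve_revision(None, "0123abc")      # auto-loaded: catalogue pin
tracking = resolve_revision("main", "0123abc")  # explicit opt-in to latest
```

The warning path keeps role names open to downstream users, while the default pin keeps auto-loaded adapters reproducible between runs.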
+
+---
+
+# Appendix — Referenced issues and PRs
+
+Linked index of every issue, PR, and commit cited in this document. Use this to jump to primary sources.
+
+### Tracking items (open, design-relevant)
+
+| Ref | Title | Relevance |
+| --- | --- | --- |
+| [Epic #929](https://github.com/generative-computing/mellea/issues/929) | Fix Intrinsic Adapter Lifecycle & Consistency in Mellea | *the epic this proposal addresses* |
+| [#27](https://github.com/generative-computing/mellea/issues/27) | Add support for aloras to remote vllm when vllm supports it | live tracking item for Reality C |
+| [#423](https://github.com/generative-computing/mellea/issues/423) | Adapter code is undocumented and over-specialized to Intrinsics | Priority-labelled; overlaps this refactor |
+| [#424](https://github.com/generative-computing/mellea/issues/424) | Cannot use intrinsics without uploading them | customer-adapter friction |
+| [#1018](https://github.com/generative-computing/mellea/issues/1018) | add support for granite-switch / embedded adapters on HF backend | explicitly sequenced after this refactor |
+
+### History and rework evidence
+
+| Ref | Title | Role in this doc |
+| --- | --- | --- |
+| [#543](https://github.com/generative-computing/mellea/pull/543) | revert: remove adapters/intrinsics/alora/lora from openai code | why OpenAI backend lost adapter support (upstream vLLM declined aLoRA PR) |
+| [#881](https://github.com/generative-computing/mellea/pull/881) | feat: add embedded adapters (granite switch) to openai backend | why OpenAI backend got Reality B back |
+| [#946](https://github.com/generative-computing/mellea/pull/946) | feat: simplify intrinsics | rework evidence |
+| [#972](https://github.com/generative-computing/mellea/pull/972) | fix: model options with intrinsics | rework evidence for #929 point 2 |
+| [#979](https://github.com/generative-computing/mellea/pull/979) | fix: key in json returned by policy_guardrails intrinsic | rework evidence for output parsing |
+| [#986](https://github.com/generative-computing/mellea/pull/986) | fix: issues introduced by intrinsic changes | rework evidence |
+| [#994](https://github.com/generative-computing/mellea/pull/994) | fix: default intrinsic adapter types; granite-switch tests | rework evidence |
+| [#1008](https://github.com/generative-computing/mellea/pull/1008) | fix: rewrite requirement_check_to_bool for new schema | worked example for the contract-mismatch story (Jake req 4) |
+| [#1028](https://github.com/generative-computing/mellea/pull/1028) | feat: normalize intrinsics interfaces | introduces the factuality rewind path |
+
+#### Rework evidence in detail
+
+Seven recent fix-up commits in the adapter area are all symptomatic of the design gaps described in §1 rather than straightforward feature work. §1 cites them as evidence that this is friction, not theory:
+
+| Commit / PR | What it fixed |
+| --- | --- |
+| `1734900d` | Remove `answer_relevance*` intrinsics and unrelated intrinsic issues. |
+| `8b6b8d55` ([#972](https://github.com/generative-computing/mellea/pull/972)) | Model options with intrinsics (precedence bug surfaced). |
+| `c57aba1d` ([#986](https://github.com/generative-computing/mellea/pull/986)) | Issues introduced by preceding intrinsic changes. |
+| `8577d092` ([#994](https://github.com/generative-computing/mellea/pull/994)) | Default intrinsic adapter types; canned I/O with temperature. |
+| `4d372b0e` ([#979](https://github.com/generative-computing/mellea/pull/979)) | Key in JSON returned by `policy_guardrails` intrinsic. |
+| `0617bd96` ([#1008](https://github.com/generative-computing/mellea/pull/1008)) | Rewrote `requirement_check_to_bool` for a changed output schema; flipped `"requirement_check"` → `"requirement-check"` in four files. |
+| `75465d29` ([#946](https://github.com/generative-computing/mellea/pull/946)) | "Simplify intrinsics" — reacting to accumulated complexity. |
+
+### Related in-flight and planned work
+
+| Ref | Title | Role in this doc |
+| --- | --- | --- |
+| [#1003](https://github.com/generative-computing/mellea/issues/1003) | fix: intrinsic function signatures | folded into Phase 1 of this epic; PR #1028 closed 2026-05-15 |
+| [PR #1028](https://github.com/generative-computing/mellea/pull/1028) | feat: normalize intrinsics interfaces | closed 2026-05-15 in favour of folding into this epic. Two threads inherited: (1) #1003 helper signatures → Phase 1 (already scoped); (2) auto-context document discovery → Phase 1 (§17 Q3 Resolved 2026-05-20; mechanism refined to ordinary-context reading, not `_docs` scanning). |
+| [#1035](https://github.com/generative-computing/mellea/issues/1035) | OTel emission gaps | parent for telemetry coordination |
+| [PR #1036](https://github.com/generative-computing/mellea/pull/1036) | feat(telemetry): close five OTel GenAI semconv gaps | in-flight telemetry work to coordinate with |
+
+### Sequencing
+
+**Why [#1018](https://github.com/generative-computing/mellea/issues/1018) waits for this proposal:**
+
+- #1018's own body states: *"May require sorting out some of the issues in #929 first. Or at least creating a comprehensive plan."*
+- Once Part I is agreed and Phase 0–2 of the migration have merged, #1018 reduces to *"add the `EmbeddedBinding` path to `LocalHFBackend`"* following the pattern already used for `OpenAIBackend`.
+- Attempting #1018 without this refactor re-creates the same branching problem on a second backend.
+
+### Verification trail
+
+Verified against: `mellea/backends/adapters/{adapter,catalog,__init__}.py`, `mellea/stdlib/components/intrinsic/{_util,intrinsic,core,rag,guardian}.py`, `mellea/backends/{openai,huggingface}.py`, `mellea/formatters/granite/intrinsics/input.py`, `mellea/stdlib/requirements/requirement.py`, `docs/dev/{intrinsics_and_adapters,requirement_aLoRA_rerouting}.md`; commits `666d646a`, `8b6b8d55`, `c57aba1d`, `8577d092`, `c6a3e643` (aLoRA → PEFT 0.18.1 migration).
diff --git a/docs/dev/proposals/929-issue-plan.md b/docs/dev/proposals/929-issue-plan.md
new file mode 100644
index 000000000..2403671ba
--- /dev/null
+++ b/docs/dev/proposals/929-issue-plan.md
@@ -0,0 +1,1275 @@
+# Epic #929 — Issue Breakdown Plan
+
+> **Companion to:** [929-adapter-lifecycle.md](./929-adapter-lifecycle.md) (the design proposal)
+> **Parent epic:** [#929](https://github.com/generative-computing/mellea/issues/929)
+> **PR:** [#1080](https://github.com/generative-computing/mellea/pull/1080)
+> **Status:** review draft — issues to be filed after final approval
+>
+> This document combines the decomposition rationale, dependency diagram,
+> filing plan, and full issue bodies for all 11 sub-issues. After issues
+> are filed and PR #1080 closes, this becomes a historical artefact.
+
+---
+
+# Part 1 — Overview
+
+**Parent:** [#929](https://github.com/generative-computing/mellea/issues/929) — Fix Intrinsic Adapter Lifecycle & Consistency in Mellea
+**Design proposal:** PR [#1080](https://github.com/generative-computing/mellea/pull/1080)
+**Detailed breakdown:** [`proposed-issues.v3.md`](./proposed-issues.v3.md) (full issue bodies, acceptance criteria, test plans)
+**Date:** 2026-05-21
+
+---
+
+## Background and rationale
+
+The adapter/intrinsic area has produced seven fix-up commits in a short period, three silent bugs, and blocked two features (#1018, #27). The root cause is a single structural problem: code that should know what *kind* of adapter it is (local PEFT file, Granite Switch embedded, server-mediated) spreads that decision across every call site via `isinstance` branching. Adding a new backend reality forces a new branch everywhere.
+
+The refactor replaces a four-class hierarchy (`IntrinsicAdapter`, `EmbeddedIntrinsicAdapter`, `CustomIntrinsicAdapter`, abstract `Adapter`) with a single composable type:
+
+```
+Adapter
+├── identity — name, adapter_type (lora|alora), optional role
+├── io_contract — wraps granite-common / granite-formatters; builds prompt, parses output
+└── weights — pluggable WeightsBinding: LocalFile / Embedded / ServerMediated
+```
+
+The binding says what kind of adapter it is. The backend executes four verbs (`prepare`, `activate`, `deactivate`, `release`) uniformly — no branching.
+
+### Why decompose at all?
+
+A single "refactor everything" PR would be: unreviewable (thousands of LOC), all-or-nothing (any regression blocks the whole change), impossible to parallelise, and catastrophic to rebase if upstream keeps moving. The decomposition ensures:
+
+- **Each PR is reviewable** — scoped to one file, one abstraction surface, or one coherent concern. Reviewers see a clear before/after, not a diff that touches 30 files.
+- **Breakage is isolated** — each phase keeps existing tests green; a regression in one PR does not block all other work.
+- **Work can be parallelised** — after the two bottleneck PRs (1.A and 2.1) merge, three to four independent PRs can proceed simultaneously.
+- **Stacked PRs are minimised** — the only true stacks are 0.1→1.A and 1.A→2.1. Everything else in each wave is independent.
+- **Incremental user-visible improvement** — each phase delivers working, shippable code; the epic does not need to land as a big-bang release.
+
+---
+
+## Decomposition shape — 11 issues
+
+| Issue | Phase | Title | Depends on | Blocks |
+|---|---|---|---|---|
+| **0.1** | 0 | Introduce `Adapter` / `Identity` / `IOContract` / `WeightsBinding` scaffolding (+ `AdapterBasedComponent` placeholder, `KNOWN_ROLES`) | — | 1.A, 2.1, 2.2, 2.3 |
+| **0.2** | 0 | Pin catalogue entries to HF revision SHAs (+ deduplicate requirement_check entries) | — | 2.2 |
+| **1.A** | 1 | Internal migration with shims — old classes inherit from new `Adapter`; `call_intrinsic` rewritten | 0.1 | 1.B, 1.C, 1.D, 2.1 |
+| **1.B** | 1 | RAG helpers (`rag.py` whole-file, 6 helpers) migrate to new types | 1.A | — |
+| **1.C** | 1 | `requirement_check` / `requirement_check_to_bool` migrate to new types; schema-mismatch becomes loud | 1.A | — |
+| **1.D** | 1 | `guardian.py` whole-file migration (4 helpers + `documents=` keyword-only + auto-context discovery) | 1.A | — |
+| **2.1** | 2 | `AdapterMixin` verb rename/narrow + `resolve_model_options` centralisation + `IntrinsicMetricsPlugin` | 1.A | 2.2, 2.3 |
+| **2.2** | 2 | `LocalFileBinding` implements verbs (PEFT/aLoRA path) + `from_catalog()` + spans + integration tests | 0.1, 0.2, 2.1 | 4.1 |
+| **2.3** | 2 | `EmbeddedBinding` implements verbs (Granite Switch path) + spans | 0.1, 2.1 | 4.1 |
+| **3.1** | 3 | `ServerMediatedBinding` implementation — **blocked on upstream #27 (vLLM aLoRA)** | Phase 2 + #27 | — |
+| **4.1** | 4 | Remove deprecation shims; rewrite `docs/dev/intrinsics_and_adapters.md`; write 3 tutorials | 1.A + Phase 2 + 1 minor release | — |
+| **[#1111](https://github.com/generative-computing/mellea/issues/1111)** | cross | Output-schema versioning (already filed, deferred) | 0.1 (needs `AdapterSchemaMismatchError`) + trigger conditions | — |
+
+---
+
+## Dependency diagram
+
+```mermaid
+graph TD
+ A["0.1 — New types scaffolding\n(Adapter/Identity/IOContract/WeightsBinding\nAdapterBasedComponent placeholder\nKNOWN_ROLES registry)"]
+ B["0.2 — Catalogue HF revision pinning\n(+ deduplicate requirement_check entries)"]
+ C["1.A — Internal migration + shims\n(call_intrinsic rewritten\nold classes inherit from new Adapter\nadapter_scope hook point)"]
+ D["1.B — rag.py whole-file migration\n(6 helpers + model_options=\n+ output validation)"]
+ E["1.C — requirement_check migration\n(loud schema-mismatch\nbreaking: no more silent False)"]
+ F["1.D — guardian.py whole-file migration\n(4 helpers + documents= kw-only\n+ auto-context discovery)"]
+ G["2.1 — AdapterMixin verb rename/narrow\n(+ resolve_model_options\n+ IntrinsicMetricsPlugin\n+ adapter_observability.md)"]
+ H["2.2 — LocalFileBinding implements verbs\n(PEFT/aLoRA path\n+ from_catalog() classmethod\n+ spans + integration tests\n+ docs/examples update)"]
+ I["2.3 — EmbeddedBinding implements verbs\n(Granite Switch path\n+ spans\n+ AGENTS.md update)"]
+ J["3.1 — ServerMediatedBinding\n⛔ blocked on upstream #27"]
+ K["4.1 — Remove shims\n+ rewrite intrinsics_and_adapters.md\n+ 3 tutorials\n(after 1 minor release)"]
+ L["#1111 — Output-schema versioning\n(already filed, deferred)"]
+
+ A --> C
+ A --> G
+ A --> H
+ A --> I
+ A --> L
+
+ B --> H
+
+ C --> D
+ C --> E
+ C --> F
+ C --> G
+ C --> K
+
+ G --> H
+ G --> I
+
+ H --> K
+ I --> K
+
+ I --> J
+ H --> J
+
+ style J fill:#ffcccc,stroke:#cc0000
+ style K fill:#ffe0b2,stroke:#e65100
+ style L fill:#e8f5e9,stroke:#388e3c
+```
+
+---
+
+## Filing waves
+
+For a single developer, the waves define the serialisation constraint. Within each wave, issues are independent — the first to get a PR open does not block the others.
+
+**Wave 1** (start immediately, parallel):
+→ 0.1, 0.2 — both independent; no shared files.
+
+**Wave 2** (after 0.1 has a draft PR):
+→ 1.A — the only Phase 1 bottleneck. Kept deliberately narrow (~300–500 LOC) so it merges quickly.
+
+**Wave 3** (after 1.A merges — four issues run fully in parallel):
+→ 1.B, 1.C, 1.D (independent helper files) and 2.1 (backend mixin).
+
+**Wave 4** (after 2.1 merges):
+→ 2.2, 2.3 — independent bindings for the two active realities.
+
+**Tracking / deferred** (file any time, mark blocked):
+→ 3.1 (blocked on #27), 4.1 (deferred one minor release after shims introduced in 1.A).
+
+The two bottlenecks (1.A, 2.1) are the only serialisation points, and both are kept small by design.
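
Waves 1 to 4 follow mechanically from the "Depends on" column of the decomposition table. As a cross-check, the sketch below layers the unblocked issues with the standard-library `graphlib` module (Python 3.9+). Leaving out 3.1, 4.1, and #1111 is a choice made for this example, because those are blocked or deferred rather than scheduled. `filing_waves` is a hypothetical helper name.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Issue -> issues it depends on, copied from the decomposition table.
depends_on: Dict[str, Set[str]] = {
    "0.1": set(),
    "0.2": set(),
    "1.A": {"0.1"},
    "1.B": {"1.A"},
    "1.C": {"1.A"},
    "1.D": {"1.A"},
    "2.1": {"1.A"},
    "2.2": {"0.1", "0.2", "2.1"},
    "2.3": {"0.1", "2.1"},
}


def filing_waves(graph: Dict[str, Set[str]]) -> List[List[str]]:
    """Group issues into waves; every issue in a wave has all dependencies done."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    waves: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = filing_waves(depends_on)
# waves == [["0.1", "0.2"], ["1.A"], ["1.B", "1.C", "1.D", "2.1"], ["2.2", "2.3"]]
```

The computed layers match the four waves listed above, with 1.A and 2.1 as the only issues that gate a later wave.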
+
+---
+
+## Pre-flight — open issues and PRs to resolve first
+
+| Item | Overlap | Action |
+|---|---|---|
+| **PR #935** — Guardian docs migration | Touches same docs as 1.D/2.2/2.3 | Merge before starting 1.D |
+| **PR #1078** — intrinsic tests / safeguards (fixes #1029) | Adds formatter test data that 0.2/1.C build on | Merge before starting 0.2/1.C; close #1029 |
+| **#1094** — Migrate session example off deprecated GuardianCheck | Same file as 1.D scope | Close with 1.D (include in 1.D scope) |
+| **#1071** — Guardian-backed Requirement subclass (new feature) | Adds guardian code in old style | Do not start before 1.D merges |
+
+---
+
+## What each phase delivers
+
+| Phase | Delivered | Breakage risk |
+|---|---|---|
+| 0 | New type vocabulary; catalogue with pinned SHAs; `KNOWN_ROLES` advisory registry | None for users — purely additive |
+| 1 | All helpers on new types; `model_options=` on all helpers; loud schema-mismatch errors; deprecation warnings on old construction | `requirement_check_to_bool` stops returning silent `False` (intentional breaking change; flagged in changelog) |
+| 2 | Fully working `LocalFileBinding` and `EmbeddedBinding`; observability spans + metrics; `from_catalog()` API; backend mixin cleaned up | `AdapterMixin` verb rename (downstream backends must update; migration table in changelog) |
+| 3 | `ServerMediatedBinding` (when unblocked) | Internal to OpenAI backend |
+| 4 | Shims removed; full docs rewrite; three tutorials | Old class imports break (by design, after deprecation window) |
+
+---
+
+## Cross-cutting deliverables (not phase-specific)
+
+Every issue includes its own tests. The end state across all phases also delivers:
+
+- **Telemetry:** `IntrinsicMetricsPlugin` (2.1) + per-verb OTel spans `intrinsic.call / prepare / activate / parse / deactivate` (2.2, 2.3). `parse_failures` counter is the schema-drift detector — a climbing count against `(name, revision)` means an upstream adapter pushed a breaking schema change. Content capture via `MELLEA_TRACE_CONTENT` gate (consistent with #1035 / PR #1036).
+- **Docs:** `docs/dev/requirement_aLoRA_rerouting.md` (1.A), `docs/docs/advanced/intrinsics.md` (2.2, 2.3), `docs/dev/adapter_observability.md` (2.1+), `AGENTS.md §13` (2.3), full rewrite of `docs/dev/intrinsics_and_adapters.md` (4.1), three tutorials (4.1).
+- **Examples:** `docs/examples/intrinsics/` updated at every helper-migration PR (1.B, 1.C, 1.D) and the new construction pattern added at 2.2.
+- **Output-schema versioning:** tracked separately in [#1111](https://github.com/generative-computing/mellea/issues/1111); unblocked when `AdapterSchemaMismatchError` exists (0.1).
+
+---
+
+*Detailed issue bodies (problem, agreed design, scope, out of scope, acceptance criteria, test plan, risks, breaking changes, impinging issues, references) are in [`proposed-issues.v3.md`](./proposed-issues.v3.md) in this directory.*
+
+---
+
+# Part 2 — Full issue bodies
+
+# Per-helper-file migration template
+
+Issues 1.B, 1.C, 1.D share the same shape. Each issue body says "follows the per-helper-file migration template" plus the file-specific deltas.
+
+### Common shape per migrated file
+
+Each per-file PR does four things, in this order, on a single helper file:
+
+1. **Migrate construction.** Replace internal use of `IntrinsicAdapter(...)` / `EmbeddedIntrinsicAdapter(...)` with direct construction of `Adapter(identity=..., io_contract=..., weights=...)` from the new types introduced in 0.1.
+2. **Normalise signature.** Add `model_options: dict | None = None` as a keyword argument to every helper in the file. (File-specific: 1.D's factuality helpers also add `documents=` keyword-only — see 1.D.)
+3. **Add output validation (Jake req 4).** Declare each helper's expected output contract; wire `io_contract.parse()` to raise `AdapterSchemaMismatchError` when parse cannot yield that contract. Forward-compatible additions (extra optional fields the parser ignores) do NOT raise — only contract-breaking deltas (missing required field, type change on a depended-on key) do.
+4. **Update docs and examples.** The Phase 1 per-file PRs are when helpers gain new parameters and contracts; docs and examples must ship with the code, not after. Each per-file PR is responsible for: (a) updating the docstring for every helper it touches with the declared output contract and any new parameters; (b) updating `docs/examples/intrinsics/` examples that call helpers in this file; (c) adding a brief note to the PR description pointing at any user-facing docs page that needs a follow-up rewrite in Phase 2.
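+
+ Step 3 can be pictured as a parse step that checks only the keys a helper depends on. This is a minimal sketch, not the agreed implementation: `parse_with_contract` and its `required` mapping are hypothetical names, and the exception class is a local stand-in for the one 0.1 introduces.
+
+ ```python
+ import json
+
+
+ class AdapterSchemaMismatchError(Exception):
+     """Local stand-in for the exception introduced in 0.1."""
+
+     def __init__(self, name, observed_keys, expected_keys):
+         self.name = name
+         self.observed_keys = observed_keys
+         self.expected_keys = expected_keys
+         super().__init__(
+             f"Adapter '{name}' output cannot satisfy declared contract. "
+             f"Observed keys: {observed_keys}; expected: {expected_keys}."
+         )
+
+
+ def parse_with_contract(name, raw, required):
+     """Parse raw JSON output and enforce a declared contract.
+
+     `required` maps each depended-on key to its expected type. Extra keys are
+     ignored (forward-compatible); a missing key or a wrong type raises.
+     """
+     data = json.loads(raw)
+     ok = isinstance(data, dict) and all(
+         key in data and isinstance(data[key], typ) for key, typ in required.items()
+     )
+     if not ok:
+         observed = sorted(data) if isinstance(data, dict) else []
+         raise AdapterSchemaMismatchError(name, observed, sorted(required))
+     return data
+ ```
+
+ A real contract would probably be richer than a flat key-to-type mapping (nested objects, numeric widening); the point here is only that benign additions pass and contract breaks raise.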
+
+### Common acceptance criteria
+
+- [ ] All helpers in the file construct their `Adapter` using the new types (no `IntrinsicAdapter(...)` calls in helper code)
+- [ ] All helpers accept `model_options: dict | None = None` (file-specific extras as noted in each issue)
+- [ ] Each helper's output is validated against a declared contract; `AdapterSchemaMismatchError` raised on contract-break, NOT on benign additions
+- [ ] Existing helper tests pass (behavioural neutrality is the bar)
+- [ ] New tests cover: (a) declared contract enforced — feed a synthetic output missing a required field, assert the error; (b) forward-compat — feed an output with an extra optional field, assert it does NOT raise
+- [ ] Docstrings updated: every helper in the file documents its declared output contract and any new parameters
+ - [ ] `docs/examples/intrinsics/` examples that call helpers in this file updated to use `model_options=` where applicable; examples pass `uv run pytest docs/examples/intrinsics/` (or are skipped with the correct marker if backend-gated)
+ - [ ] `ruff format`, `ruff check`, `mypy` clean
+ - [ ] Deprecation path preserved: callers can still construct `IntrinsicAdapter(...)` (shim from 1.A), and doing so emits a `DeprecationWarning` pointing at the new construction pattern
+
+### Common test plan
+
+- Existing happy-path tests pass unchanged
+- New test: `AdapterSchemaMismatchError` raised on synthetic missing-field output (per helper)
+- New test: forward-compatible addition (extra optional field) does NOT raise (per helper)
+- Docstring spot-check: `help(check_answerability)` (or equivalent for this file's helpers) shows the declared contract and new parameters
+
+### Why per-file PRs
+
+Each PR is scoped to a single helper file (one coherent concern: "this file's helpers are now on the new types"). They run in parallel after 1.A merges, share no files, and the first one to merge sets the pattern reviewers can lean on for the rest.
+
+---
+
+# Phase 0 — parallel types
+
+## 0.1 — Introduce `Adapter` / `Identity` / `IOContract` / `WeightsBinding` scaffolding (+ `AdapterBasedComponent` placeholder)
+
+**Parent:** #929 · **Blocks:** 1.A, 2.1, 2.2, 2.3 · **Phase:** 0
+
+### Problem
+
+Today's adapter hierarchy is `IntrinsicAdapter` / `EmbeddedIntrinsicAdapter` / `CustomIntrinsicAdapter` plus an abstract base class `Adapter` (in `mellea/backends/adapters/adapter.py`). The split is by *where the weights live* (local PEFT file vs embedded in the base model vs server-mediated), but that distinction leaks into every caller as `isinstance` branching: `_util.call_intrinsic`, requirement rerouting, every helper, and the backends themselves all have separate code paths per subclass.
+
+This branchy structure was the root cause of seven recent fix-up commits in the adapter area (`8b6b8d55`, `c57aba1d`, `8577d092`, `4d372b0e`, `0617bd96`, `75465d29`, `1734900d`). It also blocks adding new realities cleanly — see #1018 (granite-switch on HF backend).
+
+Separately, IBM is retiring the term "Intrinsic" but has not confirmed the replacement name. Mellea agreed to use **`AdapterBasedComponent`** as a placeholder until that decision lands upstream.
+
+### Agreed design
+
+ Replace the four-class hierarchy with a single `Adapter` composed of three parts, one of which is a pluggable weights binding:
+
+```
+Adapter (new shape)
+├── identity — name, adapter_type (lora|alora), optional role
+├── io_contract — input/output handling; wraps granite-common / granite-formatters
+└── weights — pluggable WeightsBinding subclass (LocalFile / Embedded / ServerMediated)
+```
+
+ **Naming-collision note (critical for implementer):** the existing `Adapter` ABC at `mellea/backends/adapters/adapter.py:24` has the same name as the proposed new type. The new type is introduced under a new module (`_core.py` or equivalent) and re-exported. Until 4.1 deletes the old shims, the old `Adapter` ABC and the new `Adapter` dataclass coexist. The implementer may either (a) introduce the new type under a different module path and alias it as `Adapter` in the public surface, or (b) move the old ABC to a private name. Document the choice in the PR description.
+
+**Types to introduce:**
+
+- `Adapter` — dataclass holding `identity`, `io_contract`, `weights`
+- `Identity` — dataclass holding `name: str`, `adapter_type: Literal["lora", "alora"]`, `role: str | None = None`
+ - `IOContract`, an ABC with two methods:
+   - `build_prompt(...) -> Component`: builds the prompt object (must return a `Component`-compatible object, not a raw string). Delegates `io.yaml` handling to granite-common / granite-formatters; does not re-implement that logic.
+   - `parse(raw: str) -> dict`: parses adapter output. Raises `AdapterSchemaMismatchError` only when parse cannot yield the helper's declared output contract. Forward-compatible additions do NOT raise.
+ - `WeightsBinding`, an ABC with four verbs (all abstract):
+   - `prepare(self) -> None`: fetches/loads weights
+   - `activate(self, ctx) -> None`: switches the adapter on for a generation
+   - `deactivate(self, ctx) -> None`: switches it off
+   - `release(self) -> None`: drops weights at session teardown
+ - Three stub subclasses raising `NotImplementedError` on each verb:
+   - `LocalFileBinding`: Reality A
+   - `EmbeddedBinding`: Reality B
+   - `ServerMediatedBinding`: Reality C (blocked on #27)
+- `AdapterSchemaMismatchError` exception class with attributes: `name`, `observed_keys`, `expected_keys`. Message format: `"Adapter '{name}' output cannot satisfy declared contract. Observed keys: {observed_keys}; expected: {expected_keys}."`
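+
+ The list above could be laid out roughly as follows. This is a sketch under assumptions, not the final module: `build_prompt` is typed `Any` because `Component` is not imported here, only one of the three stub bindings is shown, and details such as mutability of the dataclasses are left open.
+
+ ```python
+ from abc import ABC, abstractmethod
+ from dataclasses import dataclass
+ from typing import Any, Literal, Optional
+
+
+ @dataclass
+ class Identity:
+     name: str
+     adapter_type: Literal["lora", "alora"]
+     role: Optional[str] = None
+
+
+ class IOContract(ABC):
+     @abstractmethod
+     def build_prompt(self, *args: Any, **kwargs: Any) -> Any:
+         """Return a Component-compatible prompt object (typed Any in this sketch)."""
+
+     @abstractmethod
+     def parse(self, raw: str) -> dict:
+         """Parse raw adapter output into the helper's declared contract."""
+
+
+ class WeightsBinding(ABC):
+     @abstractmethod
+     def prepare(self) -> None: ...
+
+     @abstractmethod
+     def activate(self, ctx: Any) -> None: ...
+
+     @abstractmethod
+     def deactivate(self, ctx: Any) -> None: ...
+
+     @abstractmethod
+     def release(self) -> None: ...
+
+
+ class LocalFileBinding(WeightsBinding):
+     """Phase 0 stub: concrete, but every verb is still unimplemented."""
+
+     def prepare(self) -> None:
+         raise NotImplementedError
+
+     def activate(self, ctx: Any) -> None:
+         raise NotImplementedError
+
+     def deactivate(self, ctx: Any) -> None:
+         raise NotImplementedError
+
+     def release(self) -> None:
+         raise NotImplementedError
+
+
+ @dataclass
+ class Adapter:
+     identity: Identity
+     io_contract: IOContract
+     weights: WeightsBinding
+ ```
+
+ Because the stubs override all four verbs, they instantiate cleanly and fail only when a verb is called, which is what the acceptance criteria below check.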
+
+**Placeholder module (folded in from former issue 0.3):**
+
+- New module path: `mellea.stdlib.components.adapter_based_component` (placeholder)
+- Re-exports today's `Intrinsic` class as `AdapterBasedComponent`
+- Old import path `mellea.stdlib.components.intrinsic` stays valid
+- Module docstring notes the placeholder rationale and that the module will be renamed when IBM confirms the post-"Intrinsic" name, with one minor release of overlap
+
+**`KNOWN_ROLES` advisory registry (§17 Q2):**
+
+- New constant: `mellea/backends/adapters/roles.py` — `KNOWN_ROLES: frozenset[str]` containing the initial known role strings (e.g. `"requirement-check"`, `"answerability"`, `"guardian"`, `"factuality"`)
+- `Identity` construction warns (`UserWarning`) when `role` is set to a value not in `KNOWN_ROLES`; does not reject it — `role` stays free-form
+- The registry is advisory: downstream code and new adapter authors consult it to avoid typos; it is not a schema enforcement point
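+
+ One way the advisory check might look, assuming the warning is raised from a dataclass `__post_init__` hook; the role strings are the examples listed above, and the final contents of `KNOWN_ROLES` are decided in the implementing PR.
+
+ ```python
+ import warnings
+ from dataclasses import dataclass
+ from typing import Literal, Optional
+
+ # Example role strings from this issue; the real registry may differ.
+ KNOWN_ROLES = frozenset({"requirement-check", "answerability", "guardian", "factuality"})
+
+
+ @dataclass
+ class Identity:
+     name: str
+     adapter_type: Literal["lora", "alora"]
+     role: Optional[str] = None
+
+     def __post_init__(self) -> None:
+         # Advisory only: warn about unknown roles, never reject them.
+         if self.role is not None and self.role not in KNOWN_ROLES:
+             warnings.warn(
+                 f"Unknown adapter role {self.role!r}; known roles: {sorted(KNOWN_ROLES)}",
+                 UserWarning,
+                 stacklevel=3,  # skip __post_init__ and the generated __init__
+             )
+ ```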
+
+### Scope
+
+- New module: `mellea/backends/adapters/_core.py` (or equivalent) with the new types
+- New module: `mellea/backends/adapters/roles.py` with `KNOWN_ROLES`
+- New module: `mellea/stdlib/components/adapter_based_component/__init__.py` re-exporting `Intrinsic` as `AdapterBasedComponent`
+ - Re-exports the new types and `KNOWN_ROLES` from `mellea/backends/adapters/__init__.py` for downstream use
+- Existing `IntrinsicAdapter` / `EmbeddedIntrinsicAdapter` / `CustomIntrinsicAdapter` and the existing `Adapter` ABC are **not** modified in this issue (1.A handles those)
+
+### Out of scope
+
+- Any caller migration (1.A and per-file issues)
+- Any binding verb implementation beyond `NotImplementedError` (Phase 2)
+- Removal of old classes (4.1)
+- Catalogue revision pinning (0.2)
+- Renaming the AST class itself or rewriting prose (sequenced when IBM confirms the post-"Intrinsic" name)
+- Observability spans on the verbs (added when Phase 2 implements them)
+
+### Acceptance criteria
+
+- [ ] `Adapter`, `Identity`, `IOContract`, `WeightsBinding` types exist and are importable from `mellea.backends.adapters`
+- [ ] `IOContract` ABC enforces both `build_prompt` and `parse` as abstract
+- [ ] `WeightsBinding` ABC enforces all four verbs as abstract
+- [ ] `LocalFileBinding`, `EmbeddedBinding`, `ServerMediatedBinding` exist as concrete subclasses, each raising `NotImplementedError` on each verb
+- [ ] `AdapterSchemaMismatchError` exists, carries the three attributes, formats messages correctly
+- [ ] `from mellea.stdlib.components.adapter_based_component import AdapterBasedComponent` works
+- [ ] `AdapterBasedComponent is Intrinsic` evaluates True (same class object, not a wrapper)
+- [ ] Existing imports from `mellea.stdlib.components.intrinsic` continue to work
+- [ ] Naming-collision resolution (old `Adapter` ABC vs new `Adapter` dataclass) documented in PR description
+- [ ] `KNOWN_ROLES` importable from `mellea.backends.adapters`; `Identity(role="unknown-role")` emits a `UserWarning`; `Identity(role="answerability")` does not warn
+- [ ] Unit tests cover: type construction, ABC enforcement (cannot instantiate without overriding), `AdapterSchemaMismatchError` formatting, both placeholder import paths, `KNOWN_ROLES` warning behaviour
+- [ ] Existing tests pass unchanged (no caller migration in this issue)
+- [ ] `ruff format`, `ruff check`, `mypy` clean
+
+### Test plan
+
+New tests under `test/backends/adapters/test_core_types.py`:
+- `test_adapter_dataclass_construction`
+- `test_identity_validation` — adapter_type literal enforcement
+- `test_io_contract_abc_enforcement` — cannot instantiate without overriding methods
+- `test_weights_binding_abc_enforcement` — same for the four verbs
+- `test_stub_binding_subclasses_raise_not_implemented` — each verb on each subclass
+- `test_adapter_schema_mismatch_error_format` — message string includes name + observed + expected keys
+
+ New tests under `test/stdlib/components/test_adapter_based_component.py`:
+- `test_adapter_based_component_is_intrinsic` — alias and original are the same class
+- `test_both_import_paths_work` — old and new module imports succeed
+
+New tests under `test/backends/adapters/test_roles.py`:
+- `test_known_roles_is_frozenset`
+- `test_unknown_role_warns` — `Identity(name="x", adapter_type="lora", role="typo-role")` emits `UserWarning`
+- `test_known_role_does_not_warn`
+- `test_none_role_does_not_warn` — role is optional
+
+### Risks & Mitigations
+
+| Risk | Mitigation |
+|---|---|
+| Naming collision: existing `Adapter` ABC and new `Adapter` dataclass share the name | Implementer documents the chosen resolution (alias vs old-rename) in PR description; reviewers confirm before merge |
+| New types diverge subtly from granite-common / granite-formatters expectations | `IOContract.build_prompt` delegates rather than re-implements; tests assert delegation, not re-implementation |
+| `AdapterSchemaMismatchError` swallowed somewhere upstream and turned back into silent False | Exception attributes are deliberate; 1.C tests will assert it propagates through `requirement_check_to_bool` |
+| `AdapterBasedComponent` placeholder name leaks into user-facing prose / docs prematurely | Module docstring explicitly tags it as a placeholder; prose rewrites are out-of-scope here |
+
+### Breaking Changes
+
+None at the public-API level. Internal contributors who import `Adapter` from `mellea.backends.adapters.adapter` (the old ABC) may need to update if the implementer chooses to move that ABC to a private name — flagged in PR description.
+
+### Impinging Issues / PRs
+
+- #1018 — granite-switch on HF backend; blocked behind the new types becoming the canonical extension point
+- #1080 (this proposal) — closes once issues filed
+- #1111 — already filed (output-schema versioning); the `AdapterSchemaMismatchError` introduced here is the surface that #1111's versioning will eventually wrap
+
+### References
+
+- PR #1080 design proposal §4 (rough end result), §9 (end-state design detail), §9.2 (weights binding verbs per reality), Part I §5 Q5 (placeholder rationale)
+- Jake req 4 (helpers raise on contract mismatch) — see also #1111
+
+---
+
+## 0.2 — Pin catalogue entries to HF revision SHAs
+
+**Parent:** #929 · **Blocks:** 2.2 (revision-aware `prepare`) · **Phase:** 0
+
+### Problem
+
+The intrinsic catalogue (`mellea/backends/adapters/catalog.py`) does not record which *revision* of an upstream HF repository it expects. When upstream pushes new weights, every Mellea install silently picks them up — and if those weights have a different output schema, the helper that depends on the old schema breaks silently.
+
+PR #1008 is the worked example: `requirement-check` output changed from `{"requirement_likelihood": 0.9}` to `{"requirement_check": {"score": 0.9}}` upstream. `requirement_check_to_bool` returned `False` for every call until someone noticed.
+
+ Verified state on main (2026-05-21): `IntriniscsCatalogEntry` has fields `name`, `internal_name`, `repo_id`, `adapter_types`, and no `revision`. There are 14 catalogue entries across `_RAG_REPO`, `_CORE_REPO`, `_CORE_R1_REPO`, `_GUARDIAN_REPO`. Note: the misspelt type name `IntriniscsCatalogEntry` (transposed `s` and `i`) is an established name to preserve; do not "fix" it as part of this issue.
+
+### Agreed design
+
+Each catalogue entry gains a `revision` field pinned to a specific 40-character HF commit SHA. Mellea pins to that SHA when auto-loading the adapter. Callers can opt into tracking-latest by passing `revision="main"` explicitly, accepting the behavioural-drift risk.
+
+This is Jake req 5.
+
+ **Catalogue deduplication (thread 6):** The catalogue currently carries two entries for the same adapter: `requirement_check` (underscore) and `requirement-check` (hyphen). The design resolves thread 6 by making the catalogue an optional resolver, keeping one canonical entry. Collapse to a single `requirement_check` entry with `role="requirement-check"` set on the `Identity`; the role-based lookup introduced in 1.A routes correctly regardless of the key used at construction time. This removes the redundant entry from the catalogue before 1.A adds the role-based lookup.
+
+### Scope
+
+- `mellea/backends/adapters/catalog.py` — add `revision` field to `IntriniscsCatalogEntry`, populate every entry with the current upstream HF SHA at the time of this PR
+- Validation: `revision` must be a 40-character lowercase hex string OR the literal `"main"`
+- Validation function lives next to the catalogue type
+- Collapse the `requirement_check` / `requirement-check` duplicate catalogue entries into one canonical `requirement_check` entry (with `role="requirement-check"`)
+- Update any catalogue-construction examples in `docs/examples/` and `test/` to include the new field and remove the duplicate entry reference
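+
+ The validation rule is small enough to sketch directly. `validate_revision` is a hypothetical name, and rejecting `None` is just one of the two options this issue leaves to the implementer.
+
+ ```python
+ import re
+
+ # 40 lowercase hex characters: a full commit SHA.
+ _SHA_RE = re.compile(r"[0-9a-f]{40}")
+
+
+ def validate_revision(revision):
+     """Return `revision` if it is a full lowercase SHA or the literal "main".
+
+     Raises ValueError otherwise. Rejecting None is this sketch's choice; the
+     issue leaves None handling to the implementer.
+     """
+     if isinstance(revision, str) and (revision == "main" or _SHA_RE.fullmatch(revision)):
+         return revision
+     raise ValueError(
+         f"revision must be a 40-character lowercase hex SHA or 'main', got {revision!r}"
+     )
+ ```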
+
+### Out of scope
+
+- Any change to `prepare()` behaviour (issue 2.2 implements `LocalFileBinding.prepare` to *use* the pinned revision)
+- Refresh policies for long-running sessions (issue 2.2; see also #1111)
+- Auto-bumping the SHA when upstream pushes (manual maintenance for now)
+- "Fixing" the `IntriniscsCatalogEntry` typo (preserve as-is)
+
+### Acceptance criteria
+
+ - [ ] Every catalogue entry (14 on main before the deduplication in this issue) has a `revision` field with a 40-char hex SHA matching the current upstream
+- [ ] Revision validation rejects malformed values (too short, non-hex, etc.) with a clear error
+- [ ] `"main"` is accepted as the explicit opt-in for tracking-latest
+- [ ] Tests cover: valid SHA accepted, malformed SHA rejected, `"main"` accepted, `None` handling documented (implementer's choice — accept-as-main or reject-with-error — must be tested explicitly)
+- [ ] `requirement_check` and `requirement-check` catalogue entries collapsed to one (`requirement_check` with `role="requirement-check"`); no duplicate key
+- [ ] Existing tests pass; helpers continue to function unchanged (the new field is metadata only at this stage)
+- [ ] `ruff format`, `ruff check`, `mypy` clean
+
+### Test plan
+
+New tests under `test/backends/adapters/test_catalog_revision.py`:
+- `test_catalog_entries_have_revision` — every entry has the field set to a valid value
+- `test_revision_validation_rejects_malformed` — short, long, non-hex
+- `test_revision_validation_accepts_main_literal`
+- `test_revision_round_trip` — construct entry, retrieve, assert preserved
+- `test_no_duplicate_requirement_check_entry` — catalogue has exactly one entry matching the requirement_check/requirement-check family
+
+### Risks & Mitigations
+
+| Risk | Mitigation |
+|---|---|
+| SHA pinned at PR-write time is already stale by merge time | Reviewer re-fetches upstream HEAD just before merge; documented in PR description |
+| Implementer pins to a revision whose schema is ALREADY broken vs current Mellea code | Validation is mechanical; behavioural correctness verified by existing helper tests passing post-merge |
+| Field added but no enforcement until 2.2 ships | Documented as "metadata-only at this stage" in module docstring; 2.2 dependency relationship called out |
+| Pydantic validator perf hit if called on every catalogue access | Catalogue is constructed once at import; validator runs at construction, not access |
+
+### Breaking Changes
+
+None for end users. Anyone constructing `IntriniscsCatalogEntry` directly (downstream forks, tests outside Mellea) must add the new field or accept the implementer's `None` handling.
+
+### Impinging Issues / PRs
+
+- PR #1008 — the schema flip that motivated this; reference in PR description
+- #1111 — versioning for output schemas (deferred); this issue is the upstream half (pin the input weights), #1111 is the downstream half (version the output contract)
+
+### References
+
+- PR #1080 design proposal §17 Q6 (version pinning for auto-loaded adapters), §6 risk discussion
+- PR #1008 — worked example of silent schema drift
+- Jake req 5
+
+---
+
+# Phase 1 — callers move
+
+## 1.A — Internal migration with shims (Phase 1 foundation)
+
+**Parent:** #929 · **Depends on:** 0.1 · **Blocks:** 1.B, 1.C, 1.D, 2.1, 4.1 · **Phase:** 1
+
+### Problem
+
+`mellea/stdlib/components/intrinsic/_util.py:call_intrinsic` and the requirement-rerouting code in `mellea/stdlib/requirements/requirement.py` both branch on the old `IntrinsicAdapter` / `EmbeddedIntrinsicAdapter` subclasses. Until these internal callers operate on the new `Adapter` type from 0.1, no helper can migrate.
+
+External users may also be constructing the old classes directly (e.g. for custom intrinsics). Migrating internal callers without a backward-compat path would break them.
+
+### Agreed design
+
+This issue does two tightly-coupled things in one PR — splitting them creates ordering pain and conflicting branch state. Combined, they form a single coherent change: "internal code now operates on `Adapter`; old constructors keep working via subclass shims."
+
+**(a) Old classes become inheriting shims.** `IntrinsicAdapter`, `EmbeddedIntrinsicAdapter`, and `CustomIntrinsicAdapter` are restructured so they:
+
+- **Inherit from `Adapter`** (the new dataclass from 0.1) — `isinstance(x, IntrinsicAdapter)` continues to work, and any `isinstance(x, Adapter)` check is also satisfied.
+- Translate constructor arguments into the equivalent `Identity` + `IOContract` + `WeightsBinding` triple, then call `Adapter.__init__` with that triple.
+ - Emit a `DeprecationWarning` on every construction, with `stacklevel=2` so that it is attributed to the caller's construction site; the message points at the new construction pattern.
+- Carry no behavioural state of their own — every method delegates to the inherited `Adapter` machinery.
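+
+ A rough sketch of what such a shim could look like. The constructor arguments of the real `IntrinsicAdapter` are not reproduced here: `intrinsic_name` and `adapter_type` are hypothetical parameters, and `Identity` / `Adapter` are simplified stand-ins for the 0.1 types.
+
+ ```python
+ import warnings
+ from dataclasses import dataclass
+ from typing import Any, Optional
+
+
+ @dataclass
+ class Identity:
+     name: str
+     adapter_type: str
+     role: Optional[str] = None
+
+
+ @dataclass
+ class Adapter:
+     identity: Identity
+     io_contract: Any = None  # IOContract in the real design
+     weights: Any = None  # WeightsBinding in the real design
+
+
+ class IntrinsicAdapter(Adapter):
+     """Deprecated shim: old constructor surface, new internals."""
+
+     def __init__(self, intrinsic_name: str, adapter_type: str = "lora") -> None:
+         warnings.warn(
+             "IntrinsicAdapter is deprecated; construct "
+             "Adapter(identity=..., io_contract=..., weights=...) instead.",
+             DeprecationWarning,
+             stacklevel=2,  # attribute the warning to the caller's line
+         )
+         # Translate the old arguments into the new parts, then defer to Adapter.
+         super().__init__(identity=Identity(intrinsic_name, adapter_type))
+ ```
+
+ Because the shim is a plain subclass with no state of its own, both `isinstance` checks hold without any registration tricks.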
+
+**(b) Internal callers operate on `Adapter`.** `_util.call_intrinsic` and requirement rerouting are rewritten:
+
+```python
+adapter = backend.resolve_adapter(name)
+with backend.adapter_scope(adapter):
+     raw = backend.generate(adapter.io_contract.build_prompt(...))
+return adapter.io_contract.parse(raw)
+```
+
+`adapter_scope` wraps `activate()` / `deactivate()` per call. `prepare()` happens at session start (issue 2.2); `release()` at session teardown.
+
+Role-based lookup for requirement rerouting uses `Identity.role` instead of `isinstance` branching on subclass.
+
+Backend-side: `resolve_adapter` and `adapter_scope` are new methods on the abstract backend. Real implementations come in Phase 2; for now, existing backends grow stub implementations that delegate to the old code paths via a temporary internal shim. This stub is intentional — it lets internal callers migrate while Phase 2 fills in the backend verbs.
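+
+ For orientation, the eventual shape of `adapter_scope` might resemble the context manager below. This is the Phase 2 end state in miniature, not the 1.A stub (which stays a pass-through); `RecordingBinding` is a hypothetical test double, and the real method lives on the backend and takes an `Adapter`.
+
+ ```python
+ from contextlib import contextmanager
+
+
+ class RecordingBinding:
+     """Hypothetical binding that records which verbs were called."""
+
+     def __init__(self):
+         self.calls = []
+
+     def activate(self, ctx):
+         self.calls.append("activate")
+
+     def deactivate(self, ctx):
+         self.calls.append("deactivate")
+
+
+ @contextmanager
+ def adapter_scope(binding, ctx=None):
+     """Activate for one generation; always deactivate, even if generation fails."""
+     binding.activate(ctx)
+     try:
+         yield binding
+     finally:
+         binding.deactivate(ctx)
+ ```
+
+ Keeping activation and deactivation in one scope is also what gives Phase 2 a single place to attach the `intrinsic.call` span.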
+
+### Why this is one issue / one PR
+
+- Splitting (a) and (b) creates an ordering question with no clean answer.
+- Combined, the change is ~300–500 LOC across `_util.py`, `requirement.py`, three old class definitions, and the abstract backend stub. Reviewable as a single coherent PR.
+- Once merged, every Phase 1 per-file PR becomes a small, independent change touching one helper file each.
+
+### Scope
+
+- `mellea/backends/adapters/__init__.py` — old classes restructured as shims inheriting from `Adapter`
+- `mellea/stdlib/components/intrinsic/_util.py` — `call_intrinsic` rewritten
+- `mellea/stdlib/requirements/requirement.py` — rerouting rewritten
+- Abstract backend (and concrete backend stubs): `resolve_adapter`, `adapter_scope` methods added; backed by temporary internal delegation to old code paths. `adapter_scope` is **the future telemetry parent** — Phase 2 will wrap it in an `intrinsic.call` OTel span; stubs here do not add instrumentation yet, but the hook point must exist at the right call-site boundary so Phase 2 can instrument in one place.
+- `docs/dev/requirement_aLoRA_rerouting.md` — update to describe role-based lookup (using `Identity.role`) instead of the previous hardcoded `requirement-check` string; this is the direct resolution of thread 7 in the design proposal's thread mapping
+
+### Out of scope
+
+- Any helper file migration (1.B–D)
+- Backend verb implementations (2.2, 2.3)
+- `AdapterMixin` rename/narrow (2.1)
+- Final shim removal (4.1)
+
+### Acceptance criteria
+
+- [ ] `IntrinsicAdapter(...)` returns a subclass instance that satisfies both `isinstance(x, IntrinsicAdapter)` and `isinstance(x, Adapter)`
+- [ ] Same for `EmbeddedIntrinsicAdapter` and `CustomIntrinsicAdapter`
+- [ ] Each old constructor emits exactly one `DeprecationWarning` per call (not per import), with `stacklevel=2`
+- [ ] `_util.call_intrinsic` operates on `Adapter`; no `isinstance` branching on old subclasses
+- [ ] Requirement rerouting uses `Identity.role`
+- [ ] `adapter_scope` exists at the correct call-site boundary (ready for Phase 2 span wrapping); implementation is a pass-through context manager at this stage
+- [ ] `docs/dev/requirement_aLoRA_rerouting.md` updated to describe role-based lookup; markdownlint passes
+- [ ] All existing tests pass — behavioural neutrality is the bar
+- [ ] Explicit test for "external user constructs old class" path still works (drop-in replaceability)
+- [ ] `ruff format`, `ruff check`, `mypy` clean
+
+### Test plan
+
+- Existing helper and backend tests pass without modification
+ - New tests under `test/backends/adapters/test_old_class_shims.py`:
+   - `test_old_classes_inherit_from_adapter`
+   - `test_old_constructor_emits_deprecation_warning`: once per call, `stacklevel=2`
+   - `test_old_constructor_drop_in_replaceable`: construct via old API, assert behaviour matches direct `Adapter(...)` construction
+- New tests for the role-based lookup with multiple registered roles
+
+### Risks & Mitigations
+
+| Risk | Mitigation |
+|---|---|
+| Shims silently change behaviour vs old classes | Behavioural-neutrality test suite (existing tests) is the gate; explicit drop-in replaceability test |
+| `DeprecationWarning` spam for users with deep call stacks | `stacklevel=2`; documentation of the migration path in the warning message |
+| `resolve_adapter` / `adapter_scope` stubs in backends drift from real Phase-2 behaviour | Stubs delegate to old code paths so behaviour is unchanged; Phase-2 PRs replace internals while keeping the surface stable |
+| External users override `_simplify_and_merge` or other internals on old classes | Document in PR that subclassing old classes is unsupported for the deprecation period; surface in changelog |
+| 300–500 LOC PR runs into review fatigue | PR description leads with "this is the only Phase-1 bottleneck"; reviewers know everything else is parallel after this |
+
+### Breaking Changes
+
+- **Subclassing the old classes** — anyone subclassing `IntrinsicAdapter` etc. and overriding internal methods may break. The public constructor signature is preserved; only internal structure changes.
+- **`isinstance` checks against the old `Adapter` ABC** — if anyone external relies on the abstract base class identity, the implementer's resolution from 0.1 (alias vs rename) determines whether this breaks.
+
+### Impinging Issues / PRs
+
+- #1018 — granite-switch on HF backend; uses the new `resolve_adapter` / `adapter_scope` surface introduced here
+- PR #972 — historic precedence bug in `_simplify_and_merge`; this issue does not touch that code path (2.1 does)
+- #1080 §17 Q3 — auto-context discovery decision; consumed by 1.D, not here
+
+### References
+
+- PR #1080 §9 (end-state design detail), §11 (why current code is tangled), §16 Phase 1 first step
+- #929 thread mapping rows 1, 2, 3
+
+---
+
+## 1.B — RAG helpers (`rag.py` whole-file) per-file migration
+
+**Parent:** #929 · **Depends on:** 1.A · **Phase:** 1
+
+### Problem
+
+`mellea/stdlib/components/intrinsic/rag.py` contains six helpers in one file: `check_answerability`, `rewrite_question`, `clarify_query`, `find_citations`, `check_context_relevance`, `flag_hallucinated_content`. All currently construct via `IntrinsicAdapter(...)` (now a deprecation shim after 1.A). Signatures do not consistently accept `model_options=`. None has output validation — schema drift from upstream `_RAG_REPO` weights would silently change behaviour.
+
+Per-helper PRs would all touch the same file and serialise behind one another. One PR migrates the whole file.
+
+### Agreed design
+
+**Follows the per-helper-file migration template** (top of this document). File-specific deltas:
+
+- **Six helpers, six declared output contracts** — each helper declares its own contract; implementer confirms each against current weights before writing the contract.
+- **Forward-compat tolerance applies per helper** — extra optional fields in upstream output do NOT raise.
+- **No `documents=` parameter on any of these** — that's specific to factuality (1.D).
+
+### Scope
+
+- `mellea/stdlib/components/intrinsic/rag.py`
+- Tests under `test/stdlib/components/intrinsic/test_rag.py` (or split per helper if that file exists)
+
+### Out of scope
+
+- Other helper files (1.C, 1.D)
+- Backend changes (Phase 2)
+- Removing the old `IntrinsicAdapter` shim (4.1)
+- "Fixing" or refactoring helper signatures beyond the additive `model_options=` (preserve existing positional args)
+
+### Acceptance criteria
+
+See common acceptance criteria. Plus, for each of the six helpers:
+
+- [ ] Constructs via `Adapter(identity=..., io_contract=..., weights=...)`
+- [ ] Accepts `model_options: dict | None = None`
+- [ ] Has a declared output contract documented in its docstring
+- [ ] Forward-compat: synthetic output with an extra optional field does NOT raise
+- [ ] Contract-break: synthetic output missing a required field raises `AdapterSchemaMismatchError`
+
+### Test plan
+
+See common test plan, applied per helper. Each helper gets its own pair of contract tests (pass and fail).
+
+### Risks & Mitigations
+
+| Risk | Mitigation |
+|---|---|
+| Implementer guesses an output contract that doesn't match current weights | Each helper's contract is verified against current `_RAG_REPO` weights before writing the test; PR description documents the verification |
+| Six contract changes in one PR creates a large diff | Diff is mechanical (one pattern repeated six times); reviewer can spot-check 1-2 helpers and trust the rest |
+| `flag_hallucinated_content` returns a boolean-like that downstream callers coerce informally | Document expected output type explicitly in helper docstring; coercion responsibility stays with caller |
+
+### Breaking Changes
+
+None at signature level (additive `model_options=`). Behaviour is otherwise neutral, with one exception: a schema mismatch now raises `AdapterSchemaMismatchError`, an error that previously did not exist. Callers that swallowed `KeyError` will now see `AdapterSchemaMismatchError` instead; surface this in the changelog.
+
+### Impinging Issues / PRs
+
+- PR #1080 §13 (what users see) — design surface for these helpers
+- Any open issue against a specific RAG helper (implementer should run `gh issue list --search "rag" --state open` before starting and reference any active threads)
+
+### References
+
+- PR #1080 §13, Jake req 4
+- Per-helper-file migration template (top of this document)
+
+---
+
+## 1.C — `requirement_check` per-helper migration
+
+**Parent:** #929 · **Depends on:** 1.A · **Phase:** 1
+
+### Problem
+
+`requirement_check` and `requirement_check_to_bool` (in `mellea/stdlib/requirements/requirement.py`) currently route through `IntrinsicAdapter(...)`. PR #1008 is the canonical evidence that schema drift here is real and previously silent — the output schema flipped from `{"requirement_likelihood": 0.9}` to `{"requirement_check": {"score": 0.9}}`, and `requirement_check_to_bool` returned `False` for every call until someone noticed.
+
+Today (verified on main 2026-05-21): `requirement_check_to_bool` uses the post-#1008 schema `req_dict["requirement_check"]["score"]`. On schema mismatch it logs a warning and returns `False`, so the failure is effectively silent to callers.
+
+### Agreed design
+
+**Follows the per-helper-file migration template.** File-specific deltas:
+
+- **Output contract:** the helper's declared output is the post-#1008 shape `{"requirement_check": {"score": float}}`. Implementer confirms against current weights.
+- **Schema-mismatch is loud here in particular** — this helper is the design's worked example. If a future upstream change breaks the contract, callers must see `AdapterSchemaMismatchError` immediately rather than silently-wrong booleans.
+- **`requirement_check_to_bool` wrapper** stops returning `False` on parse failure — it now propagates `AdapterSchemaMismatchError`. This IS a behavioural change vs main; flagged below as a breaking change.
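+
+A minimal sketch of the loud-failure behaviour described above, assuming the parsed output arrives as a plain `dict`. The function names, the `threshold` parameter, and its `0.5` default are illustrative assumptions (the proposal does not specify a threshold), and `AdapterSchemaMismatchError` is a local stand-in.
+
+```python
+from typing import Any
+
+
+class AdapterSchemaMismatchError(Exception):
+    """Local stand-in for the error type named in this proposal."""
+
+
+EXPECTED_SHAPE = '{"requirement_check": {"score": float}}'
+
+
+def parse_requirement_score(output: dict[str, Any]) -> float:
+    """Validate the post-#1008 shape and return the score."""
+    inner = output.get("requirement_check")
+    if not isinstance(inner, dict) or "score" not in inner:
+        raise AdapterSchemaMismatchError(
+            f"expected {EXPECTED_SHAPE}, got keys {sorted(output)}"
+        )
+    score = inner["score"]
+    # bool is a subclass of int, so reject it explicitly.
+    if isinstance(score, bool) or not isinstance(score, (int, float)):
+        raise AdapterSchemaMismatchError(f"score is not numeric: {score!r}")
+    return float(score)
+
+
+def requirement_check_to_bool_sketch(
+    output: dict[str, Any], threshold: float = 0.5
+) -> bool:
+    """Propagate schema errors instead of coercing them to False."""
+    return parse_requirement_score(output) >= threshold
+```
+
+With this shape, the pre-#1008 output `{"requirement_likelihood": 0.9}` raises instead of yielding `False`.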
+
+### Scope
+
+- `mellea/stdlib/components/intrinsic/` — wherever `requirement_check` lives
+- `mellea/stdlib/requirements/requirement.py` — `requirement_check_to_bool`
+- Tests covering both functions
+
+### Out of scope
+
+- Other helpers (1.B, 1.D)
+- Auto-bumping catalogue revision when upstream pushes (project-wide, not here)
+- Catalogue deduplication — the `requirement_check` / `requirement-check` duplicate entries are collapsed in issue 0.2; by the time this issue is worked, only the `requirement_check` entry (with `role="requirement-check"`) exists in the catalogue. This issue resolves whichever canonical entry the role-based lookup (from 1.A) surfaces.
+
+### Acceptance criteria
+
+See common acceptance criteria. Plus:
+
+- [ ] `requirement_check(...)` returns the expected shape on happy path
+- [ ] `requirement_check_to_bool(...)` returns `bool` on happy path
+- [ ] `requirement_check_to_bool(...)` propagates `AdapterSchemaMismatchError` on schema-break (NOT silent `False`)
+- [ ] Synthetic output `{"requirement_check": {"score": 0.9}}` parses correctly
+- [ ] Synthetic output `{"requirement_likelihood": 0.9}` (the pre-#1008 shape) raises `AdapterSchemaMismatchError`
+
+### Test plan
+
+See common test plan. Plus a regression test specifically named after #1008 demonstrating that the pre-#1008 shape now raises rather than silently coerces to `False`.
+
+### Risks & Mitigations
+
+| Risk | Mitigation |
+|---|---|
+| Callers depending on the silent-`False` behaviour break | Flagged as breaking change in changelog; behaviour change documented in PR description; prior silent failure was a bug, not a feature |
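+| If 1.C is worked before 0.2 lands, both `requirement_check` and `requirement-check` catalogue entries still exist (3.2/3.3 compat) | Deduplication is owned by 0.2 and out of scope here; the helper resolves whichever canonical entry the role-based lookup (from 1.A) surfaces |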
+| `requirement_check_to_bool` is on a hot path (every requirement check) | Validation is one extra dict access on the parsed output; negligible perf cost |
+
+### Breaking Changes
+
+- **`requirement_check_to_bool` no longer returns silent `False` on parse failure.** It now raises `AdapterSchemaMismatchError`. Any caller that relied on a `False` return to mean "requirement not met" when parsing failed must now catch the exception explicitly. Surface in changelog.
+
+### Impinging Issues / PRs
+
+- PR #1008 — the original schema flip; reference in PR description
+- **PR #1078** — open PR adding canned I/O test data for `requirement-check` and `uncertainty` formatters (under `test/formatters/granite/`). Merge before starting 1.C; this PR's test data is the foundation the 1.C tests build on.
+- 0.2 — catalogue revision pinning; once 0.2 ships, the upstream half of "no silent drift" is in place; 1.C is the downstream half
+
+### References
+
+- PR #1080 §14.1 #2 (silent schema drift worked example)
+- PR #1008 — the original schema flip
+- Jake req 4
+
+---
+
+## 1.D — `guardian.py` whole-file migration (Guardian + factuality + auto-context)
+
+**Parent:** #929 · **Depends on:** 1.A · **Phase:** 1
+
+### Problem
+
+`mellea/stdlib/components/intrinsic/guardian.py` (verified on main, 221 lines) contains FOUR helpers in one file:
+
+- `policy_guardrails`
+- `guardian_check` (Guardian family core)
+- `factuality_detection`
+- `factuality_correction`
+
+All currently construct via the old API. Splitting into separate PRs (Guardian vs factuality) would force same-file conflicts. One PR migrates the whole file.
+
+Additional file-specific concerns:
+
+- Factuality helpers historically take `documents` as a positional argument — inconsistent with kw-only patterns elsewhere.
+- Factuality return type changed from `float` to `str` per #1003 (closed PR #1028 inherited the direction).
+- There is no auto-discovery today: callers must pass `documents` explicitly even when those documents are already in conversation context.
+
+### Agreed design
+
+**Follows the per-helper-file migration template.** File-specific deltas — beyond the template, this PR also does:
+
+**(i) Family-level migration of all four helpers in `guardian.py`.** Each helper has its own declared output contract; validate per-helper.
+
+**(ii) `documents=` keyword-only on factuality helpers.** `factuality_detection` and `factuality_correction` accept `documents: list[str] | None = None` as keyword-only. Default `None` triggers auto-discovery (see iv). Other Guardian helpers (`policy_guardrails`, `guardian_check`) do NOT take `documents=`.
+
+**(iii) Factuality return type `str`.** Per #1003 / closed PR #1028, both factuality functions return `str` (was `float`). Update tests accordingly.
+
+**(iv) Auto-context document discovery (factuality only).** When `documents=None`, the helper auto-discovers user-supplied documents from the conversation context. **Mechanism:** documents flow through ordinary conversation context (e.g. as part of a `Message`'s content). The helper reads whatever documents are present in the context it receives. The caller is responsible for *populating* that context — via explicit `documents=`, prior `Message`s, retrieval, etc. **No `_docs`-specific extraction path.** This adopts the direction of PR #1028 but not its specific code path; per intrinsics-team guidance recorded in PR #1080 §17 Q3 (2026-05-20).
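+
+The keyword-only rule in (ii) and the fallback in (iv) could look roughly like this. The sketch deliberately does not model how documents are read out of the conversation context: `context_documents` is a hypothetical stand-in for "whatever documents the helper finds in the context it receives", and `select_documents` is an illustrative name. Treating an explicit empty list as "skip discovery" is also an assumption.
+
+```python
+def select_documents(
+    context_documents: list[str],
+    *,
+    documents: list[str] | None = None,
+) -> list[str]:
+    """Explicit documents win; None falls back to the context's documents."""
+    if documents is not None:
+        # Explicit pass-through: auto-discovery is skipped entirely.
+        return list(documents)
+    # documents=None: use whatever the conversation context already carries.
+    return list(context_documents)
+```
+
+Because `documents` sits after the bare `*`, a call such as `select_documents(ctx_docs, ["doc"])` fails with `TypeError`, which is the behaviour the `kwonly` acceptance test checks.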
+
+### Why one PR
+
+These four helpers all live in `guardian.py`. Three theoretical PRs (Guardian-core / factuality-pair / auto-context) would conflict on the same file. Combined, the work is one PR of roughly 300–400 LOC, reviewable as one coherent change anchored on "everything in `guardian.py` is on the new types and consistent."
+
+### Scope
+
+- `mellea/stdlib/components/intrinsic/guardian.py`
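+- Tests under `test/stdlib/components/intrinsic/test_guardian.py` (or split per helper if that structure already exists)
+- `docs/examples/sessions/creating_a_new_type_of_session.py` (example migration off deprecated `GuardianCheck`, closing #1094)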
+
+### Out of scope
+
+- Other helper files (1.B, 1.C)
+- Reviving #1028's `_docs` scanning code (explicitly shelved)
+- Modifying how callers populate context (caller's responsibility, by design)
+- Adding new Guardian capabilities
+
+### Acceptance criteria
+
+See common acceptance criteria. Plus, per helper:
+
+**`policy_guardrails`, `guardian_check`:**
+- [ ] Construct via new types
+- [ ] Accept `model_options: dict | None = None`
+- [ ] Each has a declared output contract; contract-break raises, forward-compat does not
+- [ ] No `documents=` parameter
+
+**`factuality_detection`, `factuality_correction`:**
+- [ ] Construct via new types
+- [ ] Accept `model_options: dict | None = None`
+- [ ] `documents: list[str] | None = None` is keyword-only; positional second arg raises `TypeError`
+- [ ] Return type is `str` (per #1003)
+- [ ] `documents=[...]` works (explicit pass-through, auto-discovery skipped)
+- [ ] `documents=None` with documents in conversation context works (auto-discovery picks them up)
+- [ ] No `_docs`-specific code path exists in this file
+- [ ] Behaviour when no documents are anywhere (`documents=None` and no context documents) is documented and tested explicitly — implementer's choice between sentinel return and explicit error
+
+### Test plan
+
+See common test plan, applied per helper. Plus:
+
+- `test_factuality_detection_explicit_documents` — pass-through
+- `test_factuality_detection_auto_discovery_from_context` — picks up documents from prior Messages
+- `test_factuality_detection_no_documents_anywhere` — implementer-chosen behaviour, documented in test name
+- `test_factuality_documents_kwonly` — positional second arg raises `TypeError`
+- `test_factuality_return_type_is_str`
+
+### Risks & Mitigations
+
+| Risk | Mitigation |
+|---|---|
+| Auto-context discovery picks up documents the user did not intend as factuality sources | Helper docstring documents the contract: "any document present in the context is a candidate"; caller-side responsibility framing |
+| `documents=` becoming kw-only is a breaking change for callers using positional form | Surface in changelog; deprecation period not feasible (kw-only is mechanical, can't dual-support cleanly without runtime introspection); call out as a breaking change with migration example |
+| Return-type change `float → str` is already in place from #1003; this PR formalises the contract | Test specifically asserts `str`; fail-loud if a future change tries to revert |
+| Four helpers in one PR creates a wide diff | Reviewer can spot-check Guardian helpers and factuality helpers as two coherent sections |
+| Auto-discovery interacts poorly with #1080's broader context model | Document the mechanism explicitly; if the broader context model changes, this helper updates with it |
+
+### Breaking Changes
+
+- **`factuality_detection` / `factuality_correction` `documents=` becomes keyword-only.** Positional callers must update.
+- **Return type already changed in main per #1003** — this PR codifies it as a contract; no further user impact.
+
+### Impinging Issues / PRs
+
+- PR #1080 §17 Q3 (resolved 2026-05-20) — auto-context decision
+- PR #1028 (closed) — direction inherited, mechanism shelved
+- #1003 — original signature/return-type scope
+- intrinsics-team guidance recorded in §17 Q3
+- **#1094** — "Migrate creating_a_new_type_of_session.py example off deprecated GuardianCheck" — close with this issue; include the example migration in this PR's scope (add `docs/examples/sessions/creating_a_new_type_of_session.py` to scope)
+- **#1071** — "feat: intrinsic-backed Requirement subclass for Guardian safety validation" — DO NOT start before this issue merges; once merged, the new `Adapter` types from Phase 0/1 are the right foundation for that feature
+- **PR #935** — open docs PR touching `docs/docs/advanced/intrinsics.md` and guardian examples; coordinate or merge before starting this issue to avoid doc conflicts
+
+### References
+
+- PR #1080 §13 (what users see), §17 Q3
+- Per-helper-file migration template (top of this document)
+
+---
+
+# Phase 2 — backends move
+
+## 2.1 — `AdapterMixin` verb rename/narrow + `resolve_model_options` centralisation
+
+**Parent:** #929 · **Depends on:** 1.A · **Blocks:** 2.2, 2.3 · **Phase:** 2
+
+### Problem
+
+Two related backend-surface concerns ship together:
+
+**(a) Mixin verb mismatch.** Today's `AdapterMixin` (verified on main, `mellea/backends/adapters/adapter.py:240`) exposes five verbs: `base_model_name`, `add_adapter`, `load_adapter`, `unload_adapter`, `list_adapters`. The proposed verbs in PR #1080 §13 are different in shape and naming: `load_peft_adapter`, `unload_peft_adapter`, `render_controls`, `set_request_adapter`. This is a **rename + introduce + narrow**, not a pure narrow. The existing verbs do not map 1:1 onto the proposed verbs.
+
+**(b) Per-call option merging.** Each backend calls `_simplify_and_merge` on every adapter call to combine model options from various sources (caller, helper defaults, backend defaults). This duplicates logic across backends and is a known source of precedence bugs (PR #972 fixed one).
+
+Both concerns touch the same backend mixin surface and the same set of backend implementations (`LocalHFBackend`, `OpenAIBackend`). Combining keeps the backend-surface change as one coherent PR.
+
+### Agreed design
+
+**(a) Mixin verb rename/narrow.** `AdapterMixin` ends up with the proposed verb set:
+
+```
+AdapterMixin (post-2.1):
+ load_peft_adapter / unload_peft_adapter # for LocalFile reality
+ render_controls # for Embedded reality
+ set_request_adapter # for ServerMediated reality
+```
+
+Implementer maps existing implementations onto the new verbs (`load_adapter` → `load_peft_adapter` is the obvious starting point; map case-by-case). `base_model_name` and `list_adapters` are removed if unused after 1.A; otherwise relocated or kept with documented justification.
+
+**Bindings (Phase 2.2/2.3) call into these from their `prepare`/`activate`/etc. implementations.**
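+
+One way to express the four-verb surface, and to guard it against accidental re-additions, is an abstract base class plus a small introspection check. This is a sketch only: `AdapterMixinSketch`, `public_verbs`, and every parameter list are placeholders, since the proposal fixes the verb names but not their signatures.
+
+```python
+import abc
+import inspect
+
+
+class AdapterMixinSketch(abc.ABC):
+    """Hypothetical post-2.1 mixin; parameter lists are placeholders."""
+
+    @abc.abstractmethod
+    def load_peft_adapter(self, name: str) -> None: ...
+
+    @abc.abstractmethod
+    def unload_peft_adapter(self, name: str) -> None: ...
+
+    @abc.abstractmethod
+    def render_controls(self, name: str) -> None: ...
+
+    @abc.abstractmethod
+    def set_request_adapter(self, name: str) -> None: ...
+
+
+EXPECTED_VERBS = frozenset(
+    {
+        "load_peft_adapter",
+        "unload_peft_adapter",
+        "render_controls",
+        "set_request_adapter",
+    }
+)
+
+
+def public_verbs(cls: type) -> frozenset[str]:
+    """Return the names of all public functions defined on or inherited by cls."""
+    return frozenset(
+        name
+        for name, _ in inspect.getmembers(cls, inspect.isfunction)
+        if not name.startswith("_")
+    )
+```
+
+A test can then assert `public_verbs(AdapterMixinSketch) == EXPECTED_VERBS`. Because the verbs are abstract, a backend that misses one fails at instantiation with `TypeError` rather than at first use.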
+
+**(b) Option resolution.** New utility `mellea/backends/_options.py:resolve_model_options` (or similar) documents and implements the precedence: explicit caller-passed `model_options=` > helper defaults > backend defaults. Each adapter-supporting backend's adapter call path replaces `_simplify_and_merge` with `resolve_model_options`.
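+
+A minimal sketch of the precedence rule, assuming all three layers are flat mappings. The keyword names (`caller`, `helper_defaults`, `backend_defaults`) are illustrative; the real signature is left to the implementer.
+
+```python
+from collections.abc import Mapping
+from typing import Any
+
+
+def resolve_model_options(
+    *,
+    caller: Mapping[str, Any] | None = None,
+    helper_defaults: Mapping[str, Any] | None = None,
+    backend_defaults: Mapping[str, Any] | None = None,
+) -> dict[str, Any]:
+    """Merge option layers so that caller > helper defaults > backend defaults."""
+    resolved: dict[str, Any] = {}
+    # Apply the lowest-precedence layer first; later layers overwrite earlier ones.
+    for layer in (backend_defaults, helper_defaults, caller):
+        if layer:
+            resolved.update(layer)
+    return resolved
+```
+
+Keeping the merge order in a single loop means the precedence is stated once, which is the property the per-backend `_simplify_and_merge` calls lacked.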
+
+### Scope
+
+- `mellea/backends/adapters/adapter.py` — `AdapterMixin` verb rename + narrow
+- New utility: `mellea/backends/_options.py:resolve_model_options`
+- Backends affected: `LocalHFBackend`, `OpenAIBackend` (the two adapter-supporting backends per PR #1080 §10) — verb implementations renamed/narrowed; `_simplify_and_merge` calls replaced
+- Backends NOT affected for adapters: `OllamaBackend`, `WatsonxBackend`, `LiteLLMBackend` (no adapter support)
+- **`IntrinsicMetricsPlugin`** — new plugin at `mellea/core/plugins/intrinsic_metrics.py` alongside the existing `TokenMetricsPlugin` / `LatencyMetricsPlugin` / error plugins. Registers three OTel metrics:
+ - `mellea.intrinsic.invocations` — counter; labels: `name`, `revision`, `binding_type`, `adapter_type`, `outcome` (`success` / `schema_error` / `error`)
+ - `mellea.intrinsic.phase_duration_ms` — histogram; labels: `name`, `phase` (`prepare` / `activate` / `generate` / `parse` / `deactivate`)
+ - `mellea.intrinsic.parse_failures` — counter; labels: `name`, `revision`. This is the **schema-drift detector**: a climbing counter against a specific `(name, revision)` pair means a breaking schema change was pushed upstream without revision-pinning catching it. Each increment corresponds to an `AdapterSchemaMismatchError` at the call site. Auto-wired via the existing `TokenMetricsPlugin` registration pattern.
+- New dev doc: `docs/dev/adapter_observability.md` — documents the span tree, metric labels, `parse_failures` schema-drift detector pattern, and `MELLEA_TRACE_CONTENT` content-capture gate. Phase 2.2/2.3 will add span emission details; this issue writes the structure document.
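+
+To illustrate how `parse_failures` acts as a schema-drift detector, here is an in-memory stand-in for two of the three instruments. It does not use the OpenTelemetry API; `InMemoryIntrinsicMetrics` and `record_invocation` are hypothetical names, and only the label sets and outcome values come from the list above.
+
+```python
+from collections import Counter
+
+OUTCOMES = {"success", "schema_error", "error"}
+
+
+class InMemoryIntrinsicMetrics:
+    """In-memory stand-in for the invocations and parse_failures counters."""
+
+    def __init__(self) -> None:
+        self.invocations: Counter = Counter()
+        self.parse_failures: Counter = Counter()
+
+    def record_invocation(
+        self,
+        *,
+        name: str,
+        revision: str,
+        binding_type: str,
+        adapter_type: str,
+        outcome: str,
+    ) -> None:
+        if outcome not in OUTCOMES:
+            raise ValueError(f"unknown outcome {outcome!r}")
+        self.invocations[(name, revision, binding_type, adapter_type, outcome)] += 1
+        # Same hook as the schema_error outcome: one increment per mismatch,
+        # keyed on (name, revision) so drift shows up per pinned revision.
+        if outcome == "schema_error":
+            self.parse_failures[(name, revision)] += 1
+```
+
+A rising count for one `(name, revision)` key, while other revisions of the same adapter stay flat, points at a breaking upstream change for that revision.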
+
+### Out of scope
+
+- Binding implementations (2.2, 2.3)
+- Span instrumentation (lives in 2.2/2.3 where the verbs that emit them are implemented)
+- HF backend embedded-adapter support (#1018)
+
+### Acceptance criteria
+
+- [ ] `AdapterMixin` exposes exactly the four verbs documented above; old verbs removed or relocated with justification
+- [ ] Both adapter-supporting backends implement the new verb set
+- [ ] `resolve_model_options` exists, documented, tested
+- [ ] Both adapter-supporting backends call `resolve_model_options` instead of `_simplify_and_merge` on the adapter call path
+- [ ] `IntrinsicMetricsPlugin` exists and can be registered like existing plugins
+- [ ] All three metrics are registered with correct label sets
+- [ ] `mellea.intrinsic.parse_failures` increments on each `AdapterSchemaMismatchError` (wired via the same hook as `schema_error` outcome on `invocations`)
+- [ ] `docs/dev/adapter_observability.md` written, covers span tree structure, metric labels, `parse_failures` pattern, `MELLEA_TRACE_CONTENT` gate; passes markdownlint
+- [ ] Existing tests pass (precedence behaviour unchanged; behavioural neutrality)
+- [ ] New unit tests cover precedence: caller > helper > backend
+- [ ] New test asserts `AdapterMixin` exposes exactly the four verbs (catches accidental re-additions)
+- [ ] Unit tests for `IntrinsicMetricsPlugin`: assert each metric emits with the correct label on a synthetic call; assert `parse_failures` increments on `AdapterSchemaMismatchError`
+- [ ] `ruff format`, `ruff check`, `mypy` clean
+
+### Test plan
+
+- Existing backend tests pass
+- Unit tests for `resolve_model_options` covering each precedence pair (caller-vs-helper, caller-vs-backend, helper-vs-backend)
+- New tests asserting the verb set on `AdapterMixin`
+- Verb-rename: each old verb's test is renamed/relocated to the new verb name; coverage stays intact
+- Unit tests for `IntrinsicMetricsPlugin` using a synthetic OTel exporter (not real infra): `test_invocations_counter_emits_on_success`, `test_invocations_counter_emits_schema_error_outcome`, `test_parse_failures_counter_increments`, `test_phase_duration_histogram_emits_for_each_phase`
+
+### Risks & Mitigations
+
+| Risk | Mitigation |
+|---|---|
+| Verb rename misses a subclass implementation, leaving an abstract method unimplemented at instantiation | `mypy` catches abstract-method-not-implemented; CI gate |
+| External implementations of `AdapterMixin` (downstream forks) break | Surface as breaking change in changelog with migration table (old → new verb names) |
+| `resolve_model_options` introduces a precedence regression vs `_simplify_and_merge` | Existing tests validate precedence; new explicit precedence tests as belt-and-braces |
+| Combining (a) and (b) widens the diff vs splitting | Both touch the same files (backend adapter surface); splitting would cause same-file rebase pain |
+| `base_model_name` / `list_adapters` removal breaks introspection helpers | Search before removal; relocate if any internal helper depends on them |
+
+### Breaking Changes
+
+- **`AdapterMixin` verb rename** — downstream backends extending `AdapterMixin` must update to the new verb names. Migration table in changelog.
+- **`_simplify_and_merge` removal from adapter call path** — internal API change; not part of the public surface, but flagged for any unusual downstream code.
+
+### Impinging Issues / PRs
+
+- PR #972 — historic precedence bug, evidence the centralisation matters; reference in PR description
+- #1018 — HF backend embedded adapters; will use `render_controls` post-2.1
+- PR #881 — Reality B added to OpenAI backend; uses what becomes `render_controls`
+
+### References
+
+- PR #1080 §13 (what users see — detailed), §16 Phase 2 final step
+- PR #972 — precedence bug
+- Verified state on main: `mellea/backends/adapters/adapter.py:240` (AdapterMixin)
+
+---
+
+## 2.2 — `LocalFileBinding` implements verbs (PEFT/aLoRA path)
+
+**Parent:** #929 · **Depends on:** 0.1, 0.2, 2.1 · **Blocks:** 4.1 · **Phase:** 2
+
+### Problem
+
+`LocalFileBinding` is currently a stub from issue 0.1, raising `NotImplementedError` on each verb. Reality A (today's `IntrinsicAdapter`) needs the four verbs working before the binding is usable.
+
+### Agreed design
+
+Implement the four verbs:
+
+- `prepare()` — resolves the configured HF revision (uses `revision` field from issue 0.2; defaults to the catalogue-pinned SHA if not specified). Downloads the PEFT weights via the existing HF download path. Registers with the backend via `AdapterMixin.load_peft_adapter`.
+- `activate(ctx)` — switches the adapter on in the backend (PEFT layer enabled for the next generation). Uses backend's existing PEFT activation primitives.
+- `deactivate(ctx)` — switches the adapter off. Auto-called after each generation by `adapter_scope` (set up in 1.A).
+- `release()` — removes the PEFT adapter (`AdapterMixin.unload_peft_adapter`). Called at session teardown.
+
+`prepare` is session-scoped — called once per session (or per explicit `release()`+`prepare()` cycle for refresh). `activate`/`deactivate` are call-scoped.
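+
+The split between session-scoped and call-scoped verbs can be sketched with a recording binding and a context manager. Nothing here touches PEFT or Hugging Face: `RecordingBinding` and `adapter_scope_sketch` are hypothetical, and the real `adapter_scope` signature from 1.A may differ.
+
+```python
+from contextlib import contextmanager
+from typing import Any, Iterator
+
+
+class RecordingBinding:
+    """Hypothetical binding that records verb calls instead of loading weights."""
+
+    def __init__(self) -> None:
+        self.calls: list[str] = []
+        self._prepared = False
+
+    def prepare(self) -> None:
+        # Session-scoped: the real binding would resolve the revision and load weights.
+        self.calls.append("prepare")
+        self._prepared = True
+
+    def activate(self, ctx: Any) -> None:
+        self.calls.append("activate")
+
+    def deactivate(self, ctx: Any) -> None:
+        self.calls.append("deactivate")
+
+    def release(self) -> None:
+        # Idempotent: a second call finds nothing to unload and does nothing.
+        if self._prepared:
+            self.calls.append("release")
+            self._prepared = False
+
+
+@contextmanager
+def adapter_scope_sketch(binding: RecordingBinding, ctx: Any) -> Iterator[RecordingBinding]:
+    """Call-scoped activation; deactivate runs even if generation raises."""
+    binding.activate(ctx)
+    try:
+        yield binding
+    finally:
+        binding.deactivate(ctx)
+```
+
+Putting `deactivate` in the `finally` clause is what keeps an exception during generation from leaving the adapter switched on for the next call.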
+
+### Scope
+
+- `LocalFileBinding` class — verb implementations
+- Backend integration via existing `AdapterMixin.load_peft_adapter` / `unload_peft_adapter` (renamed in 2.1)
+- `LocalFileBinding.from_catalog(name: str) -> LocalFileBinding` classmethod — convenience constructor that looks up the catalogue entry by name (post-0.2 canonical entry, with pinned revision) and returns a fully configured binding. This is the user-facing "standard path" shown in the design proposal's §13 examples: `Adapter(name="answerability", weights=LocalFileBinding.from_catalog("answerability"))`.
+- **Span instrumentation.** `adapter_scope` (from 1.A) is now wrapped in an `intrinsic.call` OTel span. `LocalFileBinding` emits child spans:
+ - `intrinsic.prepare` — attributes: `intrinsic.name`, `intrinsic.revision` (resolved SHA, not "main"), `intrinsic.binding_type="local_file"`, `intrinsic.source` (HF repo ID), download duration
+ - `intrinsic.activate` — attribute: `intrinsic.peft_name`
+ - `intrinsic.deactivate`
+ - `intrinsic.parse` is emitted by `io_contract.parse()` — attributes: `intrinsic.revision`, `intrinsic.parse_ok`, `intrinsic.raw_len`
+- **Content capture** (gated on `MELLEA_TRACE_CONTENT` env var, consistent with #1035 / PR #1036): `intrinsic.input.kwargs`, `intrinsic.output.raw`, `intrinsic.output.parsed` emitted as span events
+- Update `docs/dev/adapter_observability.md` (created in 2.1) with LocalFile-specific span attributes and the resolved-revision attribute
+- Update `docs/docs/advanced/intrinsics.md` to reflect the new `Adapter(weights=LocalFileBinding.from_catalog(...))` construction pattern alongside the existing helper-only examples; the old `IntrinsicAdapter(...)` construction is shown as deprecated with migration note
+- Update `docs/examples/intrinsics/` examples: at least one example must show the new Adapter construction; existing helper-call examples gain `model_options=` where relevant
+
+### Out of scope
+
+- `EmbeddedBinding` (2.3)
+- `ServerMediatedBinding` (3.1)
+- Long-running session refresh policy (deferred — PR #1080 §17 Q5)
+- Full rewrite of `docs/dev/intrinsics_and_adapters.md` (deferred to 4.1 when shims are gone and the final API shape is stable)
+
+### Acceptance criteria
+
+- [ ] All four verbs implemented for `LocalFileBinding`
+- [ ] `prepare()` downloads from HF using the pinned revision; explicit `revision="main"` opts into tracking-latest
+- [ ] `activate` / `deactivate` toggle the backend's PEFT layer correctly
+- [ ] `release()` cleanly unregisters from the backend; second call is a no-op
+- [ ] `LocalFileBinding.from_catalog("answerability")` returns a correctly configured binding with the catalogue's pinned revision
+- [ ] `intrinsic.call` parent span emitted for every adapter call; child spans `intrinsic.prepare`, `intrinsic.activate`, `intrinsic.deactivate`, `intrinsic.parse` emitted with required attributes
+- [ ] `intrinsic.prepare` span records the resolved HF SHA (not `"main"`) as `intrinsic.revision`
+- [ ] `MELLEA_TRACE_CONTENT=1`: content events (`intrinsic.input.kwargs`, `intrinsic.output.raw`, `intrinsic.output.parsed`) present; absent otherwise
+- [ ] `IntrinsicMetricsPlugin` (from 2.1): `invocations` counter increments; `parse_failures` increments on `AdapterSchemaMismatchError`; `phase_duration_ms` histogram records prepare and activate durations
+- [ ] `docs/docs/advanced/intrinsics.md` updated: new construction pattern present, deprecated old pattern noted with migration path
+- [ ] `docs/examples/intrinsics/` examples updated: at least one shows new construction; all examples pass `uv run pytest docs/examples/intrinsics/` (or backend-gated with correct marker)
+- [ ] `docs/dev/adapter_observability.md` updated with LocalFile-specific attributes
+- [ ] Behavioural tests for an end-to-end adapter call (prepare → activate → generate → deactivate → release) pass
+- [ ] `ruff format`, `ruff check`, `mypy` clean
+
+### Test plan
+
+Unit tests (`test/backends/adapters/test_local_file_binding.py`), with mocked HF download and mocked backend:
+- `test_prepare_uses_pinned_revision` — mocked HF download confirms the catalogue SHA is requested, not "main"
+- `test_prepare_allows_main_override` — `revision="main"` passes "main" to the download
+- `test_release_is_idempotent` — second `release()` call is a no-op
+- `test_from_catalog_returns_binding_with_correct_revision`
+- `test_activate_deactivate_call_correct_mixin_verbs`
+- Span assertion tests using a synthetic OTel exporter: `test_call_span_emitted`, `test_prepare_span_has_revision_attribute`, `test_content_events_absent_by_default`, `test_content_events_present_with_gate_set`
+- `test_metrics_invocation_counter_increments` — `IntrinsicMetricsPlugin` wired; assert counter increments on a successful call
+- `test_metrics_parse_failures_increments` — inject a synthetic schema-mismatch; assert `parse_failures` increments
+
+Integration tests (`test/backends/adapters/test_local_file_integration.py`, mark `@pytest.mark.integration`, `@pytest.mark.hf`, `@pytest.mark.slow`):
+- Full integration matrix: `LocalHFBackend × LocalFileBinding × {lora, alora} × {check_answerability, requirement_check}` — prepare → activate → generate → deactivate → release; assert expected output shape
+- Per-version parse round-trip: inject the pre-#1008 `{"requirement_likelihood": 0.9}` output shape; assert `AdapterSchemaMismatchError` (regression test for the silent-failure case)
+
+Qualitative test (optional, `@pytest.mark.qualitative`, kept out of fast loop):
+- `test_check_answerability_quality` — small canonical dataset, accuracy floor on actual adapter output
+
+### Risks & Mitigations
+
+| Risk | Mitigation |
+|---|---|
+| HF download timeout / network failure during `prepare` | Bubble the underlying `huggingface_hub` exception cleanly; document in helper docstring that `prepare` may raise on network failure |
+| `release()` called twice | Make idempotent — second call is a no-op |
+| `activate`/`deactivate` race if generation throws mid-call | `adapter_scope` (from 1.A) is a context manager — `__exit__` calls `deactivate` even on exception |
+| PEFT layer enable/disable cost on hot path | Document expected overhead in PR; not a regression vs current behaviour |
+| Pinned SHA differs from cached SHA from a previous `revision="main"` run | `huggingface_hub` stores cached snapshots per resolved commit, so a pinned SHA and an earlier `main` download do not collide; correct behaviour by default |
+
+### Breaking Changes
+
+None for end users — the binding replaces internal stubs. Behaviour matches the existing `IntrinsicAdapter` runtime path post-1.A.
+
+### Impinging Issues / PRs
+
+- 0.2 — catalogue revision pinning; this binding is the consumer
+- PR #1080 §17 Q5 — long-running session refresh policy (deferred)
+- #1018 — HF backend embedded adapters; orthogonal to this binding (Embedded uses 2.3)
+
+### References
+
+- PR #1080 §8.1 (Reality A), §9.2 (verbs per reality), §9.3 (lifecycle sequence)
+
+---
+
+## 2.3 — `EmbeddedBinding` implements verbs (Granite Switch path)
+
+**Parent:** #929 · **Depends on:** 0.1, 2.1 · **Blocks:** 4.1 · **Phase:** 2
+
+### Problem
+
+`EmbeddedBinding` is a stub from issue 0.1. Reality B (Granite Switch's embedded adapters; today's `EmbeddedIntrinsicAdapter`, used by the OpenAI backend per PR #881) needs the four verbs working.
+
+### Agreed design
+
+Implement the four verbs for Reality B (embedded — adapters are part of the base-model weights, not a separate file):
+
+- `prepare()` — verifies the base model exposes the required adapter (e.g. controls registry on Granite Switch). No file download.
+- `activate(ctx)` — calls `AdapterMixin.render_controls` to render the adapter's control prompt
+- `deactivate(ctx)` — clears the rendered controls
+- `release()` — no-op (nothing to unload; the adapter is part of the base model)
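+
+The four Reality B verbs can be sketched against a fake backend. This is an assumption-laden illustration: `FakeSwitchBackend`, its `available` set, and its `active_controls` dict are invented for the example and do not describe the OpenAI backend's actual control-rendering API.
+
+```python
+from typing import Any
+
+
+class FakeSwitchBackend:
+    """Hypothetical backend with a controls registry and a render slot."""
+
+    def __init__(self, available: set[str]) -> None:
+        self.available = available
+        self.active_controls: dict[str, bool] = {}
+
+    def render_controls(self, name: str) -> None:
+        self.active_controls[name] = True
+
+
+class EmbeddedBindingSketch:
+    """Sketch of the four verbs for an adapter embedded in the base model."""
+
+    def __init__(self, backend: FakeSwitchBackend, name: str) -> None:
+        self.backend = backend
+        self.name = name
+
+    def prepare(self) -> None:
+        # Verification only: there is no file to download.
+        if self.name not in self.backend.available:
+            raise LookupError(f"base model does not expose adapter {self.name!r}")
+
+    def activate(self, ctx: Any = None) -> None:
+        self.backend.render_controls(self.name)
+
+    def deactivate(self, ctx: Any = None) -> None:
+        # Clear the rendered controls so nothing leaks into the next call.
+        self.backend.active_controls.pop(self.name, None)
+
+    def release(self) -> None:
+        # No-op: the adapter is part of the base model.
+        return None
+```
+
+Clearing controls in `deactivate` rather than `release` is what gives per-call isolation, since `release` never does anything for this binding.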
+
+### Scope
+
+- `EmbeddedBinding` class — verb implementations
+- Uses `AdapterMixin.render_controls` for activation
+- Currently only the OpenAI backend supports Reality B (per PR #881); the HF backend will gain support via #1018, sequenced after this refactor
+- **Span instrumentation.** `EmbeddedBinding` emits child spans under the `intrinsic.call` parent (set up in 2.2):
+ - `intrinsic.prepare` — attributes: `intrinsic.name`, `intrinsic.binding_type="embedded"`, `intrinsic.source` (base model identifier). No download — span records "no-op prepare" outcome.
+ - `intrinsic.activate` — attribute: `intrinsic.controls_key` (name of the controls field rendered into the chat template)
+ - `intrinsic.deactivate`
+- `release()` is a no-op; no span emitted for release
+- Update `docs/dev/adapter_observability.md` (from 2.1) to add Embedded-specific span attributes and note the no-op prepare behaviour
+- Update `docs/docs/advanced/intrinsics.md` to cover the Embedded reality: `Adapter(weights=EmbeddedBinding.from_base_model(backend))` construction example; note which backends support which bindings
+- Update `AGENTS.md §13` (post-parse shapes are now stable for both bindings) with the normalised post-parse shape reference table
+
+### Out of scope
+
+- HF backend support for embedded adapters (#1018)
+- Other realities (2.2, 3.1)
+
+### Acceptance criteria
+
+- [ ] All four verbs implemented for `EmbeddedBinding`
+- [ ] OpenAI backend continues to support embedded adapters via the new binding
+- [ ] Existing Granite Switch tests pass through the new binding
+- [ ] `release()` is a no-op (asserted explicitly in test)
+- [ ] `intrinsic.prepare`, `intrinsic.activate`, `intrinsic.deactivate` spans emitted with `binding_type="embedded"` attribute; prepare span records no-op outcome
+- [ ] `IntrinsicMetricsPlugin` counters increment correctly for Embedded calls
+- [ ] `docs/dev/adapter_observability.md` updated with Embedded-specific span attributes
+- [ ] `docs/docs/advanced/intrinsics.md` updated: Embedded construction example present; backend × reality matrix visible to users
+- [ ] `AGENTS.md §13` updated with normalised post-parse shape reference
+- [ ] `ruff format`, `ruff check`, `mypy` clean
+
+### Test plan
+
+Unit tests (`test/backends/adapters/test_embedded_binding.py`), with mocked OpenAI backend:
+- `test_prepare_is_noop` — no download; no backend call; span records no-op
+- `test_activate_calls_render_controls`
+- `test_deactivate_clears_controls`
+- `test_release_is_noop`
+- `test_multi_call_isolation` — controls from call N do not leak into call N+1
+- Span assertion tests: `test_prepare_span_binding_type_is_embedded`, `test_activate_span_has_controls_key`
+- `test_metrics_invocation_counter_increments_for_embedded`
+
+Integration test (`test/backends/adapters/test_embedded_integration.py`, mark `@pytest.mark.integration`, `@pytest.mark.openai`):
+- `OpenAIBackend × EmbeddedBinding × lora/alora` against Granite Switch — prepare → activate → generate → deactivate → release cycle; assert existing Granite Switch tests still pass
+
+Docs verification:
+- `npx markdownlint-cli2 "docs/docs/advanced/intrinsics.md" "docs/dev/adapter_observability.md" "AGENTS.md"` clean
+
+### Risks & Mitigations
+
+| Risk | Mitigation |
+|---|---|
+| `prepare()` verification probe fails for older base models that lack the controls registry | Bubble a clear error naming the missing capability; document required base-model version in helper docstring |
+| `activate`/`deactivate` controls leak between calls | `adapter_scope` ensures `deactivate` runs; explicit test for multi-call isolation |
+| OpenAI backend uses different control-rendering API in newer versions | Verify against current OpenAI backend implementation before writing; PR description documents the version pinned against |
+
+### Breaking Changes
+
+None for end users — the binding replaces internal stubs. OpenAI-backend Granite Switch behaviour is preserved.
+
+### Impinging Issues / PRs
+
+- PR #881 — Reality B added to OpenAI backend; this binding is the structural home for that work
+- #1018 — HF backend embedded adapters; sequenced after this issue; will reuse `EmbeddedBinding` against `LocalHFBackend`
+
+### References
+
+- PR #1080 §8.2 (Reality B), §9.2, §10 (backend × reality matrix)
+- PR #881 (Reality B added to OpenAI backend)
+
+---
+
+# Phase 3 — Reality C ships (blocked on upstream)
+
+## 3.1 — `ServerMediatedBinding` implementation
+
+**Parent:** #929 · **Depends on:** Phase 2 complete + #27 unblocked · **Phase:** 3
+
+### Status: blocked
+
+Filed for traceability. Sits blocked until vLLM (or another OpenAI-compatible server) supports aLoRA adapters at the API level (#27). Per PR #1080 Part I §5 Q3 (Resolved, Paul): we design the slot but don't invest in stubs while upstream is blocked.
+
+### Problem
+
+OpenAI-compatible servers (notably vLLM) do not currently support aLoRA adapters at the API layer. The OpenAI backend therefore cannot serve adapter-based helpers without either falling back to embedded adapters (Reality B) or removing adapter support entirely (as PR #543 did in 2026-04). When upstream gains support, Mellea needs `ServerMediatedBinding` ready to plug in.
+
+### Agreed design
+
+Implement the four verbs for Reality C, a server-mediated adapter where the *server* (not Mellea) owns the weights:
+
+- `prepare()` — verifies the server exposes the named adapter (probe API)
+- `activate(ctx)` — calls `AdapterMixin.set_request_adapter` to set the per-request adapter header/parameter
+- `deactivate(ctx)` — clears the per-request adapter
+- `release()` — typically a no-op (server owns the weights)
+
+The specific server-API integration (vLLM, others) is the implementer's call when this issue is unblocked.
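+
+The four verbs described above might take roughly the following shape. This is a sketch only, since the server API is undecided: `AdapterServer`, `list_adapters`, `FakeServer`, and the dict-shaped `ctx` are hypothetical, and the `ctx["request_adapter"]` assignment merely stands in for the `AdapterMixin.set_request_adapter` call.
+
+```python
+from typing import Protocol
+
+
+class AdapterServer(Protocol):
+    """Hypothetical server client; the real probe API is not yet decided."""
+
+    def list_adapters(self) -> list[str]: ...
+
+
+class ServerMediatedBindingSketch:
+    """Illustrative only: the server, not Mellea, owns the weights."""
+
+    def __init__(self, name: str, server: AdapterServer):
+        self.name = name
+        self.server = server
+
+    def prepare(self) -> None:
+        # Probe the server; fail early if the named adapter is not exposed.
+        if self.name not in self.server.list_adapters():
+            raise LookupError(f"server does not expose adapter {self.name!r}")
+
+    def activate(self, ctx: dict) -> None:
+        # Stand-in for AdapterMixin.set_request_adapter (per-request selection).
+        ctx["request_adapter"] = self.name
+
+    def deactivate(self, ctx: dict) -> None:
+        # Clear the per-request adapter so later requests use the base model.
+        ctx.pop("request_adapter", None)
+
+    def release(self) -> None:
+        # No-op: there are no local weights to free.
+        return None
+
+
+class FakeServer:
+    """Test double that reports a fixed set of adapter names."""
+
+    def __init__(self, names):
+        self._names = list(names)
+
+    def list_adapters(self) -> list[str]:
+        return list(self._names)
+```
+
+Keeping the probe behind a small protocol is one way to absorb per-server differences inside the binding, in line with the mitigation listed under Risks & Mitigations.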
+
+### Scope
+
+- `ServerMediatedBinding` verb implementations
+- `OpenAIBackend` integration: drop the `_uses_embedded_adapters` hard-code; use `ServerMediatedBinding` for non-embedded adapters when supported
+
+### Out of scope
+
+- Driving vLLM upstream to accept aLoRA support (separate effort, #27)
+- Servers that are not vLLM-compatible (revisit if a customer asks)
+
+### Acceptance criteria
+
+- [ ] All four verbs implemented for `ServerMediatedBinding`
+- [ ] OpenAI backend uses `ServerMediatedBinding` for non-embedded adapters when the server supports them
+- [ ] E2E tests pass against a real vLLM-compatible server with aLoRA support
+- [ ] Existing OpenAI backend tests still pass
+
+### Test plan
+
+Implementer's call when unblocked. Likely requires a vLLM test fixture or service-level mock.
+
+### Risks & Mitigations
+
+| Risk | Mitigation |
+|---|---|
+| Issue sits blocked for many months and design assumptions go stale | Tracking issue with quarterly check-in; revisit assumptions when #27 has a roadmap |
+| Multiple OpenAI-compat servers diverge on aLoRA API shape | Implement against the first one that lands; abstract per-server quirks behind the binding's internals; surface as follow-up issues |
+| Hard-coded `_uses_embedded_adapters` removal causes regression on Granite Switch path | Existing Reality B tests gate the change; `EmbeddedBinding` (2.3) is the runtime path for that case |
+
+### Breaking Changes
+
+None until unblocked. When implemented, `_uses_embedded_adapters` removal is internal.
+
+### Impinging Issues / PRs
+
+- #27 — vLLM aLoRA support; the actual blocker
+- PR #543 — historic context (adapter support removed from OpenAI backend when vLLM declined aLoRA)
+- PR #881 — embedded adapters added back later
+
+### References
+
+- PR #1080 §8.3 (Reality C), Part I §5 Q3
+- #27 — vLLM aLoRA support
+- PR #543, PR #881
+
+---
+
+# Phase 4 — shim removal
+
+## 4.1 — Remove deprecation shims for old `IntrinsicAdapter` classes
+
+**Parent:** #929 · **Depends on:** 1.A + Phase 2 complete + 1 minor release elapsed · **Phase:** 4
+
+### Status: deferred — file as tracking issue
+
+### Problem
+
+The deprecation shims from 1.A still exist after one minor release of warnings. Mellea wants to remove them to finish the structural refactor cleanly.
+
+Per Part I §5 Q4 (Resolved by Paul/Jake), the deprecation window is at least one minor release (≈ 4–6 weeks), extendable if user impact warrants.
+
+### Agreed design
+
+Delete the shim classes:
+
+- `IntrinsicAdapter`
+- `EmbeddedIntrinsicAdapter`
+- `CustomIntrinsicAdapter`
+
+Update the changelog. If the `AdapterBasedComponent` placeholder (introduced in 0.1) has been replaced with IBM's final name by this point, fold that rename in here too; otherwise the rename is a separate issue, filed when the name lands.
+
+This is also the natural point for a full dev-doc rewrite: the shims are gone, the API shape is final, and the old documentation (which describes the class hierarchy being deleted) is now actively misleading.
+
+### Scope
+
+- Delete the three shim classes: both their definitions and their exports in `mellea/backends/adapters/__init__.py`
+- Update changelog
+- **Rewrite `docs/dev/intrinsics_and_adapters.md`** — the current doc describes the old four-class hierarchy (`IntrinsicAdapter`, `EmbeddedIntrinsicAdapter`, etc.). Rewrite (not edit) to document the final `Adapter` + `WeightsBinding` + `IOContract` model. The design proposal §15 explicitly flags this as "rewrite, not edit."
+- **Three tutorials (§15)** — write alongside or immediately after shim removal, when the full API is stable:
+ - "Adding a custom intrinsic in 20 lines" — replaces the `CustomIntrinsicAdapter` monkey-patch story
+ - "Handling a breaking schema change without breaking users" — `requirement-check` v1→v2 worked example; HF revision pinning and `AdapterSchemaMismatchError`
+ - "Reading intrinsic telemetry" — short dashboard-building guide referencing `parse_failures`, `phase_duration_ms`, and the span tree
+
+### Out of scope
+
+- Any new functionality
+- The post-"Intrinsic" name rename (sequenced when IBM confirms — separate issue)
+
+### Acceptance criteria
+
+- [ ] Shim classes removed
+- [ ] No internal references to the old class names remain (`grep` gate in test plan)
+- [ ] Changelog entry recording the removal with migration note (import path change)
+- [ ] `docs/dev/intrinsics_and_adapters.md` rewritten; describes `Adapter` + `WeightsBinding` + `IOContract`; no references to deleted class names
+- [ ] At least two of the three tutorials written and linked from `docs/docs/advanced/intrinsics.md`
+- [ ] All tutorials' code examples validated against current source (pass `uv run pytest` on any embedded examples)
+- [ ] markdownlint clean on all touched docs
+- [ ] Existing tests pass (callers should already be on the new types)
+
+### Test plan
+
+- `grep` for old class names — should return no hits in `mellea/` (gate in CI)
+- `grep` for old class names in `docs/` — no hits in the rewritten docs
+- Existing test suite passes
+- Docs validation: `npx markdownlint-cli2 "docs/dev/intrinsics_and_adapters.md"` clean
+- Tutorial code examples: run via `uv run pytest` (mark examples with `# pytest: e2e, hf` or appropriate markers)
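+
+The `grep` gate above could also be written as a small Python check so it runs inside the existing test suite. This is a sketch under assumptions: the function name `find_old_names` and the choice of file suffixes are illustrative, not part of the agreed design.
+
+```python
+import re
+from pathlib import Path
+
+OLD_NAMES = ("IntrinsicAdapter", "EmbeddedIntrinsicAdapter", "CustomIntrinsicAdapter")
+# Whole-word match, so an unrelated identifier such as MyIntrinsicAdapterX is ignored.
+PATTERN = re.compile(r"\b(?:" + "|".join(OLD_NAMES) + r")\b")
+
+
+def find_old_names(root, suffixes=(".py", ".md")):
+    """Return (path, line number, line) for every reference to a removed class."""
+    hits = []
+    for path in sorted(Path(root).rglob("*")):
+        if not path.is_file() or path.suffix not in suffixes:
+            continue
+        text = path.read_text(encoding="utf-8", errors="ignore")
+        for lineno, line in enumerate(text.splitlines(), start=1):
+            if PATTERN.search(line):
+                hits.append((str(path), lineno, line.strip()))
+    return hits
+```
+
+A CI test could then assert `find_old_names("mellea") == []`, with a second assertion for the rewritten docs.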
+
+### Risks & Mitigations
+
+| Risk | Mitigation |
+|---|---|
+| External users still depending on old class names break on update | Deprecation window honoured (1 minor release minimum); changelog entry; window extendable per Q4 |
+| Removal coincides with IBM's "Intrinsic" rename and conflates two changes | Decision called out in PR description: fold rename in only if IBM has confirmed by then; otherwise keep separate |
+| Tests still reference old class names somewhere | The `grep` gate in the acceptance criteria catches any remaining references |
+
+### Breaking Changes
+
+- **Removal of `IntrinsicAdapter`, `EmbeddedIntrinsicAdapter`, `CustomIntrinsicAdapter`.** Any code still importing these will break. This is by design once the deprecation window has elapsed.
+
+### Impinging Issues / PRs
+
+- All Phase 1 issues — must be merged for callers to be off the old classes
+- All Phase 2 issues — must be merged for backend surface to be on the new types
+- IBM "Intrinsic" rename decision — orthogonal, may or may not coincide
+
+### References
+
+- PR #1080 §16 Phase 4; Part I §5 Q4
+
+---
+
+# Cross-references summary
+
+After filing, every issue body should include a "Related" section linking the parent (`#929`) and any blocking/blocked-by siblings explicitly. The dependency table at the top of this document is the source of truth — when filing, copy the relevant row's edges into each issue body.
+
+GitHub sub-issues note: per memory `reference_github_subissues`, the REST POST endpoint for sub-issues is broken; use the GraphQL `addSubIssue` mutation to wire each child to #929 as a formal sub-issue.
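+
+A request body for that mutation could be built as follows. Treat this as a sketch: the input field names (`issueId`, `subIssueId`) reflect my understanding of GitHub's GraphQL schema and should be confirmed before use, and `build_add_sub_issue_payload` is a hypothetical helper. The function only builds the JSON body and performs no network I/O.
+
+```python
+import json
+
+# Assumed mutation shape; confirm field names against GitHub's GraphQL schema.
+ADD_SUB_ISSUE = """
+mutation($parentId: ID!, $childId: ID!) {
+  addSubIssue(input: {issueId: $parentId, subIssueId: $childId}) {
+    issue { number }
+    subIssue { number }
+  }
+}
+"""
+
+
+def build_add_sub_issue_payload(parent_node_id, child_node_id):
+    """Return the JSON body for one addSubIssue call (no network I/O)."""
+    return json.dumps(
+        {
+            "query": ADD_SUB_ISSUE,
+            "variables": {"parentId": parent_node_id, "childId": child_node_id},
+        }
+    )
+```
+
+Both arguments are GraphQL node IDs rather than issue numbers, so each issue's node ID has to be looked up before the body is sent to GitHub's GraphQL endpoint.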
+
+# Open issues / PRs to coordinate before filing
+
+These are active open issues or PRs in the main repo that overlap with epic #929 work. They must be resolved or coordinated before the corresponding sub-issues are started to avoid conflicts.
+
+| Item | Overlap | Action |
+|---|---|---|
+| **PR #935** — docs: migrate Guardian documentation from deprecated GuardianCheck to Intrinsics API | Large open docs PR touching `docs/docs/advanced/intrinsics.md`, `AGENTS.md`, guardian examples — same files as 1.D/2.2/2.3. Rebased onto main 2026-05-19. | **Merge before starting 1.D.** Our subsequent issues build on top of this docs baseline. If it won't merge before work starts, coordinate to avoid conflicts. |
+| **PR #1078** — fix: intrinsic tests and add safeguards for future adapter changes (fixes #1029) | Adds canned I/O tests + `last_validated_commit` for `requirement-check` and `uncertainty` formatters (under `test/formatters/granite/`). | **Merge before starting 0.2/1.C.** Our revision-pinning (0.2) supersedes the `last_validated_commit` approach; 1.C builds on the formatter test data this PR adds. Close #1029 when #1078 merges. |
+| **#1094** — Migrate creating_a_new_type_of_session.py example off deprecated GuardianCheck | Session example imports deprecated `GuardianCheck`; same problem space as 1.D. | **Close with 1.D** — include the example migration in 1.D's scope. Add to 1.D impinging issues. |
+| **#1071** — feat: intrinsic-backed Requirement subclass for Guardian safety validation | New feature that would add code in the old guardian style | **Gate on 1.D** — do not start #1071 before 1.D merges. When #1071 is picked up, it should use the new `Adapter` types from Phase 0/1. Reference in 1.D's impinging issues. |
+| **#1029** — update intrinsic tests | Addressed by PR #1078; 0.2+1.C supersede the `last_validated_commit` approach with revision-pinning. | Close when #1078 merges. No separate action needed. |
+
+# Filing checklist
+
+- [ ] **Pre-flight:** PR #935 merged (or conflict-plan in place)
+- [ ] **Pre-flight:** PR #1078 merged; #1029 closed
+- [ ] Wave 1: 0.1, 0.2 — file simultaneously
+- [ ] Wave 2: 1.A — file after 0.1 has a draft PR
+- [ ] Wave 3: 1.B, 1.C, 1.D + 2.1 — file after 1.A merges; 1.D must note #1094 (close with this issue) and #1071 (gate on this issue)
+- [ ] Wave 4: 2.2, 2.3 — file after 2.1 merges
+- [ ] Tracking: 3.1 and 4.1 — file at any point, mark blocked
+- [ ] Wire #1111 as sub-issue of #929 (already filed)
+- [ ] After every issue is filed, add as sub-issue of #929 via GraphQL `addSubIssue`
+- [ ] Update the proposal branch's appendix to reference filed issue numbers (or skip — once PR #1080 closes, the branch is historical)
+