Define AI agents as contracts, not scattered prompts.
Open Agent Spec lets you define an agent once in YAML, validate inputs and outputs against a schema, and either run it directly with `oa run` or generate a Python scaffold with `oa init`.
Most agent systems are hard to reason about:
- outputs are not strictly typed
- behaviour is buried in prompts
- logic is split across Python, Markdown, and framework abstractions
- swapping models often breaks things in subtle ways
Open Agent Spec treats an agent like infrastructure.
Think OpenAPI or Terraform, but for AI agents.
You define:
- input schema
- output schema
- prompts
- model configuration
Then OA enforces the boundary:
input -> LLM -> validated output
If the output does not match schema, the task fails fast with a validation error.
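The fail-fast boundary can be pictured in a few lines of plain Python. This is an illustrative sketch of JSON-Schema-style `required`/`properties` enforcement, not OA's actual validator; the names `validate_output` and `SchemaError` are invented here for the example.

```python
# Sketch of the fail-fast output boundary. The real runner's validation is
# richer; this only illustrates required-field and type checks.
SCHEMA = {
    "type": "object",
    "properties": {"response": {"type": "string"}},
    "required": ["response"],
}

class SchemaError(ValueError):
    pass

def validate_output(schema: dict, output: dict) -> dict:
    for field in schema.get("required", []):
        if field not in output:
            raise SchemaError(f"missing required field: {field}")
    types = {"string": str, "object": dict, "array": list}
    for field, sub in schema.get("properties", {}).items():
        if field in output and not isinstance(output[field], types[sub["type"]]):
            raise SchemaError(f"wrong type for field: {field}")
    return output  # valid: passed through unchanged

validate_output(SCHEMA, {"response": "hello"})  # passes
try:
    validate_output(SCHEMA, {"msg": "hello"})   # wrong shape: fails fast
except SchemaError as e:
    print(e)  # prints: missing required field: response
```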
For example, this shape mismatch can silently break downstream systems:
```json
{"msg": "hello"}
```

instead of:

```json
{"response": "hello"}
```

Install (Python 3.10+):
```sh
pipx install open-agent-spec
oa init aac
oa validate aac
export OPENAI_API_KEY=your_key_here
oa run --spec .agents/example.yaml --task greet --input '{"name":"Alice"}' --quiet
```

With OA you can:
- define tasks, prompts, model config, and expected I/O in YAML
- run a spec directly without generating code first
- keep `.agents/*.yaml` in your repo and call them from CI
- generate a Python project scaffold when you want to customize the implementation
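Because specs live in the repo, CI can run them directly; with `--quiet` the output is bare JSON that scripts can parse. A minimal sketch, assuming the greet task's documented output shape — the `run_task` wrapper is illustrative, not part of OA:

```python
import json
import subprocess

# Command line mirrors the README; run_task is an illustrative CI helper.
CMD = [
    "oa", "run", "--spec", ".agents/example.yaml",
    "--task", "greet", "--input", '{"name":"Alice"}', "--quiet",
]

def run_task(cmd=CMD) -> dict:
    # --quiet prints only the task output JSON, so stdout parses directly.
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return json.loads(out)

# In CI you would call run_task(); here we parse the documented sample output:
sample = '{"response": "Hello Alice!"}'
print(json.loads(sample)["response"])  # prints: Hello Alice!
```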
Shortest path from install to a working agent:
1. Create the agents-as-code layout (`aac` = repo-native `.agents/` directory):

   ```sh
   oa init aac
   ```

   This creates:

   ```
   .agents/
   ├── example.yaml   # minimal hello-world spec
   ├── review.yaml    # code-review agent that accepts a diff file
   ├── change.diff    # sample diff for immediate review-agent testing
   └── README.md      # quick usage notes
   ```
2. Validate the generated specs:
   ```sh
   oa validate aac
   ```

3. Set an API key for the engine in your spec (OpenAI by default):
   ```sh
   export OPENAI_API_KEY=your_key_here
   ```

4. Run the example agent:
   ```sh
   oa run --spec .agents/example.yaml --task greet --input '{"name":"Alice"}' --quiet
   ```

   `--quiet` prints the task output JSON only, good for piping to `jq` or scripting:
   ```json
   {
     "response": "Hello Alice!"
   }
   ```

   Omit `--quiet` for the full execution envelope with Rich formatting.
5. Run the review agent with the bundled sample diff:
   ```sh
   oa run --spec .agents/review.yaml --task review --input .agents/change.diff --quiet
   ```

   Or review your own change:
   ```sh
   git diff > change.diff
   oa run --spec .agents/review.yaml --task review --input change.diff --quiet
   ```

Start from this shape:
```yaml
open_agent_spec: "1.5.0"

agent:
  name: hello-world-agent
  role: chat

intelligence:
  type: llm
  engine: openai
  model: gpt-4o

tasks:
  greet:
    description: Say hello to someone
    input:
      type: object
      properties:
        name:
          type: string
      required: [name]
    output:
      type: object
      properties:
        response:
          type: string
      required: [response]
    prompts:
      system: >
        You greet people by name.
      user: "{{ name }}"
```

Validate first, then run:
```sh
oa validate --spec agent.yaml
oa run --spec agent.yaml --task greet --input '{"name":"Alice"}' --quiet
```

Chain tasks declaratively. OA merges upstream outputs into downstream inputs automatically — no glue code required.
```yaml
tasks:
  extract:
    description: Pull key facts from raw text.
    # ... input / output / prompts
  summarise:
    description: Summarise the extracted facts.
    depends_on: [extract]  # extract's output is merged into summarise's input
    # ... prompts
```

`depends_on` is a data contract, not execution control. OA has no branching, loops, or conditionals by design. See examples/multi-task/.
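The merge can be pictured as a plain dict update: each upstream task's output dict is layered into the downstream task's input. This is an illustrative sketch, and the precedence on key collisions (explicit input winning over upstream values) is an assumption here, not documented behaviour:

```python
# Sketch of depends_on data flow: upstream outputs merged into downstream input.
# Collision precedence (explicit input wins) is an assumption for illustration.
def build_input(explicit_input: dict, upstream_outputs: list[dict]) -> dict:
    merged: dict = {}
    for out in upstream_outputs:   # e.g. the output of `extract`
        merged.update(out)
    merged.update(explicit_input)  # caller-supplied fields layered on top
    return merged

extract_output = {"facts": ["OA specs are YAML", "outputs are schema-checked"]}
summarise_input = build_input({"style": "one sentence"}, [extract_output])
print(summarise_input)
```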
Let the model call tools declared in the spec. Three backends, zero SDK dependencies.
```yaml
tools:
  reader:
    type: native
    native: file.read                   # built-in: file.read/write, http.get/post, env.read
  search:
    type: mcp
    endpoint: http://localhost:3000     # any MCP server (JSON-RPC 2.0 over HTTP)
  classifier:
    type: custom
    module: my_pkg.tools:ClassifierTool # your own Python class

tasks:
  analyse:
    tools: [reader, search, classifier]
    # ...
```

See examples/file-reader/ and examples/mcp-search/.
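A custom tool is referenced as `module:Class`. The class shape below is a guess for illustration only — OA's real tool protocol may differ, so check the examples before relying on this interface:

```python
# Hypothetical shape of a custom tool class, loadable as "my_pkg.tools:ClassifierTool".
# The run(**kwargs) -> dict interface is an assumption, not OA's documented protocol.
class ClassifierTool:
    name = "classifier"
    description = "Classify text as positive or negative."

    def run(self, text: str) -> dict:
        # Toy keyword heuristic standing in for a real model call.
        positive = {"good", "great", "love", "excellent"}
        score = sum(word in positive for word in text.lower().split())
        return {"label": "positive" if score else "negative", "score": score}

tool = ClassifierTool()
print(tool.run(text="I love this great tool"))  # {'label': 'positive', 'score': 2}
```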
A task can hand off its implementation to another spec entirely. Great for building shared specialist agents that many pipelines reuse.
```yaml
tasks:
  sentiment_of_summary:
    description: Delegate to the shared sentiment specialist.
    spec: ./shared/sentiment.yaml   # local path or oa:// registry URL
    task: analyse_sentiment
    depends_on: [summarise]         # upstream outputs merged in automatically
```

See examples/spec-composition/.
Publish and consume specs from the hosted registry at openagentspec.dev/registry/. Reference them with the oa:// shorthand — the runner resolves and fetches them automatically.
```yaml
tasks:
  review:
    spec: oa://prime-vector/code-reviewer # resolves to latest hosted spec
    task: review
```

Browse the registry at openagentspec.dev/registry. Available specs: summariser, classifier, sentiment, code-reviewer, keyword-extractor, memory-retriever.
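An `oa://` reference splits into a namespace and a spec name. How the runner resolves and fetches these is internal; this sketch only shows the reference shape, and `parse_oa_ref` is an invented name:

```python
from urllib.parse import urlparse

# Illustrative split of an oa:// reference; resolution itself is done by the runner.
def parse_oa_ref(ref: str) -> tuple[str, str]:
    parts = urlparse(ref)
    if parts.scheme != "oa":
        raise ValueError(f"not an oa:// reference: {ref}")
    return parts.netloc, parts.path.lstrip("/")

ns, name = parse_oa_ref("oa://prime-vector/code-reviewer")
print(ns, name)  # prime-vector code-reviewer
```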
Pass prior conversation turns as a `history` input field. OA injects them into the LLM message list between the system and user turns. OA never stores history — your application manages the list.
```yaml
tasks:
  chat:
    input:
      type: object
      properties:
        message: {type: string}
        history:
          type: array
          description: Prior turns injected by the caller. OA never writes to this field.
```

```sh
oa run --spec spec.yaml --task chat \
  --input '{"message":"What did I just say?","history":[{"role":"user","content":"Hello"},{"role":"assistant","content":"Hi there!"}]}'
```

See examples/chat-agent/.
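The injection order described above — system prompt, then caller-supplied history, then the new user turn — can be sketched as a list build. `build_messages` is an illustrative name, not OA's API:

```python
# Sketch of how a history array slots into the LLM message list:
# system first, then prior turns verbatim, then the new user turn.
def build_messages(system: str, history: list[dict], user: str) -> list[dict]:
    return (
        [{"role": "system", "content": system}]
        + history                              # injected, never stored by OA
        + [{"role": "user", "content": user}]
    )

history = [
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi there!"},
]
msgs = build_messages("You are a helpful chat agent.", history, "What did I just say?")
print([m["role"] for m in msgs])  # ['system', 'user', 'assistant', 'user']
```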
Your application fetches candidate turns from an external store. The memory-retriever registry spec uses an LLM to select the most relevant ones and returns them as a history array ready to inject into any chat task.
```yaml
tasks:
  recall:
    spec: oa://prime-vector/memory-retriever
    task: retrieve            # input: query + candidates → output: history + memory_count
  respond:
    depends_on: [recall]
    spec: ./chat-agent/spec.yaml
    task: chat
```

Declare hard execution constraints in the spec. The runner enforces them before any tool call reaches the I/O layer — no network connection opened, no file handle created, no exception to catch.
```yaml
sandbox:
  tools:
    allow: [file.read, http.get]      # SANDBOX_TOOL_VIOLATION if anything else is called
  http:
    allow_domains: [api.example.com]  # SANDBOX_DOMAIN_VIOLATION for other hosts
  file:
    allow_paths: [./data/]            # SANDBOX_PATH_VIOLATION for paths outside this prefix

tasks:
  restricted:
    sandbox:                          # per-task override tightens the root sandbox
      tools:
        allow: [file.read]
```

See examples/sandboxed-agent/.
Declare what the model output must contain. The behavioural-contracts library enforces the contract after parsing, before the result is returned.
```yaml
behavioural_contract:
  version: "1.0"
  response_contract:
    output_format:
      required_fields: [confidence]   # CONTRACT_VIOLATION if missing

tasks:
  classify:
    behavioural_contract:
      response_contract:
        output_format:
          required_fields: [label]    # effective required_fields: [confidence, label]
```

Install: `pip install 'open-agent-spec[contracts]'`
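The documented merge (root `[confidence]` plus task-level `[label]`) and the post-parse check can be sketched as below. Real enforcement lives in the behavioural-contracts library; these helper names are invented for illustration:

```python
# Sketch of root + per-task required_fields combining, then the post-parse check.
def effective_required(root: list[str], task: list[str]) -> list[str]:
    return root + [f for f in task if f not in root]

def check_contract(required: list[str], output: dict) -> list[str]:
    # Non-empty return corresponds to a CONTRACT_VIOLATION.
    return [f for f in required if f not in output]

required = effective_required(["confidence"], ["label"])
print(required)                                                        # ['confidence', 'label']
print(check_contract(required, {"confidence": 0.9}))                   # ['label'] -> violation
print(check_contract(required, {"confidence": 0.9, "label": "spam"}))  # [] -> ok
```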
Switch models by changing one line. All engines except Anthropic and Codex speak the OpenAI Chat Completions API over raw HTTP — no SDK required.
```yaml
intelligence:
  type: llm
  engine: openai     # openai | anthropic | grok | xai | cortex | local | codex | custom
  model: gpt-4o-mini
```

Run OA specs from Node.js without Python.
```sh
npm install -g @prime-vector/open-agent-spec
oa-run --spec agent.yaml --task greet --input '{"name":"Alice"}'
```

Supports OpenAI and Anthropic, `depends_on` chains, and history threading.
If you want editable generated code instead of running the YAML directly:
```sh
oa init --spec agent.yaml --output ./agent
```

Generated structure:

```
agent/
├── agent.py
├── models.py
├── prompts/
├── requirements.txt
├── .env.example
└── README.md
```
Most agent projects end up hand-rolling the same pieces:
- prompt templates
- model configuration
- task definitions
- routing glue
- runtime wrappers
OA moves those concerns into a declarative spec so they can be reviewed, versioned, and reused.
The intended model is:
- the spec defines the agent contract
- `oa run` executes the spec directly
- `oa init` generates a starting implementation when you need code
- external systems can orchestrate multiple specs however they want
OA deliberately does not prescribe:
- orchestration
- evaluation
- governance
- long-running runtime architecture
| Command | Purpose |
|---|---|
| `oa init aac` | Create `.agents/` with starter specs |
| `oa validate aac` | Validate all specs in `.agents/` |
| `oa validate --spec agent.yaml` | Validate one spec |
| `oa test agent.test.yaml` | Run YAML eval cases (model + assertions on task output); `--quiet` for CI JSON |
| `oa run --spec agent.yaml --task greet --input '{"name":"Alice"}' --quiet` | Run one task directly from YAML |
| `oa init --spec agent.yaml --output ./agent` | Generate a Python scaffold |
| `oa update --spec agent.yaml --output ./agent` | Regenerate an existing scaffold |
The formal specification defines what a conforming OA runtime must do, independent of any specific implementation.
| Resource | Contents |
|---|---|
| spec/open-agent-spec-1.5.md | Formal specification — normative MUST/SHOULD/MAY requirements for OA 1.5.0 |
| spec/schema/oas-schema-1.5.json | Canonical JSON Schema for validating spec documents |
| spec/conformance/README.md | Conformance test structure and contribution guide |
An independent implementor can build a conforming runtime from spec/open-agent-spec-1.5.md alone.
| Resource | Contents |
|---|---|
| openagentspec.dev | Project website |
| docs/REFERENCE.md | Spec structure, engines, templates, .agents/ usage |
| examples/multi-agent | Multi-agent orchestration example — manager, workers, task board, dashboard |
| Repository | Source, issues, workflows |
- The CLI command is `oa` (not `oas`).
- Python 3.10+ is required.
- `oa run` requires the relevant provider API key for the engine in your spec.
- Open Agent Spec (OA) was dreamed up by Andrew Whitehouse in late 2024, out of a desire to give structure and standardisation to early agent systems.
- In early 2025, Prime Vector was formed and took over the public-facing project.
MIT license; see LICENSE.
