Define AI agents as contracts, not scattered prompts.
Open Agent Spec lets you define an agent once in YAML, validate inputs and outputs against a schema, and either run it directly with `oa run` or generate a Python scaffold with `oa init`.
Most agent systems are hard to reason about:
- outputs are not strictly typed
- behaviour is buried in prompts
- logic is split across Python, Markdown, and framework abstractions
- swapping models often breaks things in subtle ways
Open Agent Spec treats an agent like infrastructure.
Think OpenAPI or Terraform, but for AI agents.
You define:
- input schema
- output schema
- prompts
- model configuration
Then OA enforces the boundary:
input -> LLM -> validated output
If the output does not match schema, the task fails fast with a validation error.
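The fail-fast boundary can be pictured in a few lines of plain Python. This is an illustrative sketch of JSON-Schema-style `required`/`properties` enforcement, not OA's actual validator; the names `validate_output` and `SchemaError` are invented here for the example.

```python
# Sketch of the fail-fast output boundary. The real runner's validation is
# richer; this only illustrates required-field and type checks.
SCHEMA = {
    "type": "object",
    "properties": {"response": {"type": "string"}},
    "required": ["response"],
}

class SchemaError(ValueError):
    pass

def validate_output(schema: dict, output: dict) -> dict:
    for field in schema.get("required", []):
        if field not in output:
            raise SchemaError(f"missing required field: {field}")
    types = {"string": str, "object": dict, "array": list}
    for field, sub in schema.get("properties", {}).items():
        if field in output and not isinstance(output[field], types[sub["type"]]):
            raise SchemaError(f"wrong type for field: {field}")
    return output  # valid: passed through unchanged

validate_output(SCHEMA, {"response": "hello"})  # passes
try:
    validate_output(SCHEMA, {"msg": "hello"})   # wrong shape: fails fast
except SchemaError as e:
    print(e)  # prints: missing required field: response
```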
For example, this shape mismatch can silently break downstream systems:
```json
{"msg": "hello"}
```

instead of:

```json
{"response": "hello"}
```

Install (Python 3.10+):
```sh
pipx install open-agent-spec
oa init aac
oa validate aac
export OPENAI_API_KEY=your_key_here
oa run --spec .agents/example.yaml --task greet --input '{"name":"Alice"}' --quiet
```

With OA you can:
- define tasks, prompts, model config, and expected I/O in YAML
- run a spec directly without generating code first
- keep `.agents/*.yaml` in your repo and call them from CI
- generate a Python project scaffold when you want to customize the implementation
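Because specs live in the repo, CI can run them directly; with `--quiet` the output is bare JSON that scripts can parse. A minimal sketch, assuming the greet task's documented output shape — the `run_task` wrapper is illustrative, not part of OA:

```python
import json
import subprocess

# Command line mirrors the README; run_task is an illustrative CI helper.
CMD = [
    "oa", "run", "--spec", ".agents/example.yaml",
    "--task", "greet", "--input", '{"name":"Alice"}', "--quiet",
]

def run_task(cmd=CMD) -> dict:
    # --quiet prints only the task output JSON, so stdout parses directly.
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return json.loads(out)

# In CI you would call run_task(); here we parse the documented sample output:
sample = '{"response": "Hello Alice!"}'
print(json.loads(sample)["response"])  # prints: Hello Alice!
```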
Shortest path from install to a working agent:
1. Create the agents-as-code layout (`aac` = repo-native `.agents/` directory):

   ```sh
   oa init aac
   ```

   This creates:

   ```
   .agents/
   ├── example.yaml   # minimal hello-world spec
   ├── review.yaml    # code-review agent that accepts a diff file
   ├── change.diff    # sample diff for immediate review-agent testing
   └── README.md      # quick usage notes
   ```
2. Validate the generated specs:
   ```sh
   oa validate aac
   ```

3. Set an API key for the engine in your spec (OpenAI by default):
   ```sh
   export OPENAI_API_KEY=your_key_here
   ```

4. Run the example agent:
   ```sh
   oa run --spec .agents/example.yaml --task greet --input '{"name":"Alice"}' --quiet
   ```

   `--quiet` prints the task output JSON only, good for piping to `jq` or scripting:
   ```json
   {
     "response": "Hello Alice!"
   }
   ```

   Omit `--quiet` for the full execution envelope with Rich formatting.
5. Run the review agent with the bundled sample diff:
   ```sh
   oa run --spec .agents/review.yaml --task review --input .agents/change.diff --quiet
   ```

   Or review your own change:
   ```sh
   git diff > change.diff
   oa run --spec .agents/review.yaml --task review --input change.diff --quiet
   ```

Start from this shape:
```yaml
open_agent_spec: "1.5.0"

agent:
  name: hello-world-agent
  role: chat

intelligence:
  type: llm
  engine: openai
  model: gpt-4o

tasks:
  greet:
    description: Say hello to someone
    input:
      type: object
      properties:
        name:
          type: string
      required: [name]
    output:
      type: object
      properties:
        response:
          type: string
      required: [response]
    prompts:
      system: >
        You greet people by name.
      user: "{{ name }}"
```

Validate first, then run:
```sh
oa validate --spec agent.yaml
oa run --spec agent.yaml --task greet --input '{"name":"Alice"}' --quiet
```

Chain tasks declaratively. OA merges upstream outputs into downstream inputs automatically — no glue code required.
```yaml
tasks:
  extract:
    description: Pull key facts from raw text.
    # ... input / output / prompts
  summarise:
    description: Summarise the extracted facts.
    depends_on: [extract]  # extract's output is merged into summarise's input
    # ... prompts
```

`depends_on` is a data contract, not execution control. OA has no branching, loops, or conditionals by design. See examples/multi-task/.
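The merge can be pictured as a plain dict update: each upstream task's output dict is layered into the downstream task's input. This is an illustrative sketch, and the precedence on key collisions (explicit input winning over upstream values) is an assumption here, not documented behaviour:

```python
# Sketch of depends_on data flow: upstream outputs merged into downstream input.
# Collision precedence (explicit input wins) is an assumption for illustration.
def build_input(explicit_input: dict, upstream_outputs: list[dict]) -> dict:
    merged: dict = {}
    for out in upstream_outputs:   # e.g. the output of `extract`
        merged.update(out)
    merged.update(explicit_input)  # caller-supplied fields layered on top
    return merged

extract_output = {"facts": ["OA specs are YAML", "outputs are schema-checked"]}
summarise_input = build_input({"style": "one sentence"}, [extract_output])
print(summarise_input)
```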
Let the model call tools declared in the spec. Three backends, zero SDK dependencies.
```yaml
tools:
  reader:
    type: native
    native: file.read                   # built-in: file.read/write, http.get/post, env.read
  search:
    type: mcp
    endpoint: http://localhost:3000     # any MCP server (JSON-RPC 2.0 over HTTP)
  classifier:
    type: custom
    module: my_pkg.tools:ClassifierTool # your own Python class

tasks:
  analyse:
    tools: [reader, search, classifier]
    # ...
```

See examples/file-reader/ and examples/mcp-search/.
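A custom tool is referenced as `module:Class`. The class shape below is a guess for illustration only — OA's real tool protocol may differ, so check the examples before relying on this interface:

```python
# Hypothetical shape of a custom tool class, loadable as "my_pkg.tools:ClassifierTool".
# The run(**kwargs) -> dict interface is an assumption, not OA's documented protocol.
class ClassifierTool:
    name = "classifier"
    description = "Classify text as positive or negative."

    def run(self, text: str) -> dict:
        # Toy keyword heuristic standing in for a real model call.
        positive = {"good", "great", "love", "excellent"}
        score = sum(word in positive for word in text.lower().split())
        return {"label": "positive" if score else "negative", "score": score}

tool = ClassifierTool()
print(tool.run(text="I love this great tool"))  # {'label': 'positive', 'score': 2}
```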
A task can hand off its implementation to another spec entirely. Great for building shared specialist agents that many pipelines reuse.
```yaml
tasks:
  sentiment_of_summary:
    description: Delegate to the shared sentiment specialist.
    spec: ./shared/sentiment.yaml   # local path or oa:// registry URL
    task: analyse_sentiment
    depends_on: [summarise]         # upstream outputs merged in automatically
```

See examples/spec-composition/.
Publish and consume specs from the hosted registry at openagentspec.dev/registry/. Reference them with the oa:// shorthand — the runner resolves and fetches them automatically.
```yaml
tasks:
  review:
    spec: oa://prime-vector/code-reviewer # resolves to latest hosted spec
    task: review
```

Browse the registry at openagentspec.dev/registry. Available specs: summariser, classifier, sentiment, code-reviewer, keyword-extractor, memory-retriever.
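An `oa://` reference splits into a namespace and a spec name. How the runner resolves and fetches these is internal; this sketch only shows the reference shape, and `parse_oa_ref` is an invented name:

```python
from urllib.parse import urlparse

# Illustrative split of an oa:// reference; resolution itself is done by the runner.
def parse_oa_ref(ref: str) -> tuple[str, str]:
    parts = urlparse(ref)
    if parts.scheme != "oa":
        raise ValueError(f"not an oa:// reference: {ref}")
    return parts.netloc, parts.path.lstrip("/")

ns, name = parse_oa_ref("oa://prime-vector/code-reviewer")
print(ns, name)  # prime-vector code-reviewer
```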
Pass prior conversation turns as a `history` input field. OA injects them into the LLM message list between the system and user turns. OA never stores history — your application manages the list.
```yaml
tasks:
  chat:
    input:
      type: object
      properties:
        message: {type: string}
        history:
          type: array
          description: Prior turns injected by the caller. OA never writes to this field.
```

```sh
oa run --spec spec.yaml --task chat \
  --input '{"message":"What did I just say?","history":[{"role":"user","content":"Hello"},{"role":"assistant","content":"Hi there!"}]}'
```

See examples/chat-agent/.
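The injection order described above — system prompt, then caller-supplied history, then the new user turn — can be sketched as a list build. `build_messages` is an illustrative name, not OA's API:

```python
# Sketch of how a history array slots into the LLM message list:
# system first, then prior turns verbatim, then the new user turn.
def build_messages(system: str, history: list[dict], user: str) -> list[dict]:
    return (
        [{"role": "system", "content": system}]
        + history                              # injected, never stored by OA
        + [{"role": "user", "content": user}]
    )

history = [
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi there!"},
]
msgs = build_messages("You are a helpful chat agent.", history, "What did I just say?")
print([m["role"] for m in msgs])  # ['system', 'user', 'assistant', 'user']
```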
Your application fetches candidate turns from an external store. The memory-retriever registry spec uses an LLM to select the most relevant ones and returns them as a history array ready to inject into any chat task.
```yaml
tasks:
  recall:
    spec: oa://prime-vector/memory-retriever
    task: retrieve            # input: query + candidates → output: history + memory_count
  respond:
    depends_on: [recall]
    spec: ./chat-agent/spec.yaml
    task: chat
```

Declare hard execution constraints in the spec. The runner enforces them before any tool call reaches the I/O layer — no network connection opened, no file handle created, no exception to catch.
```yaml
sandbox:
  tools:
    allow: [file.read, http.get]      # SANDBOX_TOOL_VIOLATION if anything else is called
  http:
    allow_domains: [api.example.com]  # SANDBOX_DOMAIN_VIOLATION for other hosts
  file:
    allow_paths: [./data/]            # SANDBOX_PATH_VIOLATION for paths outside this prefix

tasks:
  restricted:
    sandbox:                          # per-task override tightens the root sandbox
      tools:
        allow: [file.read]
```

See examples/sandboxed-agent/.
Declare what the model output must contain. The behavioural-contracts library enforces the contract after parsing, before the result is returned.
```yaml
behavioural_contract:
  version: "1.0"
  response_contract:
    output_format:
      required_fields: [confidence]   # CONTRACT_VIOLATION if missing

tasks:
  classify:
    behavioural_contract:
      response_contract:
        output_format:
          required_fields: [label]    # effective required_fields: [confidence, label]
```

Install: `pip install 'open-agent-spec[contracts]'`
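The documented merge (root `[confidence]` plus task-level `[label]`) and the post-parse check can be sketched as below. Real enforcement lives in the behavioural-contracts library; these helper names are invented for illustration:

```python
# Sketch of root + per-task required_fields combining, then the post-parse check.
def effective_required(root: list[str], task: list[str]) -> list[str]:
    return root + [f for f in task if f not in root]

def check_contract(required: list[str], output: dict) -> list[str]:
    # Non-empty return corresponds to a CONTRACT_VIOLATION.
    return [f for f in required if f not in output]

required = effective_required(["confidence"], ["label"])
print(required)                                                        # ['confidence', 'label']
print(check_contract(required, {"confidence": 0.9}))                   # ['label'] -> violation
print(check_contract(required, {"confidence": 0.9, "label": "spam"}))  # [] -> ok
```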
Switch models by changing one line. All engines except Anthropic and Codex speak the OpenAI Chat Completions API over raw HTTP — no SDK required.
```yaml
intelligence:
  type: llm
  engine: openai     # openai | anthropic | grok | xai | cortex | local | codex | custom
  model: gpt-4o-mini
```

Run OA specs from Node.js without Python.
```sh
npm install -g @prime-vector/open-agent-spec
oa-run --spec agent.yaml --task greet --input '{"name":"Alice"}'
```

Supports OpenAI and Anthropic, `depends_on` chains, and history threading.
If you want editable generated code instead of running the YAML directly:
```sh
oa init --spec agent.yaml --output ./agent
```

Generated structure:

```
agent/
├── agent.py
├── models.py
├── prompts/
├── requirements.txt
├── .env.example
└── README.md
```
Most agent projects end up hand-rolling the same pieces:
- prompt templates
- model configuration
- task definitions
- routing glue
- runtime wrappers
OA moves those concerns into a declarative spec so they can be reviewed, versioned, and reused.
The intended model is:
- the spec defines the agent contract
- `oa run` executes the spec directly
- `oa init` generates a starting implementation when you need code
- external systems can orchestrate multiple specs however they want
OA deliberately does not prescribe:
- orchestration
- evaluation
- governance
- long-running runtime architecture
| Command | Purpose |
|---|---|
| `oa init aac` | Create `.agents/` with starter specs |
| `oa validate aac` | Validate all specs in `.agents/` |
| `oa validate --spec agent.yaml` | Validate one spec |
| `oa test agent.test.yaml` | Run YAML eval cases (model + assertions on task output); `--quiet` for CI JSON |
| `oa run --spec agent.yaml --task greet --input '{"name":"Alice"}' --quiet` | Run one task directly from YAML |
| `oa init --spec agent.yaml --output ./agent` | Generate a Python scaffold |
| `oa update --spec agent.yaml --output ./agent` | Regenerate an existing scaffold |
The formal specification defines what a conforming OA runtime must do, independent of any specific implementation.
| Resource | Contents |
|---|---|
| spec/open-agent-spec-1.5.md | Formal specification — normative MUST/SHOULD/MAY requirements for OA 1.5.0 |
| spec/schema/oas-schema-1.5.json | Canonical JSON Schema for validating spec documents |
| spec/conformance/README.md | Conformance test structure and contribution guide |
An independent implementor can build a conforming runtime from spec/open-agent-spec-1.5.md alone.
| Resource | Contents |
|---|---|
| openagentspec.dev | Project website |
| docs/REFERENCE.md | Spec structure, engines, templates, .agents/ usage |
| examples/multi-agent | Multi-agent orchestration example — manager, workers, task board, dashboard |
| Repository | Source, issues, workflows |
- The CLI command is `oa` (not `oas`).
- Python 3.10+ is required.
- `oa run` requires the relevant provider API key for the engine in your spec.
- Open Agent Spec (OA) was dreamed up by Andrew Whitehouse in late 2024, out of a desire to give structure and standardisation to early agent systems.
- In early 2025, Prime Vector was formed and took over the public-facing project.
MIT license; see LICENSE.
