
Open Agent Spec (OA)

Define AI agents as contracts, not scattered prompts.


Open Agent Spec lets you define an agent once in YAML, validate inputs and outputs against a schema, and either run it directly with oa run or generate a Python scaffold with oa init.

Why This Exists

Most agent systems are hard to reason about:

  • outputs are not strictly typed
  • behaviour is buried in prompts
  • logic is split across Python, Markdown, and framework abstractions
  • swapping models often breaks things in subtle ways

The Idea

Open Agent Spec treats an agent like infrastructure.

Think OpenAPI or Terraform, but for AI agents.

You define:

  • input schema
  • output schema
  • prompts
  • model configuration

Then OA enforces the boundary:

input -> LLM -> validated output

If the output does not match the schema, the task fails fast with a validation error.

For example, this shape mismatch can silently break downstream systems:

{"msg":"hello"}

instead of:

{"response":"hello"}
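The fail-fast check can be sketched in a few lines. This is illustrative only, assuming a minimal required-fields check; OA's real validator enforces the full JSON Schema:

```python
# Minimal sketch of the fail-fast boundary: check an LLM output dict
# against a task's JSON-Schema-style output definition. Illustrative
# only -- OA's real validator covers the whole schema, not just "required".

output_schema = {
    "type": "object",
    "properties": {"response": {"type": "string"}},
    "required": ["response"],
}

def validate_output(data: dict, schema: dict) -> None:
    missing = [k for k in schema.get("required", []) if k not in data]
    if missing:
        raise ValueError(f"output validation failed: missing {missing}")

validate_output({"response": "hello"}, output_schema)   # passes silently
try:
    validate_output({"msg": "hello"}, output_schema)    # wrong shape: fails fast
except ValueError as e:
    print(e)
```

The wrong-shape output never reaches downstream code; the error surfaces at the task boundary instead.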

(Diagram: Agents as Code — oa init spec → oa run spec → LLM execution → tasks executed)

Super Quick Start

Install (Python 3.10+):

pipx install open-agent-spec
oa init aac
oa validate aac
export OPENAI_API_KEY=your_key_here
oa run --spec .agents/example.yaml --task greet --input '{"name":"Alice"}' --quiet

With OA you can:

  • define tasks, prompts, model config, and expected I/O in YAML
  • run a spec directly without generating code first
  • keep .agents/*.yaml in your repo and call them from CI
  • generate a Python project scaffold when you want to customize implementation

First Run

Shortest path from install to a working agent:

1. Create the agents-as-code layout (aac = repo-native .agents/ directory):

oa init aac

This creates:

.agents/
├── example.yaml   # minimal hello-world spec
├── review.yaml    # code-review agent that accepts a diff file
├── change.diff    # sample diff for immediate review-agent testing
└── README.md      # quick usage notes

2. Validate the generated specs:

oa validate aac

3. Set an API key for the engine in your spec (OpenAI by default):

export OPENAI_API_KEY=your_key_here

4. Run the example agent:

oa run --spec .agents/example.yaml --task greet --input '{"name":"Alice"}' --quiet

--quiet prints only the task output JSON, which makes it easy to pipe to jq or use in scripts:

{
  "response": "Hello Alice!"
}

Omit --quiet for the full execution envelope with Rich formatting.

5. Run the review agent with the bundled sample diff:

oa run --spec .agents/review.yaml --task review --input .agents/change.diff --quiet

Or review your own change:

git diff > change.diff
oa run --spec .agents/review.yaml --task review --input change.diff --quiet

Write Your Own Spec

Start from this shape:

open_agent_spec: "1.5.0"

agent:
  name: hello-world-agent
  role: chat

intelligence:
  type: llm
  engine: openai
  model: gpt-4o

tasks:
  greet:
    description: Say hello to someone
    input:
      type: object
      properties:
        name:
          type: string
      required: [name]
    output:
      type: object
      properties:
        response:
          type: string
      required: [response]

prompts:
  system: >
    You greet people by name.
  user: "{{ name }}"

Validate first, then run:

oa validate --spec agent.yaml
oa run --spec agent.yaml --task greet --input '{"name":"Alice"}' --quiet
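To see how the spec's pieces fit together at run time, here is a hedged sketch of prompt rendering: the validated input payload fills the `{{ name }}` slot in the user prompt. OA's actual template engine may differ; this mini-renderer only handles simple `{{ field }}` substitution:

```python
import re

# Sketch: render a "{{ field }}" prompt template from a validated input
# payload. OA's real template engine is richer; this handles only the
# simple substitution shown in the spec above.

def render(template: str, payload: dict) -> str:
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda m: str(payload[m.group(1)]),
        template,
    )

print(render("{{ name }}", {"name": "Alice"}))  # Alice
```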

Features

Multi-task pipelines with depends_on

Chain tasks declaratively. OA merges upstream outputs into downstream inputs automatically — no glue code required.

tasks:
  extract:
    description: Pull key facts from raw text.
    # ... input / output / prompts

  summarise:
    description: Summarise the extracted facts.
    depends_on: [extract]   # extract's output is merged into summarise's input
    # ... prompts

depends_on is a data contract, not execution control. OA has no branching, loops, or conditionals by design. See examples/multi-task/.
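The merge behaviour can be sketched as follows. The merge order (upstream outputs overwrite caller input on key collision) is an assumption for illustration:

```python
# Sketch of the depends_on data contract: a downstream task's input is
# the caller-supplied input with each upstream task's output merged in.
# Merge order on key collisions is an assumption for illustration.

def build_input(caller_input: dict, upstream_outputs: list[dict]) -> dict:
    merged = dict(caller_input)
    for out in upstream_outputs:
        merged.update(out)
    return merged

extract_output = {"facts": ["OA validates outputs", "specs live in YAML"]}
summarise_input = build_input({"style": "one sentence"}, [extract_output])
print(summarise_input)
# summarise receives both its own input and extract's output
```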


Tools — native, MCP, and custom

Let the model call tools declared in the spec. Three backends, zero SDK dependencies.

tools:
  reader:
    type: native
    native: file.read          # built-in: file.read/write, http.get/post, env.read

  search:
    type: mcp
    endpoint: http://localhost:3000   # any MCP server (JSON-RPC 2.0 over HTTP)

  classifier:
    type: custom
    module: my_pkg.tools:ClassifierTool   # your own Python class

tasks:
  analyse:
    tools: [reader, search, classifier]
    # ...

See examples/file-reader/ and examples/mcp-search/.
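For the `type: custom` backend, a tool class might look like the sketch below. The exact interface OA expects (method names, signatures, metadata attributes) is an assumption here — check docs/REFERENCE.md for the real contract:

```python
# Hypothetical shape of a custom tool class that `module:
# my_pkg.tools:ClassifierTool` could point at. The name/description
# attributes and run() signature are ASSUMPTIONS for illustration;
# OA's actual custom-tool interface is defined in docs/REFERENCE.md.

class ClassifierTool:
    name = "classifier"
    description = "Classify text into a fixed label set."

    def run(self, text: str) -> dict:
        label = "question" if text.rstrip().endswith("?") else "statement"
        return {"label": label}

tool = ClassifierTool()
print(tool.run("Is this a question?"))  # {'label': 'question'}
```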


Spec composition — delegate tasks to other specs

A task can hand off its implementation to another spec entirely. Great for building shared specialist agents that many pipelines reuse.

tasks:
  sentiment_of_summary:
    description: Delegate to the shared sentiment specialist.
    spec: ./shared/sentiment.yaml   # local path or oa:// registry URL
    task: analyse_sentiment
    depends_on: [summarise]         # upstream outputs merged in automatically

See examples/spec-composition/.


Spec Registry — share specs via oa://

Publish and consume specs from the hosted registry at openagentspec.dev/registry/. Reference them with the oa:// shorthand — the runner resolves and fetches them automatically.

tasks:
  review:
    spec: oa://prime-vector/code-reviewer   # resolves to latest hosted spec
    task: review

Browse the registry at openagentspec.dev/registry. Available specs: summariser, classifier, sentiment, code-reviewer, keyword-extractor, memory-retriever.


History threading — stateless multi-turn chat

Pass prior conversation turns as a history input field. OA injects them into the LLM message list between system and user turns. OA never stores history — your application manages the list.

tasks:
  chat:
    input:
      type: object
      properties:
        message: {type: string}
        history:
          type: array
          description: Prior turns injected by the caller. OA never writes to this field.

oa run --spec spec.yaml --task chat \
  --input '{"message":"What did I just say?","history":[{"role":"user","content":"Hello"},{"role":"assistant","content":"Hi there!"}]}'

See examples/chat-agent/.
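The injection described above can be sketched as plain message-list construction — the application owns the history, and each call rebuilds the list:

```python
# Sketch of history threading: caller-supplied turns are injected
# between the system prompt and the new user message. OA stores
# nothing; the application manages (and persists) the history list.

def build_messages(system: str, history: list[dict], user: str) -> list[dict]:
    return [
        {"role": "system", "content": system},
        *history,
        {"role": "user", "content": user},
    ]

history = [
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi there!"},
]
msgs = build_messages("You are a helpful chat agent.", history,
                      "What did I just say?")
```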


Memory retriever — LLM re-ranker for long-term memory

Your application fetches candidate turns from an external store. The memory-retriever registry spec uses an LLM to select the most relevant ones and returns them as a history array ready to inject into any chat task.

tasks:
  recall:
    spec: oa://prime-vector/memory-retriever
    task: retrieve   # input: query + candidates → output: history + memory_count

  respond:
    depends_on: [recall]
    spec: ./chat-agent/spec.yaml
    task: chat

See examples/memory-chat/.


Immutable Inference Sandboxing (IIS)

Declare hard execution constraints in the spec. The runner enforces them before any tool call reaches the I/O layer — no network connection opened, no file handle created, no exception to catch.

sandbox:
  tools:
    allow: [file.read, http.get]     # SANDBOX_TOOL_VIOLATION if anything else is called
  http:
    allow_domains: [api.example.com] # SANDBOX_DOMAIN_VIOLATION for other hosts
  file:
    allow_paths: [./data/]           # SANDBOX_PATH_VIOLATION for paths outside this prefix

tasks:
  restricted:
    sandbox:                         # per-task override tightens the root sandbox
      tools:
        allow: [file.read]

See examples/sandboxed-agent/.
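The enforcement model — check the constraint before any I/O happens — can be sketched like this. The violation names mirror the codes above; the real runner's internals may differ:

```python
import os
from urllib.parse import urlparse

# Sketch of pre-I/O sandbox enforcement: each check runs BEFORE the
# tool call reaches the I/O layer, so no socket or file handle is ever
# created. Violation names mirror the spec's codes; runner internals
# may differ.

ALLOW_TOOLS = {"file.read", "http.get"}
ALLOW_DOMAINS = {"api.example.com"}
ALLOW_PATHS = ["./data/"]

def check_tool(tool: str) -> None:
    if tool not in ALLOW_TOOLS:
        raise PermissionError(f"SANDBOX_TOOL_VIOLATION: {tool}")

def check_domain(url: str) -> None:
    if urlparse(url).hostname not in ALLOW_DOMAINS:
        raise PermissionError(f"SANDBOX_DOMAIN_VIOLATION: {url}")

def check_path(path: str) -> None:
    target = os.path.abspath(path)
    if not any(target.startswith(os.path.abspath(p)) for p in ALLOW_PATHS):
        raise PermissionError(f"SANDBOX_PATH_VIOLATION: {path}")

check_tool("file.read")          # allowed
check_path("./data/report.txt")  # allowed
```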


Behavioural contracts

Declare what the model output must contain. The behavioural-contracts library enforces the contract after parsing, before the result is returned.

behavioural_contract:
  version: "1.0"
  response_contract:
    output_format:
      required_fields: [confidence]   # CONTRACT_VIOLATION if missing

tasks:
  classify:
    behavioural_contract:
      response_contract:
        output_format:
          required_fields: [label]    # effective required_fields: [confidence, label]

Install: pip install 'open-agent-spec[contracts]'
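The merge-then-enforce flow implied by the comments above can be sketched as follows; the behavioural-contracts library is authoritative for the real semantics:

```python
# Sketch of behavioural-contract enforcement: a task's required_fields
# extend the root contract's, and the merged set is checked after the
# output is parsed. Merge semantics follow the comments in the example
# above; the behavioural-contracts library defines the real behaviour.

root_required = ["confidence"]
task_required = ["label"]
effective = root_required + [f for f in task_required if f not in root_required]

def enforce(output: dict, required: list[str]) -> None:
    missing = [f for f in required if f not in output]
    if missing:
        raise ValueError(f"CONTRACT_VIOLATION: missing {missing}")

enforce({"confidence": 0.9, "label": "spam"}, effective)  # passes
```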


Multiple engines

Switch models by changing one line. All engines except Anthropic and Codex speak the OpenAI Chat Completions API over raw HTTP — no SDK required.

intelligence:
  type: llm
  engine: openai       # openai | anthropic | grok | xai | cortex | local | codex | custom
  model: gpt-4o-mini
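"Chat Completions over raw HTTP" means a plain POST with a JSON body — no SDK. A hedged sketch of building such a request (the request is only constructed here, never sent; the key and model are placeholders):

```python
import json
import urllib.request

# Sketch of a raw-HTTP Chat Completions call: a plain POST to
# /chat/completions with a Bearer token, no SDK. Only the request is
# built here; nothing is sent over the network.

def build_request(base_url: str, api_key: str, model: str,
                  messages: list[dict]) -> urllib.request.Request:
    body = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_request("https://api.openai.com/v1", "sk-...", "gpt-4o-mini",
                    [{"role": "user", "content": "Hello"}])
```

Swapping engines then amounts to changing `base_url` and `model` — which is exactly what the one-line `engine:` change in the spec does for you.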

npm / Node.js CLI

Run OA specs from Node.js without Python.

npm install -g @prime-vector/open-agent-spec
oa-run --spec agent.yaml --task greet --input '{"name":"Alice"}'

Supports OpenAI and Anthropic, depends_on chains, and history threading.


Generate a Python Scaffold

If you want editable generated code instead of running the YAML directly:

oa init --spec agent.yaml --output ./agent

Generated structure:

agent/
├── agent.py
├── models.py
├── prompts/
├── requirements.txt
├── .env.example
└── README.md

Core Idea

Most agent projects end up hand-rolling the same pieces:

  • prompt templates
  • model configuration
  • task definitions
  • routing glue
  • runtime wrappers

OA moves those concerns into a declarative spec so they can be reviewed, versioned, and reused.

The intended model is:

  • spec defines the agent contract
  • oa run executes the spec directly
  • oa init generates a starting implementation when you need code
  • external systems can orchestrate multiple specs however they want

OA deliberately does not prescribe:

  • orchestration
  • evaluation
  • governance
  • long-running runtime architecture

Common Commands

oa init aac
    Create .agents/ with starter specs
oa validate aac
    Validate all specs in .agents/
oa validate --spec agent.yaml
    Validate one spec
oa test agent.test.yaml
    Run YAML eval cases (model + assertions on task output); --quiet for CI JSON
oa run --spec agent.yaml --task greet --input '{"name":"Alice"}' --quiet
    Run one task directly from YAML
oa init --spec agent.yaml --output ./agent
    Generate a Python scaffold
oa update --spec agent.yaml --output ./agent
    Regenerate an existing scaffold

Specification

The formal specification defines what a conforming OA runtime must do, independent of any specific implementation.

Resource                          Contents
spec/open-agent-spec-1.5.md       Formal specification — normative MUST/SHOULD/MAY requirements for OA 1.5.0
spec/schema/oas-schema-1.5.json   Canonical JSON Schema for validating spec documents
spec/conformance/README.md        Conformance test structure and contribution guide

An independent implementor can build a conforming runtime from spec/open-agent-spec-1.5.md alone.

More Detail

Resource               Contents
openagentspec.dev      Project website
docs/REFERENCE.md      Spec structure, engines, templates, .agents/ usage
examples/multi-agent   Multi-agent orchestration example — manager, workers, task board, dashboard
Repository             Source, issues, workflows

Notes

  • The CLI command is oa (not oas).
  • Python 3.10+ is required.
  • oa run requires the relevant provider API key for the engine in your spec.

About

  • Open Agent Spec (OA) was dreamed up by Andrew Whitehouse in late 2024, out of a desire to bring structure and standardisation to early agent systems.
  • In early 2025, Prime Vector was formed and took over the public-facing project.

License

MIT | see LICENSE.
