126 changes: 126 additions & 0 deletions docs/mcp/autonomous_agent_guide.md
@@ -0,0 +1,126 @@
# Autonomous Agent Guide for PlanExe

This guide is for AI agents (Claude, GPT, Codex, etc.) that consume PlanExe via MCP. It covers the end-to-end workflow for autonomous plan creation without human intervention.

## When to use PlanExe

Use PlanExe when an agent needs to plan a **multi-phase project** with real-world constraints, typically one that takes weeks to months to carry out. Examples:

- Launching a product or service
- Setting up infrastructure (factory, clinic, data center)
- Organizational change (governance, team restructuring)
- Policy implementation at scale

**Do not use PlanExe** for simple tasks (checklists, single-step operations, code generation). Use a normal LLM response for those.

## Autonomous workflow

### Step 1: Discover capabilities

```
Call: example_prompts
```

Read the returned examples to understand what a good prompt looks like. PlanExe prompts are flowing prose (~300-800 words), not structured markdown.

### Step 2: Draft a strong prompt

Before calling `plan_create`, draft a prompt that covers:

| Dimension | What to include |
|-----------|----------------|
| **Objective** | What the project achieves. Be specific. |
| **Scope** | What's included and excluded. Geographic/temporal bounds. |
| **Constraints** | Budget range, timeline, regulatory requirements, technical limits. |
| **Stakeholders** | Who's involved — team, beneficiaries, regulators, funders. |
| **Resources** | Available budget, team size, existing infrastructure. |
| **Success criteria** | Measurable outcomes. How do you know the project succeeded? |

Write as flowing prose. Weave specs, constraints, and targets naturally into sentences. Do not use markdown headers or bullet lists in the prompt itself.
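
Before submitting, an agent can lint its draft against the format rules above. The following is a minimal sketch, not part of PlanExe: the function name `check_prompt_format` and the exact thresholds are assumptions taken from the approximate 300-800 word range mentioned in Step 1.

```python
import re

# Assumed bounds, taken from the guide's approximate "~300-800 words" range.
MIN_WORDS = 300
MAX_WORDS = 800

# Lines that start with a markdown header, bullet, or numbered-list marker.
_MARKDOWN_LINE = re.compile(r"^\s*(#{1,6}\s|[-*+]\s|\d+[.)]\s)")


def check_prompt_format(prompt: str) -> list[str]:
    """Return a list of problems; an empty list means the draft passes."""
    problems = []
    word_count = len(prompt.split())
    if word_count < MIN_WORDS:
        problems.append(f"too short: {word_count} words")
    elif word_count > MAX_WORDS:
        problems.append(f"too long: {word_count} words")
    for number, line in enumerate(prompt.splitlines(), start=1):
        if _MARKDOWN_LINE.match(line):
            problems.append(f"line {number} looks like markdown structure")
    return problems
```

An empty result only says the draft has the right shape. It does not check whether the six dimensions in the table are actually covered.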

### Step 3: Select model profile (optional)

```
Call: model_profiles
```

Choose a profile based on the quality/speed tradeoff:
- **baseline** — Fast, good for most projects (~10-15 min)
- **premium** — Higher quality, slower (~15-25 min)
- **frontier** — Best quality, slowest

### Step 4: Create the plan

```
Call: plan_create(prompt="...", model_profile="baseline")
```

Returns a `plan_id` (UUID). Store this — you'll need it for all subsequent calls.
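
Since every later call depends on the `plan_id`, it can help to validate and normalize it before storing it. This is a small sketch using Python's standard `uuid` module; `normalize_plan_id` is a hypothetical helper, and how the id is extracted from the `plan_create` response is left out because the response shape is not described here.

```python
import uuid


def normalize_plan_id(raw: str) -> str:
    """Return the canonical lowercase UUID string.

    Raises ValueError if raw is not a well-formed UUID.
    """
    return str(uuid.UUID(raw.strip()))
```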

### Step 5: Monitor progress

```
Call: plan_status(plan_id="...")
```

Poll every 5 minutes. State transitions:
- `pending` → Plan is queued
- `processing` → Pipeline is running
- `completed` → Report is ready
- `failed` → Pipeline stopped with an error (recover with `plan_resume`, or restart with `plan_retry`)
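
The polling loop can be sketched as follows. This is an illustration under assumptions: `get_state` is a hypothetical callable that wraps the agent's `plan_status` call and returns the state string, and `max_polls` is an arbitrary safety limit that the guide does not specify.

```python
import time
from typing import Callable

# States after which polling should stop.
FINISHED_STATES = {"completed", "failed"}


def wait_for_plan(
    get_state: Callable[[], str],
    poll_seconds: float = 300.0,  # the guide suggests polling every 5 minutes
    max_polls: int = 24,          # assumed safety limit, not from the guide
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the plan is completed or failed, then return that state.

    Raises TimeoutError if neither state is seen within max_polls checks.
    """
    for attempt in range(max_polls):
        state = get_state()
        if state in FINISHED_STATES:
            return state
        if attempt < max_polls - 1:
            sleep(poll_seconds)  # no need to sleep after the final check
    raise TimeoutError(f"plan not finished after {max_polls} status checks")
```

Passing `sleep` as a parameter keeps the loop testable without real waiting. A returned `failed` state is the caller's cue to move on to Step 6.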

### Step 6: Handle failures

If a plan fails mid-pipeline:

```
Call: plan_resume(plan_id="...") # Continue from where it stopped
```

If you want a full restart:

```
Call: plan_retry(plan_id="...") # Discard all progress, start fresh
```

### Step 7: Retrieve output

```
Call: plan_file_info(plan_id="...", artifact="report")
```

The returned `download_url` points to the self-contained HTML report. The `zip` artifact contains all intermediate files (markdown, JSON, CSV).

## Error handling for agents

| Scenario | Action |
|----------|--------|
| `plan_status` returns `failed` | Call `plan_resume` first (preserves progress). If that fails, call `plan_retry`. |
| `plan_status` stays `pending` > 5 min | Worker may be down. Report to user. |
| `plan_status` stays `processing` with no file changes > 20 min | Plan likely stalled. Call `plan_stop`, then `plan_retry`. |
| Lost `plan_id` | Call `plan_list` to recover recent plans. |
| Invalid API key | Error code `INVALID_USER_API_KEY`. Prompt user to check their key. |
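
The status-related rows of the table above can be read as a small decision function. The sketch below is one possible encoding. The `Observation` fields and the returned action labels are hypothetical names chosen for illustration. The `resume_already_tried` flag is an assumption about how an agent would track that `plan_resume` did not help.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    state: str                              # value reported by plan_status
    minutes_in_state: float = 0.0           # time spent in the current state
    minutes_since_file_change: float = 0.0  # time since the last file change
    resume_already_tried: bool = False      # a plan_resume was already attempted


def next_action(obs: Observation) -> str:
    """Map an observation to the action suggested by the error-handling table."""
    if obs.state == "failed":
        # Resume first to preserve progress; retry only if resume did not help.
        return "plan_retry" if obs.resume_already_tried else "plan_resume"
    if obs.state == "pending" and obs.minutes_in_state > 5:
        return "report_to_user"  # the worker may be down
    if obs.state == "processing" and obs.minutes_since_file_change > 20:
        return "plan_stop_then_plan_retry"  # the plan has likely stalled
    if obs.state == "completed":
        return "plan_file_info"
    return "keep_polling"
```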

## Agent self-planning pattern

An advanced pattern: use PlanExe to plan the agent's own work.

1. Agent receives a complex task from the user
2. Agent calls PlanExe to generate a strategic plan
3. Agent reads the plan's WBS (work breakdown structure) from the `zip` artifact
4. Agent executes the plan step by step, tracking progress against the WBS

Key files in the zip for agent consumption:
- `018-2-wbs_level1.json` — High-level work packages
- `018-5-wbs_level2.json` — Detailed tasks within each package
- `023-2-wbs_level3.json` — Sub-tasks with effort estimates
- `004-2-pre_project_assessment.json` — Feasibility assessment
- `003-6-distill_assumptions_raw.json` — Key assumptions to validate
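
Once the zip has been downloaded, the key files listed above can be loaded with Python's standard library. This is a minimal sketch: `load_key_files` is a hypothetical helper, and it assumes only that the listed files are valid JSON. The internal schema of each file is not described in this guide, so the parsed values are returned as-is.

```python
import json
import zipfile

# File names taken from the list above.
KEY_FILES = {
    "018-2-wbs_level1.json",
    "018-5-wbs_level2.json",
    "023-2-wbs_level3.json",
    "004-2-pre_project_assessment.json",
    "003-6-distill_assumptions_raw.json",
}


def load_key_files(zip_source) -> dict[str, object]:
    """Return {file name: parsed JSON} for each key file found in the archive.

    zip_source may be a path or a binary file-like object.
    """
    loaded = {}
    with zipfile.ZipFile(zip_source) as archive:
        for member in archive.namelist():
            # Match on the base name in case files are nested in a folder.
            base_name = member.rsplit("/", 1)[-1]
            if base_name in KEY_FILES:
                with archive.open(member) as handle:
                    loaded[base_name] = json.load(handle)
    return loaded
```

Files missing from the archive are simply absent from the result, so the caller can decide whether a missing WBS level is an error.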

## Prompt writing tips for agents

1. **Be specific about geography** — "Copenhagen, Denmark" not "a city"
2. **Include budget ranges** — "EUR 500K-1M" not "reasonable budget"
3. **Set a timeline** — "18-month implementation" not "as soon as possible"
4. **Name the team** — "5-person core team with 3 contractors" not "a team"
5. **Define success** — "500 active users within 6 months" not "good adoption"
1 change: 1 addition & 0 deletions docs/mcp/planexe_mcp_interface.md
@@ -56,6 +56,7 @@ The interface is designed to support:
- Observability: logs, state transitions, and artifacts must be inspectable.
- Concurrency safety: prevent conflicting writes and illegal resume patterns.
- Extensibility: future versions can add plan graph browsing, caching backends, exports.
- Agent-native: the interface is designed for autonomous AI agents as the primary consumer. See the [Autonomous Agent Guide](autonomous_agent_guide.md) for the complete agent workflow.

---

@@ -109,3 +109,6 @@
{"id": "fe853807-5bfe-4e5b-8071-d6db3c360279", "prompt": "My daily commute from home to work takes 1 hour. My bike is broken and need an alternative plan. I live in Amsterdam, Netherlands.", "tags": ["bike", "traffic", "amsterdam", "netherlands", "personal"]}
{"id": "5c4b4fee-267a-409b-842f-4833d86aa215", "prompt": "Create a business selling emergency preparedness products (food, shelter, power, first-aid, radio) across Europe. Offer compact survival kits priced in EUR, enhancing resilience during crises. Start with one distribution center in Rotterdam, Netherlands. Budget 500k EUR. Timeframe: 18 months.", "tags": ["crisis", "emergency", "europe", "business"]}
{"id": "aa4a78f3-32d7-45ca-9f5a-f3e264eb31d4", "prompt": "Establish and launch a high-quality imported tea e-commerce business targeting the Czech Republic market nationwide. The plan must specifically address key obstacles: navigating the challenge of low operating margins common in the tea business, mitigating the cost or finding alternatives to the required dedicated licensed physical space (for handling/licensing purposes), securing reliable suppliers who offer competitive pricing suitable for a new, small customer (including exploring private label options), and developing a comprehensive and effective marketing strategy from scratch.", "tags": ["food", "business", "czech", "tea", "eshop"]}
{"id": "f4988b26-a846-45b6-9555-52ede44d0238", "prompt": "An AI coding agent needs to plan the migration of a legacy monolithic Java 8 application to a microservices architecture running on Kubernetes. The application serves approximately 50,000 daily active users for an e-commerce platform based in Berlin, Germany. The migration must maintain zero downtime for the production environment throughout the transition. The engineering team consists of 8 backend developers, 2 DevOps engineers, and 1 architect, with an annual infrastructure budget of EUR 400,000. The target timeline is 12 months from planning to full migration, with the first three microservices extracted and running independently within 4 months. Key constraints include GDPR compliance for all data handling, maintaining the existing PostgreSQL database during transition with eventual migration to service-specific databases, and ensuring all existing API contracts remain backwards compatible. Success criteria: 99.9% uptime maintained during migration, response latency under 200ms for 95th percentile requests, all 12 identified bounded contexts running as independent services, and CI/CD pipeline achieving sub-15-minute deployments for any individual service.", "tags": ["agent", "software", "migration", "kubernetes", "business"]}
{"id": "30499a0c-e3f8-4569-a169-470e32086ba0", "prompt": "An autonomous research agent is planning the setup of a new computational biology lab at a mid-sized European university in Utrecht, Netherlands. The lab will focus on protein structure prediction and drug discovery using machine learning, starting from an empty 200 square meter space with basic utilities. The founding team is a principal investigator with 10 years of experience, two postdoctoral researchers, and three PhD students to be recruited in the first year. The total setup budget is EUR 2.5 million over three years, with EUR 1.2 million allocated for computational infrastructure including GPU clusters, EUR 800,000 for personnel in the first year, and EUR 500,000 for equipment and operating costs. The lab must be operational within 6 months of funding approval, with the GPU cluster provisioned and the first research project producing preliminary results within 9 months. Regulatory requirements include university ethics board approval for any human-derived data usage, compliance with Dutch research data management guidelines, and adherence to FAIR data principles. Success is measured by submitting two high-impact journal papers within 18 months, establishing one industry partnership for drug discovery collaboration, and training all team members in responsible AI practices for biological applications.", "tags": ["agent", "research", "biology", "university", "business"]}
{"id": "ad502ba5-c9fd-41ac-a80f-74207527b733", "prompt": "A planning agent is tasked with designing the rollout of a city-wide electric vehicle charging network across Prague, Czech Republic. The project is a public-private partnership between the Prague municipal government and a consortium of three energy companies. The network must cover all 22 administrative districts within 24 months, starting with the 6 highest-density districts in the first 8 months. The total investment is EUR 15 million, funded 40% by EU structural funds, 35% by the energy consortium, and 25% by the city budget. The plan must account for approximately 500 charging stations: 350 Level 2 stations for residential areas and 150 DC fast chargers along major corridors and near commercial centers. Key constraints include navigating Czech building and electrical permits (typically 3-6 months per site), coordinating with the Prague Power Grid operator for transformer capacity, and ensuring all stations support the Combined Charging System standard. The workforce includes a 12-person project management office, relationships with 4 certified electrical installation contractors, and a dedicated software team of 5 for the mobile app and backend payment system. Success criteria: 95% of Prague residents within 500 meters of a charging point, average station uptime of 98%, 10,000 registered users within 6 months of full deployment, and positive return on investment projected within 7 years.", "tags": ["agent", "infrastructure", "ev", "prague", "business"]}