From 07bedbbe29003a9630cdfbc8b8e12b7ae6afb3cb Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Mon, 30 Mar 2026 12:33:29 +0000
Subject: [PATCH 1/3] Initial plan


From f09c2d84dc49e3d3dcd1069a6c53722fc906ad54 Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Mon, 30 Mar 2026 12:45:51 +0000
Subject: [PATCH 2/3] =?UTF-8?q?feat:=20task=20decomposition=20research=20?=
 =?UTF-8?q?=E2=80=94=20workspace=20detection,=20planner=20context,=20workf?=
 =?UTF-8?q?low=20templates?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Agent-Logs-Url: https://github.com/huberp/agentloop/sessions/3a8fc459-0fd4-4ab9-8e11-a59e4d76960b

Co-authored-by: huberp <4027454+huberp@users.noreply.github.com>
---
 issues/2.md                                   | 208 ++++++++++++++++++
 src/__tests__/builtin-agent-profiles.test.ts  |   4 +-
 src/__tests__/builtin-skills.test.ts          |   4 +-
 .../fixtures/workspace-cargo/Cargo.toml       |   4 +
 .../workspace-cmake-presets/CMakeLists.txt    |   0
 .../workspace-cmake-presets/CMakePresets.json |   1 +
 .../fixtures/workspace-cmake/CMakeLists.txt   |   0
 .../workspace-gradle-kotlin/build.gradle.kts  |   1 +
 .../fixtures/workspace-gradle/build.gradle    |   1 +
 .../fixtures/workspace-maven/pom.xml          |   1 +
 src/__tests__/workspace.test.ts               | 160 ++++++++++++++
 src/agents/builtin/build-verify.agent.json    |  12 +
 src/agents/builtin/test-runner.agent.json     |  12 +
 src/skills/builtin/build-verify.skill.md      |  43 ++++
 src/skills/builtin/cmake-workflow.skill.md    |  82 +++++++
 src/subagents/planner.ts                      |  14 +-
 src/workspace.ts                              | 124 ++++++++++-
 17 files changed, 664 insertions(+), 7 deletions(-)
 create mode 100644 issues/2.md
 create mode 100644 src/__tests__/fixtures/workspace-cargo/Cargo.toml
 create mode 100644 src/__tests__/fixtures/workspace-cmake-presets/CMakeLists.txt
 create mode 100644 src/__tests__/fixtures/workspace-cmake-presets/CMakePresets.json
 create mode 100644 src/__tests__/fixtures/workspace-cmake/CMakeLists.txt
 create mode 100644 src/__tests__/fixtures/workspace-gradle-kotlin/build.gradle.kts
 create mode 100644 src/__tests__/fixtures/workspace-gradle/build.gradle
 create mode 100644 src/__tests__/fixtures/workspace-maven/pom.xml
 create mode 100644 src/agents/builtin/build-verify.agent.json
 create mode 100644 src/agents/builtin/test-runner.agent.json
 create mode 100644 src/skills/builtin/build-verify.skill.md
 create mode 100644 src/skills/builtin/cmake-workflow.skill.md

diff --git a/issues/2.md b/issues/2.md
new file mode 100644
index 00000000..402e5406
--- /dev/null
+++ b/issues/2.md
@@ -0,0 +1,208 @@
+## Research: Task Planner and Task Decomposition for Coding Agents
+
+### 1. Problem Statement
+
+Modern AI coding agents need to tackle tasks that span multiple steps, require diverse tools, and benefit from specialised domain knowledge at each step. The key challenge is **task decomposition**: how should a high-level goal (e.g. "add input validation to all POST handlers") be broken into concrete, executable steps that an agent can carry out reliably?
+
+A related challenge is **workflow templates**: many coding workflows (build, test, lint, release) have a fixed *shape* but vary in their concrete commands depending on the workspace. Can these patterns be captured once and reused across projects?
+
+---
+
+### 2. Baseline: The `plan-and-run` Loop in agentloop
+
+agentloop already ships a layered planning architecture:
+
+| Component | Location | Role |
+|---|---|---|
+| `generatePlan` | `src/subagents/planner.ts` | LLM-powered decomposition of a goal into `PlanStep[]` |
+| `refinePlan` | `src/subagents/planner.ts` | Corrects a plan that references unknown tools |
+| `validatePlan` | `src/subagents/planner.ts` | Checks that all tool names in the plan are registered |
+| `executePlan` | `src/orchestrator.ts` | Runs steps in sequence; supports `retry/skip/abort` on failure and checkpoint/resume |
+| `plan` tool | `src/tools/plan.ts` | Exposes plan generation to the agent loop as a callable tool |
+| `plan-and-run` tool | `src/tools/plan-and-run.ts` | Combines generation + execution in one tool call |
+
+The planner runs as a **tool-free subagent**: it receives workspace context and a list of available tools, then returns a JSON plan. The orchestrator dispatches each step to a `runSubagent` call (or `SubagentManager` for complex steps) with an iteration budget derived from `estimatedComplexity`.
+
+**Key insight already present in agentloop**: each `PlanStep` carries an optional `agentProfile` field. The orchestrator activates the named profile (model, temperature, tool subset) for that step. This enables per-step specialisation without a separate orchestration framework.
+
+---
+
+### 3. What Other Frameworks Do
+
+#### 3.1 LangGraph (LangChain)
+- Models agent behaviour as a **directed graph** of nodes (LLM calls, tool calls) with conditional edges.
+- Supports cycles (retry loops), parallel fan-out/fan-in, and human-in-the-loop interrupts.
+- Templates are *graph patterns* stored as reusable subgraphs.
+- Complexity: full graph authoring required for every new workflow shape.
+
+#### 3.2 AutoGen (Microsoft)
+- Multi-agent conversation: a **Planner** agent, a **Coder** agent, and an **Executor** agent exchange messages until the task is done.
+- Task decomposition happens in natural language — the Planner emits step descriptions that the Coder implements.
+- Workflow templates are **system prompts** for each role, often provided in a configuration YAML.
+- Strength: easy to add domain-expert agents. Weakness: conversation history grows rapidly; quality depends on message-passing discipline.
+
+#### 3.3 CrewAI
+- Defines **Crew** (team of agents), **Agents** (role + backstory + tools), and **Tasks** (description + expected output + dependencies).
+- Supports sequential and hierarchical execution; tasks can pass their output as context to dependent tasks.
+- Workflow templates are *Crew + Task YAML configurations* that can be parameterised and re-instantiated for different inputs.
+- Strong alignment with the "workflow template" concept in this issue: a `BuildVerifyCrew` YAML is a reusable template instantiated per workspace.
+
+#### 3.4 OpenAI Assistants + Structured Outputs
+- Persistent thread context allows multi-turn tasks without re-injecting history.
+- `run_step` objects provide a built-in audit trail of each tool call and its output.
+- Templates are **Assistant instructions** (system prompt) combined with few-shot examples in the thread.
+- Limitation: tied to the OpenAI API; no local model support.
+
+#### 3.5 Copilot Coding Agent (GitHub Copilot)
+The example in this issue shows a subtask with:
+```json
+{
+  "name": "build-verify",
+  "agent_type": "task",
+  "description": "Build the plugin to verify changes",
+  "prompt": "...\nSteps:\n1. Run: sudo bash scripts/install-linux-deps.sh\n2. Run: git submodule update --init --recursive\n3. Run: cmake --preset linux-release\n4. Run: cmake --build --preset linux-build -j2\n\nReport whether the build succeeded or failed..."
+}
+```
+
+Key observations:
+- The template name (`build-verify`) is **stable and reusable** across tasks.
+- The concrete steps (cmake preset names, script paths) are **workspace-specific** and were derived from workspace knowledge.
+- Instantiation happens once, at workspace-setup time; the steps are not re-derived on every task.
+- This is equivalent to agentloop's `agentProfile` + `skill` combination, but the steps are baked into the prompt string rather than generated by a planner.
+
+---
+
+### 4. Template Taxonomy for Coding Agents
+
+Based on the above analysis, coding workflow templates fall into three categories:
+
+#### Category A — Build Lifecycle Templates
+Fixed structure, workspace-specific commands:
+
+| Template | Shape | Workspace-specific parts |
+|---|---|---|
+| `build-verify` | configure → compile → report | preset name, script paths, parallelism flag |
+| `clean-build` | clean → configure → compile | build directory, preset/profile |
+| `release-package` | build → test → package → sign | packaging format, signing key |
+
+#### Category B — Quality Gate Templates
+Fixed checklist, tool-specific commands:
+
+| Template | Shape | Workspace-specific parts |
+|---|---|---|
+| `test-and-fix` | run tests → parse failures → locate code → fix → re-run | test runner command, test output format |
+| `lint-and-format` | run linter → parse output → apply fixes → re-verify | linter binary, fix flags |
+| `security-scan` | run scanner → parse findings → generate report | scanner CLI, severity threshold |
+
+#### Category C — Development Workflow Templates
+Higher-level patterns:
+
+| Template | Shape | Notes |
+|---|---|---|
+| `feature-branch` | branch → implement → test → PR | Uses git tools + planner |
+| `dependency-update` | audit → update → test → commit | Integrates vulnerability check |
+| `hotfix` | branch from tag → apply fix → test → backport | Requires git-log, cherry-pick |
+
+---
+
+### 5. How agentloop Can Implement Workflow Templates
+
+agentloop's existing primitives map cleanly onto the template concept:
+
+#### 5.1 Templates as Agent Profiles + Skills (recommended)
+
+A workflow template = **agent profile** (what tools, model, iteration budget) + **skill** (domain knowledge, step sequence, error heuristics).
+
+Example — `build-verify` profile (`src/agents/builtin/build-verify.agent.json`):
+```json
+{
+  "name": "build-verify",
+  "description": "Build verification agent — compiles the workspace and reports success or failure",
+  "temperature": 0.1,
+  "skills": ["build-verify"],
+  "tools": ["shell", "file-read", "file-list"],
+  "maxIterations": 10
+}
+```
+
+The paired `build-verify` skill (`src/skills/builtin/build-verify.skill.md`) injects:
+- Step sequence (identify build system → install deps → configure → compile → report)
+- Error triage heuristics (linker errors, missing headers, stale cache)
+- Parallelism flags per build tool
+
+The planner can then annotate a step with `"agentProfile": "build-verify"` and the orchestrator will activate the matching profile for that step — automatically binding the right skill, tool subset, and temperature.
+
+#### 5.2 Templates as Planner Context (workspace-aware instantiation)
+
+The planner prompt carries the `workspaceInfo` fields, including the detected lifecycle commands (`buildCommand`, `testCommand`, `lintCommand`), so the planner can produce **concrete, workspace-specific steps** in one shot:
+
+```
+Workspace: language=cmake, packageManager=cmake,
+  build="cmake --preset linux-release && cmake --build --preset linux-build",
+  test="ctest --preset linux-test"
+```
+
+The planner output then directly embeds the correct commands rather than using a generic placeholder.
+
+#### 5.3 Template Instantiation: Agent vs Static
+
+| Approach | When to use | Trade-offs |
+|---|---|---|
+| **Planner-time instantiation** (current) | Novel tasks, unknown workspaces | Flexible, adapts to workspace; requires LLM call |
+| **Profile+skill pre-configuration** (new) | Recurring workflows (CI, build-verify) | Fast, deterministic, version-controlled; less adaptive |
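+| **Hybrid** (recommended) | Novel tasks that contain recurring sub-workflows: plan the overall task, but use pre-defined profiles per step | Planner flexibility at the task level, deterministic procedures at the step level; still requires an LLM call for planning |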
+
+The hybrid approach is already supported: the planner annotates `agentProfile` on steps, and the orchestrator activates the profile. Adding skills that encode the step sequence means the profile-activated agent "knows" the right procedure without the planner having to enumerate every sub-step.
+
+---
+
+### 6. Concrete Example: CMake Build-Verify Flow
+
+**Goal**: "Build the plugin to verify changes compile correctly"
+
+**Planner output** (with workspace context `build="cmake --preset linux-release && cmake --build --preset linux-build -j2"`):
+
+```json
+{
+  "steps": [
+    {
+      "description": "Install Linux build dependencies",
+      "toolsNeeded": ["shell"],
+      "estimatedComplexity": "low",
+      "agentProfile": "devops"
+    },
+    {
+      "description": "Update git submodules",
+      "toolsNeeded": ["shell"],
+      "estimatedComplexity": "low",
+      "agentProfile": "devops"
+    },
+    {
+      "description": "Build the project using cmake --preset linux-release && cmake --build --preset linux-build -j2 and report compiler output",
+      "toolsNeeded": ["shell"],
+      "estimatedComplexity": "medium",
+      "agentProfile": "build-verify"
+    }
+  ]
+}
+```
+
+The `build-verify` agent profile activates the `build-verify` skill, which provides the step sequence and error triage guidance. The concrete commands come from `workspaceInfo.buildCommand`, injected into the planner prompt.
+
+---
+
+### 7. Recommendations and Gaps Addressed
+
+| Gap | Solution implemented |
+|---|---|
+| Planner didn't know lifecycle commands | ✅ `buildPlannerTask` now includes `build`, `test`, `lint` commands from `WorkspaceInfo` |
+| Only Node/Python/Go workspace detection | ✅ Added CMake, Rust/Cargo, Gradle, Maven analyzers in `workspace.ts` |
+| No build-workflow agent profile | ✅ `build-verify.agent.json` and `test-runner.agent.json` added |
+| No build-workflow skill | ✅ `build-verify.skill.md` and `cmake-workflow.skill.md` added |
+
+### 8. Remaining Open Questions
+
+1. **Template registry**: Should templates be discoverable at runtime (e.g. `list-templates` tool) so the planner can reference them by name? The current profile registry partially serves this role.
+2. **Workspace-once vs task-every-time**: For expensive workspace analysis (submodule init, dependency install), should a "workspace setup" template run once at session start and cache results? This aligns with CrewAI's `before_kickoff` hook concept.
+3. **Multi-repo / monorepo**: `analyzeWorkspace` currently detects one build system per root. Monorepos with mixed build systems (e.g. a CMake C++ library + a Node.js frontend) need a recursive scan.
+4. **Template versioning**: When the workspace changes (new preset, renamed script), how are baked templates kept in sync? A solution is to keep commands in `WorkspaceInfo` (auto-detected) rather than hard-coding them in profile prompts.
diff --git a/src/__tests__/builtin-agent-profiles.test.ts b/src/__tests__/builtin-agent-profiles.test.ts
index ed5fa0d7..1078a492 100644
--- a/src/__tests__/builtin-agent-profiles.test.ts
+++ b/src/__tests__/builtin-agent-profiles.test.ts
@@ -24,8 +24,8 @@ beforeAll(async () => {
 });
 
 describe("builtin agent profiles", () => {
-  it("loads exactly 5 builtin profiles", () => {
-    expect(registry.list()).toHaveLength(5);
+  it("loads exactly 7 builtin profiles", () => {
+    expect(registry.list()).toHaveLength(7);
   });
 
   it("coder profile has name === 'coder' and model === 'gpt-4o'", () => {
diff --git a/src/__tests__/builtin-skills.test.ts b/src/__tests__/builtin-skills.test.ts
index 2d6e48d7..b2107d10 100644
--- a/src/__tests__/builtin-skills.test.ts
+++ b/src/__tests__/builtin-skills.test.ts
@@ -20,9 +20,11 @@ describe("built-in skill library", () => {
     "test-writer",
     "git-workflow",
     "security-auditor",
+    "build-verify",
+    "cmake-workflow",
   ];
 
-  it("loads all 5 built-in skills", () => {
+  it("loads all 7 built-in skills", () => {
     const names = registry.list().map((s) => s.name);
     for (const name of BUILTIN_NAMES) {
       expect(names).toContain(name);
diff --git a/src/__tests__/fixtures/workspace-cargo/Cargo.toml b/src/__tests__/fixtures/workspace-cargo/Cargo.toml
new file mode 100644
index 00000000..965b5937
--- /dev/null
+++ b/src/__tests__/fixtures/workspace-cargo/Cargo.toml
@@ -0,0 +1,4 @@
+[package]
+name = "my-app"
+version = "0.1.0"
+edition = "2021"
diff --git a/src/__tests__/fixtures/workspace-cmake-presets/CMakeLists.txt b/src/__tests__/fixtures/workspace-cmake-presets/CMakeLists.txt
new file mode 100644
index 00000000..e69de29b
diff --git a/src/__tests__/fixtures/workspace-cmake-presets/CMakePresets.json b/src/__tests__/fixtures/workspace-cmake-presets/CMakePresets.json
new file mode 100644
index 00000000..62e6c2f0
--- /dev/null
+++ b/src/__tests__/fixtures/workspace-cmake-presets/CMakePresets.json
@@ -0,0 +1 @@
+{"version":3,"cmakeMinimumRequired":{"major":3,"minor":21},"configurePresets":[{"name":"default","binaryDir":"build"}],"buildPresets":[{"name":"default","configurePreset":"default"}],"testPresets":[{"name":"default","configurePreset":"default"}]}
diff --git a/src/__tests__/fixtures/workspace-cmake/CMakeLists.txt b/src/__tests__/fixtures/workspace-cmake/CMakeLists.txt
new file mode 100644
index 00000000..e69de29b
diff --git a/src/__tests__/fixtures/workspace-gradle-kotlin/build.gradle.kts b/src/__tests__/fixtures/workspace-gradle-kotlin/build.gradle.kts
new file mode 100644
index 00000000..5b1dae2a
--- /dev/null
+++ b/src/__tests__/fixtures/workspace-gradle-kotlin/build.gradle.kts
@@ -0,0 +1 @@
+plugins { kotlin("jvm") version "1.9.0" }
diff --git a/src/__tests__/fixtures/workspace-gradle/build.gradle b/src/__tests__/fixtures/workspace-gradle/build.gradle
new file mode 100644
index 00000000..b95276ac
--- /dev/null
+++ b/src/__tests__/fixtures/workspace-gradle/build.gradle
@@ -0,0 +1 @@
+plugins { id("java") }
diff --git a/src/__tests__/fixtures/workspace-maven/pom.xml b/src/__tests__/fixtures/workspace-maven/pom.xml
new file mode 100644
index 00000000..12ac61cc
--- /dev/null
+++ b/src/__tests__/fixtures/workspace-maven/pom.xml
@@ -0,0 +1 @@
+<project><modelVersion>4.0.0</modelVersion><groupId>com.example</groupId><artifactId>my-app</artifactId><version>1.0</version></project>
diff --git a/src/__tests__/workspace.test.ts b/src/__tests__/workspace.test.ts
index 718b6169..df50e200 100644
--- a/src/__tests__/workspace.test.ts
+++ b/src/__tests__/workspace.test.ts
@@ -118,3 +118,163 @@ describe("analyzeWorkspace — git detection", () => {
     expect(info.gitInitialized).toBe(true);
   });
 });
+
+describe("analyzeWorkspace — Rust/Cargo project", () => {
+  const root = path.join(fixturesDir, "workspace-cargo");
+
+  let info: WorkspaceInfo;
+  beforeAll(async () => {
+    info = await analyzeWorkspace(root);
+  });
+
+  it("detects language as 'rust'", () => {
+    expect(info.language).toBe("rust");
+  });
+
+  it("uses 'cargo' as the package manager", () => {
+    expect(info.packageManager).toBe("cargo");
+  });
+
+  it("defaults the build command to 'cargo build'", () => {
+    expect(info.buildCommand).toBe("cargo build");
+  });
+
+  it("defaults the test command to 'cargo test'", () => {
+    expect(info.testCommand).toBe("cargo test");
+  });
+
+  it("defaults the lint command to 'cargo clippy'", () => {
+    expect(info.lintCommand).toBe("cargo clippy");
+  });
+
+  it("reports hasTests as true when a tests/ directory exists", () => {
+    expect(info.hasTests).toBe(true);
+  });
+});
+
+describe("analyzeWorkspace — CMake project (no presets)", () => {
+  const root = path.join(fixturesDir, "workspace-cmake");
+
+  let info: WorkspaceInfo;
+  beforeAll(async () => {
+    info = await analyzeWorkspace(root);
+  });
+
+  it("detects language as 'cmake'", () => {
+    expect(info.language).toBe("cmake");
+  });
+
+  it("uses 'cmake' as the package manager", () => {
+    expect(info.packageManager).toBe("cmake");
+  });
+
+  it("uses classic out-of-source build command when no presets file is present", () => {
+    expect(info.buildCommand).toBe("cmake -S . -B build && cmake --build build");
+  });
+
+  it("defaults the test command to ctest", () => {
+    expect(info.testCommand).toBe("ctest --output-on-failure");
+  });
+
+  it("reports hasTests as true when a tests/ directory exists", () => {
+    expect(info.hasTests).toBe(true);
+  });
+});
+
+describe("analyzeWorkspace — CMake project (with CMakePresets.json)", () => {
+  const root = path.join(fixturesDir, "workspace-cmake-presets");
+
+  let info: WorkspaceInfo;
+  beforeAll(async () => {
+    info = await analyzeWorkspace(root);
+  });
+
+  it("detects language as 'cmake'", () => {
+    expect(info.language).toBe("cmake");
+  });
+
+  it("uses preset-based build command when CMakePresets.json is present", () => {
+    expect(info.buildCommand).toBe(
+      "cmake --preset default && cmake --build --preset default"
+    );
+  });
+
+  it("uses preset-based test command when CMakePresets.json is present", () => {
+    expect(info.testCommand).toBe("ctest --preset default");
+  });
+});
+
+describe("analyzeWorkspace — Gradle (Java) project", () => {
+  const root = path.join(fixturesDir, "workspace-gradle");
+
+  let info: WorkspaceInfo;
+  beforeAll(async () => {
+    info = await analyzeWorkspace(root);
+  });
+
+  it("detects language as 'java'", () => {
+    expect(info.language).toBe("java");
+  });
+
+  it("uses 'gradle' as the package manager", () => {
+    expect(info.packageManager).toBe("gradle");
+  });
+
+  it("uses 'gradle build' as the build command (no gradlew wrapper)", () => {
+    expect(info.buildCommand).toBe("gradle build");
+  });
+
+  it("uses 'gradle test' as the test command", () => {
+    expect(info.testCommand).toBe("gradle test");
+  });
+
+  it("reports hasTests as true when src/test exists", () => {
+    expect(info.hasTests).toBe(true);
+  });
+});
+
+describe("analyzeWorkspace — Gradle (Kotlin DSL) project", () => {
+  const root = path.join(fixturesDir, "workspace-gradle-kotlin");
+
+  let info: WorkspaceInfo;
+  beforeAll(async () => {
+    info = await analyzeWorkspace(root);
+  });
+
+  it("detects language as 'kotlin'", () => {
+    expect(info.language).toBe("kotlin");
+  });
+
+  it("uses 'gradle' as the package manager", () => {
+    expect(info.packageManager).toBe("gradle");
+  });
+});
+
+describe("analyzeWorkspace — Maven project", () => {
+  const root = path.join(fixturesDir, "workspace-maven");
+
+  let info: WorkspaceInfo;
+  beforeAll(async () => {
+    info = await analyzeWorkspace(root);
+  });
+
+  it("detects language as 'java'", () => {
+    expect(info.language).toBe("java");
+  });
+
+  it("uses 'maven' as the package manager", () => {
+    expect(info.packageManager).toBe("maven");
+  });
+
+  it("uses 'mvn package -DskipTests' as the build command (no wrapper)", () => {
+    expect(info.buildCommand).toBe("mvn package -DskipTests");
+  });
+
+  it("uses 'mvn test' as the test command", () => {
+    expect(info.testCommand).toBe("mvn test");
+  });
+
+  it("reports hasTests as true when src/test exists", () => {
+    expect(info.hasTests).toBe(true);
+  });
+});
diff --git a/src/agents/builtin/build-verify.agent.json b/src/agents/builtin/build-verify.agent.json
new file mode 100644
index 00000000..6fec92e5
--- /dev/null
+++ b/src/agents/builtin/build-verify.agent.json
@@ -0,0 +1,12 @@
+{
+  "name": "build-verify",
+  "description": "Build verification agent — compiles the workspace and reports success or failure with compiler diagnostics",
+  "version": "1.0.0",
+  "temperature": 0.1,
+  "skills": ["build-verify"],
+  "tools": ["shell", "file-read", "file-list"],
+  "maxIterations": 10,
+  "constraints": {
+    "requireConfirmation": []
+  }
+}
diff --git a/src/agents/builtin/test-runner.agent.json b/src/agents/builtin/test-runner.agent.json
new file mode 100644
index 00000000..64a992ea
--- /dev/null
+++ b/src/agents/builtin/test-runner.agent.json
@@ -0,0 +1,12 @@
+{
+  "name": "test-runner",
+  "description": "Test execution agent — runs the project test suite, reports failures, and suggests targeted fixes",
+  "version": "1.0.0",
+  "temperature": 0.2,
+  "skills": ["test-writer"],
+  "tools": ["shell", "file-read", "file-write", "file-edit", "file-list", "code-search"],
+  "maxIterations": 20,
+  "constraints": {
+    "requireConfirmation": []
+  }
+}
diff --git a/src/skills/builtin/build-verify.skill.md b/src/skills/builtin/build-verify.skill.md
new file mode 100644
index 00000000..699a76f5
--- /dev/null
+++ b/src/skills/builtin/build-verify.skill.md
@@ -0,0 +1,43 @@
+---
+name: build-verify
+description: Workflow guidance for verifying that a project compiles and links correctly
+version: 1.0.0
+slot: section
+---
+
+## Build Verification Workflow
+
+The goal of this workflow is to confirm the project compiles cleanly and to surface any errors with actionable context.
+
+### Step sequence
+
+1. **Identify the build system** — inspect the workspace root for `CMakeLists.txt`, `Cargo.toml`, `package.json`, `build.gradle`, or `pom.xml` to determine which build tool to invoke.
+2. **Install / update dependencies** — run the dependency installation step *before* building:
+   - CMake: `git submodule update --init --recursive` (if submodules present)
+   - Node: `npm ci` or `yarn install --frozen-lockfile`
+   - Rust: `cargo fetch`
+   - Gradle: `./gradlew dependencies` (optional)
+3. **Configure the build** (if required):
+   - CMake: `cmake -S . -B build [-DCMAKE_BUILD_TYPE=Release]` or `cmake --preset <preset>`
+   - Gradle: no separate configure step
+4. **Compile**:
+   - CMake: `cmake --build build [--parallel $(nproc)]` or `cmake --build --preset <preset>`
+   - Node: `npm run build`
+   - Rust: `cargo build [--release]`
+   - Gradle: `./gradlew assemble` (compile only, no tests)
+   - Maven: `mvn package -DskipTests`
+5. **Report** — emit a structured summary: overall status (success/failure), number of errors and warnings, and the first 20 lines of compiler output for failures.
+
+### Error triage heuristics
+
+- **Linker errors** (`undefined reference`, `unresolved symbol`): check `CMakeLists.txt` for missing `target_link_libraries` entries; for Gradle, check the `dependencies` block.
+- **Missing headers / imports**: confirm that all required packages are declared in the manifest and that dependency installation succeeded in step 2.
+- **Type / compilation errors** in generated code: regenerate protobuf, Thrift, or OpenAPI sources before building.
+- **Out-of-date build cache**: perform a clean build (`rm -rf build && cmake …` or `cargo clean && cargo build`) to rule out stale artifacts.
+
+### Parallel build flag
+
+When invoking multi-core builds, pass a parallelism flag to keep wall-clock time low:
+- CMake/Ninja: `--parallel $(nproc)` or `-j$(nproc)`
+- Maven: `-T 1C` (one thread per CPU core)
+- Gradle: `--parallel`
diff --git a/src/skills/builtin/cmake-workflow.skill.md b/src/skills/builtin/cmake-workflow.skill.md
new file mode 100644
index 00000000..856f0366
--- /dev/null
+++ b/src/skills/builtin/cmake-workflow.skill.md
@@ -0,0 +1,82 @@
+---
+name: cmake-workflow
+description: CMake-specific build, test, and packaging patterns including preset-based workflows
+version: 1.0.0
+slot: section
+---
+
+## CMake Workflow Guidelines
+
+### Project layout conventions
+
+- Source lives in `src/`; headers in `include/`; tests in `tests/` or `test/`.
+- Out-of-source builds go in `build/` (excluded from version control via `.gitignore`).
+- `CMakeLists.txt` at the repository root is the entry point; each subdirectory may have its own `CMakeLists.txt`.
+
+### Preset-based workflow (preferred when `CMakePresets.json` exists)
+
+```bash
+# Configure
+cmake --preset <preset-name>          # e.g. linux-release, debug, ci
+
+# Build
+cmake --build --preset <build-preset> [--parallel $(nproc)]
+
+# Test
+ctest --preset <test-preset> [--output-on-failure]
+```
+
+List available presets:
+```bash
+cmake --list-presets          # configure presets
+cmake --build --list-presets  # build presets
+ctest --list-presets          # test presets
+```
+
+### Classic out-of-source workflow (no presets)
+
+```bash
+# Configure (Release build, Ninja generator recommended)
+cmake -S . -B build -G Ninja -DCMAKE_BUILD_TYPE=Release
+
+# Build (parallel)
+cmake --build build --parallel $(nproc)
+
+# Test
+cd build && ctest --output-on-failure
+```
+
+### Dependency management
+
+- **Submodules**: always run `git submodule update --init --recursive` before configuring.
+- **find_package**: ensure system libraries are installed (e.g. `sudo apt install libssl-dev`).
+- **FetchContent / CPM.cmake**: dependencies are downloaded during configure; verify internet access or a local cache is available.
+- **vcpkg / Conan**: run `vcpkg install` or `conan install .` before `cmake -S . -B build`.
+
+### Dependency-installation script pattern
+
+When a project ships a dependency-installation script (e.g. `scripts/install-linux-deps.sh`), run it *before* the CMake configure step:
+
+```bash
+sudo bash scripts/install-linux-deps.sh
+git submodule update --init --recursive
+cmake --preset <preset>
+cmake --build --preset <build-preset> --parallel $(nproc)
+```
+
+### Common CMake variables
+
+| Variable | Purpose |
+|---|---|
+| `CMAKE_BUILD_TYPE` | `Debug`, `Release`, `RelWithDebInfo`, `MinSizeRel` |
+| `CMAKE_INSTALL_PREFIX` | Install destination (default `/usr/local`) |
+| `CMAKE_TOOLCHAIN_FILE` | Cross-compile or vcpkg toolchain |
+| `BUILD_SHARED_LIBS` | `ON` to build shared libraries by default |
+| `CMAKE_EXPORT_COMPILE_COMMANDS` | `ON` to generate `compile_commands.json` for tooling |
+
+### Diagnosing build failures
+
+1. Check the **configure step** output first — missing dependencies abort here.
+2. Look for the **first** error in compiler output; subsequent errors are often cascading.
+3. Enable verbose output to see exact compiler flags: `cmake --build build --verbose` or `VERBOSE=1 make`.
+4. Use the `--fresh` flag (CMake 3.24 or newer) to force a clean reconfigure: `cmake --fresh --preset <preset>`.
diff --git a/src/subagents/planner.ts b/src/subagents/planner.ts
index 82c77be1..624c6c56 100644
--- a/src/subagents/planner.ts
+++ b/src/subagents/planner.ts
@@ -83,8 +83,18 @@ function buildPlannerTask(
   let result =
     `Task: ${task}\n` +
     `Workspace: language=${workspaceInfo.language}, framework=${workspaceInfo.framework}, ` +
-    `packageManager=${workspaceInfo.packageManager}, gitInitialized=${workspaceInfo.gitInitialized}\n` +
-    `Available tools: ${toolList}`;
+    `packageManager=${workspaceInfo.packageManager}, gitInitialized=${workspaceInfo.gitInitialized}`;
+
+  // Include lifecycle commands so the planner can generate concrete, workspace-specific steps
+  const lifecycleLines: string[] = [];
+  if (workspaceInfo.buildCommand) lifecycleLines.push(`build="${workspaceInfo.buildCommand}"`);
+  if (workspaceInfo.testCommand) lifecycleLines.push(`test="${workspaceInfo.testCommand}"`);
+  if (workspaceInfo.lintCommand) lifecycleLines.push(`lint="${workspaceInfo.lintCommand}"`);
+  if (lifecycleLines.length > 0) {
+    result += `, ${lifecycleLines.join(", ")}`;
+  }
+
+  result += `\nAvailable tools: ${toolList}`;
   if (availableProfiles && availableProfiles.length > 0) {
     const profileList = availableProfiles.map((p) => `${p.name}: ${p.description}`).join("; ");
     result += `\nAvailable agent profiles: ${profileList}`;
diff --git a/src/workspace.ts b/src/workspace.ts
index 84b19524..9ce29008 100644
--- a/src/workspace.ts
+++ b/src/workspace.ts
@@ -3,11 +3,11 @@ import * as path from "path";
 
 /** Structured information about the project workspace. */
 export interface WorkspaceInfo {
-  /** Primary language detected: 'node', 'python', 'go', or 'unknown'. */
+  /** Primary language detected: 'node', 'python', 'go', 'rust', 'cmake', 'java', 'kotlin', or 'unknown'. */
   language: string;
   /** Framework detected from dependencies (e.g. 'react', 'django'), or 'none'. */
   framework: string;
-  /** Package manager inferred from lock files or language (e.g. 'npm', 'pip'). */
+  /** Package manager inferred from lock files or language (e.g. 'npm', 'pip', 'cargo', 'gradle'). */
   packageManager: string;
   /** True if a test directory or test script was found. */
   hasTests: boolean;
@@ -168,6 +168,115 @@ async function analyzeGo(rootPath: string): Promise<Partial<WorkspaceInfo>> {
   return info;
 }
 
+/**
+ * Analyse a Rust/Cargo workspace.
+ * Sets Cargo defaults, applies Makefile target overrides, and checks for a `tests/` or `src/tests/` directory.
+ */
+async function analyzeCargo(rootPath: string): Promise<Partial<WorkspaceInfo>> {
+  const info: Partial<WorkspaceInfo> = {
+    language: "rust",
+    packageManager: "cargo",
+    testCommand: "cargo test",
+    lintCommand: "cargo clippy",
+    buildCommand: "cargo build",
+  };
+
+  // Override defaults with Makefile targets when available
+  const make = await parseMakefileTargets(rootPath);
+  if (make["test"]) info.testCommand = make["test"];
+  if (make["lint"]) info.lintCommand = make["lint"];
+  if (make["build"]) info.buildCommand = make["build"];
+
+  // Consider tests present if a tests/ or src/tests/ directory exists (inline #[cfg(test)] modules are not detected)
+  info.hasTests =
+    (await exists(path.join(rootPath, "tests"))) ||
+    (await exists(path.join(rootPath, "src", "tests")));
+
+  return info;
+}
+
+/**
+ * Analyse a CMake workspace.
+ * Sets CMake defaults and switches to preset-based commands (assuming a preset
+ * named "default") when a CMakePresets.json file is present.
+ */
+async function analyzeCMake(rootPath: string): Promise<Partial<WorkspaceInfo>> {
+  const info: Partial<WorkspaceInfo> = {
+    language: "cmake",
+    packageManager: "cmake",
+    testCommand: "ctest --output-on-failure",
+    lintCommand: "",
+    buildCommand: "cmake --build build",
+  };
+
+  // When CMakePresets.json is present, recommend the preset-based workflow
+  if (await exists(path.join(rootPath, "CMakePresets.json"))) {
+    info.buildCommand = "cmake --preset default && cmake --build --preset default";
+    info.testCommand = "ctest --preset default";
+  } else {
+    // Classic out-of-source build pattern
+    info.buildCommand = "cmake -S . -B build && cmake --build build";
+  }
+
+  // Override with Makefile targets when available (common for CMake super-builds)
+  const make = await parseMakefileTargets(rootPath);
+  if (make["test"]) info.testCommand = make["test"];
+  if (make["build"]) info.buildCommand = make["build"];
+
+  // Detect tests by presence of a CTestTestfile, tests/ directory, or test subdirectory
+  info.hasTests =
+    (await exists(path.join(rootPath, "CTestTestfile.cmake"))) ||
+    (await exists(path.join(rootPath, "tests"))) ||
+    (await exists(path.join(rootPath, "test")));
+
+  return info;
+}
+
+/**
+ * Analyse a Gradle (Java/Kotlin/Android) workspace.
+ */
+async function analyzeGradle(rootPath: string): Promise<Partial<WorkspaceInfo>> {
+  // Prefer ./gradlew wrapper when present
+  const gradleCmd = (await exists(path.join(rootPath, "gradlew"))) ? "./gradlew" : "gradle";
+
+  const info: Partial<WorkspaceInfo> = {
+    language: "java",
+    packageManager: "gradle",
+    testCommand: `${gradleCmd} test`,
+    lintCommand: `${gradleCmd} check`,
+    buildCommand: `${gradleCmd} build`,
+  };
+
+  // Heuristic: a Kotlin DSL build script (build.gradle.kts) is taken to mean a Kotlin project
+  if (await exists(path.join(rootPath, "build.gradle.kts"))) {
+    info.language = "kotlin";
+  }
+
+  info.hasTests = (await exists(path.join(rootPath, "src", "test")));
+
+  return info;
+}
+
+/**
+ * Analyse a Maven (Java) workspace.
+ */
+async function analyzeMaven(rootPath: string): Promise<Partial<WorkspaceInfo>> {
+  // Prefer ./mvnw wrapper when present
+  const mvnCmd = (await exists(path.join(rootPath, "mvnw"))) ? "./mvnw" : "mvn";
+
+  const info: Partial<WorkspaceInfo> = {
+    language: "java",
+    packageManager: "maven",
+    testCommand: `${mvnCmd} test`,
+    lintCommand: `${mvnCmd} verify`,
+    buildCommand: `${mvnCmd} package -DskipTests`,
+  };
+
+  info.hasTests = (await exists(path.join(rootPath, "src", "test")));
+
+  return info;
+}
+
 /**
  * Analyse the workspace rooted at `rootPath` and return a `WorkspaceInfo`
  * object.  Language is detected via well-known indicator files; commands are
@@ -201,6 +310,17 @@ export async function analyzeWorkspace(rootPath: string): Promise<WorkspaceInfo>
     langInfo = await analyzePython(rootPath);
   } else if (await exists(path.join(rootPath, "go.mod"))) {
     langInfo = await analyzeGo(rootPath);
+  } else if (await exists(path.join(rootPath, "Cargo.toml"))) {
+    langInfo = await analyzeCargo(rootPath);
+  } else if (await exists(path.join(rootPath, "CMakeLists.txt"))) {
+    langInfo = await analyzeCMake(rootPath);
+  } else if (
+    (await exists(path.join(rootPath, "build.gradle"))) ||
+    (await exists(path.join(rootPath, "build.gradle.kts")))
+  ) {
+    langInfo = await analyzeGradle(rootPath);
+  } else if (await exists(path.join(rootPath, "pom.xml"))) {
+    langInfo = await analyzeMaven(rootPath);
   }
 
   return { ...base, ...langInfo };

From 3e615c933263fb61eaf8f8d9917aeede284cd30c Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Mon, 30 Mar 2026 13:25:37 +0000
Subject: [PATCH 3/3] =?UTF-8?q?refactor:=20rewrite=20research=20doc=20to?=
 =?UTF-8?q?=20focus=20on=20dynamic=20intent=E2=86=92action=20transformatio?=
 =?UTF-8?q?n,=20remove=20cmake-specific=20skill?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Agent-Logs-Url: https://github.com/huberp/agentloop/sessions/d37e4f03-c06f-4113-a6da-f608dbd6b3f4

Co-authored-by: huberp <4027454+huberp@users.noreply.github.com>
---
 issues/2.md                                   | 400 +++++++++++-------
 src/__tests__/builtin-skills.test.ts          |   3 +-
 .../fixtures/workspace-cargo/tests/.keep      |   0
 .../fixtures/workspace-cmake/tests/.keep      |   0
 .../workspace-gradle-kotlin/src/test/.keep    |   0
 .../fixtures/workspace-gradle/src/test/.keep  |   0
 .../fixtures/workspace-maven/src/test/.keep   |   0
 src/skills/builtin/cmake-workflow.skill.md    |  82 ----
 8 files changed, 246 insertions(+), 239 deletions(-)
 create mode 100644 src/__tests__/fixtures/workspace-cargo/tests/.keep
 create mode 100644 src/__tests__/fixtures/workspace-cmake/tests/.keep
 create mode 100644 src/__tests__/fixtures/workspace-gradle-kotlin/src/test/.keep
 create mode 100644 src/__tests__/fixtures/workspace-gradle/src/test/.keep
 create mode 100644 src/__tests__/fixtures/workspace-maven/src/test/.keep
 delete mode 100644 src/skills/builtin/cmake-workflow.skill.md
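
Reviewer note (illustrative only, not part of the patch): the verify-then-replan loop that
issues/2.md describes in section 4.1 is not implemented by this series. A minimal TypeScript
sketch of the control flow might look like the block below. It assumes hypothetical names
(`SketchStep`, `runWithVerification`, and the `run`/`verify`/`refine` callbacks are not
agentloop APIs) and uses synchronous callbacks for brevity.

```typescript
// Hypothetical types for illustration; none of these names come from the agentloop source.
interface SketchStep { description: string }
interface SketchOutput { output: string }
interface SketchVerdict { passed: boolean; issues: string[] }

type RunFn = (step: SketchStep) => SketchOutput;
type VerifyFn = (step: SketchStep, out: SketchOutput) => SketchVerdict;
type RefineFn = (failed: SketchStep, remaining: SketchStep[], issues: string[]) => SketchStep[];

function runWithVerification(
  plan: SketchStep[],
  run: RunFn,
  verify: VerifyFn,
  refine: RefineFn,
  maxReplans = 2,
): { success: boolean; executed: string[] } {
  const steps = [...plan]; // mutable copy so the caller's plan is left untouched
  const executed: string[] = [];
  let replans = 0;
  let i = 0;

  while (i < steps.length) {
    const step = steps[i];
    const out = run(step);
    executed.push(step.description);

    const verdict = verify(step, out);
    if (verdict.passed) {
      i += 1;
      continue;
    }

    // Verification failed: give up once the replan budget is exhausted.
    if (replans >= maxReplans) return { success: false, executed };
    replans += 1;

    // Replace the failed step and everything after it with the revised tail.
    const revised = refine(step, steps.slice(i + 1), verdict.issues);
    if (revised.length === 0) return { success: false, executed };
    steps.splice(i, steps.length - i, ...revised);
  }

  return { success: true, executed };
}
```

In this sketch a step counts as successful only when `verify` says so, rather than when `run`
returns without throwing. The `splice` on a mutable step list is the same mechanism the
in-place sub-step insertion of section 4.2 would rely on. The replan budget is an assumption
added here to guarantee termination.
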

diff --git a/issues/2.md b/issues/2.md
index 402e5406..67177cac 100644
--- a/issues/2.md
+++ b/issues/2.md
@@ -1,208 +1,298 @@
-## Research: Task Planner and Task Decomposition for Coding Agents
+## Research: Intent-to-Action Transformation — How a Generic Workflow Step Becomes Concrete
 
-### 1. Problem Statement
+### 1. The Core Problem
 
-Modern AI coding agents need to tackle tasks that span multiple steps, require diverse tools, and benefit from specialised domain knowledge at each step. The key challenge is **task decomposition**: how should a high-level goal (e.g. "add input validation to all POST handlers") be broken into concrete, executable steps that an agent can carry out reliably?
+A coding agent receives a generic intent such as **"verify-build"**. This is a template name
+that means "compile the project and confirm whether the build succeeds or fails". But the
+_concrete steps_ vary entirely by workspace:
 
-A related challenge is **workflow templates**: many coding workflows (build, test, lint, release) have a fixed *shape* but vary in their concrete commands depending on the workspace. Can these patterns be captured once and reused across projects?
+- For a CMake project with presets: `cmake --preset linux-release && cmake --build --preset linux-build -j2`
+- For a Node.js project: `npm ci && npm run build`
+- For a Rust project: `cargo build`
+- For a Gradle project: `./gradlew assemble`
 
----
+The question is: **what is the correct point in the machinery to perform this transformation,
+and which components are responsible for deriving the concrete steps?**
 
-### 2. Baseline: The `plan-and-run` Loop in agentloop
+---
 
-agentloop already ships a layered planning architecture:
+### 2. What Must NOT Happen — No Hardcoded Instantiation
 
-| Component | Location | Role |
-|---|---|---|
-| `generatePlan` | `src/subagents/planner.ts` | LLM-powered decomposition of a goal into `PlanStep[]` |
-| `refinePlan` | `src/subagents/planner.ts` | Corrects a plan that references unknown tools |
-| `validatePlan` | `src/subagents/planner.ts` | Checks that all tool names in the plan are registered |
-| `executePlan` | `src/orchestrator.ts` | Runs steps in sequence; supports `retry/skip/abort` on failure and checkpoint/resume |
-| `plan` tool | `src/tools/plan.ts` | Exposes plan generation to the agent loop as a callable tool |
-| `plan-and-run` tool | `src/tools/plan-and-run.ts` | Combines generation + execution in one tool call |
+The transformation must not be done by pre-wiring cmake commands (or any other build-system
+commands) into static configuration files. A hardcoded solution:
 
-The planner runs as a **tool-free subagent**: it receives workspace context and a list of available tools, then returns a JSON plan. The orchestrator dispatches each step to a `runSubagent` call (or `SubagentManager` for complex steps) with an iteration budget derived from `estimatedComplexity`.
+- Cannot adapt when a project changes its build system or adds presets.
+- Does not scale across projects or workspaces.
+- Defeats the purpose of a coding agent that is supposed to _reason_ about its environment.
 
-**Key insight already present in agentloop**: each `PlanStep` carries an optional `agentProfile` field. The orchestrator activates the named profile (model, temperature, tool subset) for that step. This enables per-step specialisation without a separate orchestration framework.
+The transformation must be agent-driven and workspace-aware at runtime.
 
 ---
 
-### 3. What Other Frameworks Do
-
-#### 3.1 LangGraph (LangChain)
-- Models agent behaviour as a **directed graph** of nodes (LLM calls, tool calls) with conditional edges.
-- Supports cycles (retry loops), parallel fan-out/fan-in, and human-in-the-loop interrupts.
-- Templates are *graph patterns* stored as reusable subgraphs.
-- Complexity: full graph authoring required for every new workflow shape.
-
-#### 3.2 AutoGen (Microsoft)
-- Multi-agent conversation: a **Planner** agent, a **Coder** agent, and an **Executor** agent exchange messages until the task is done.
-- Task decomposition happens in natural language — the Planner emits step descriptions that the Coder implements.
-- Workflow templates are **system prompts** for each role, often provided in a configuration YAML.
-- Strength: easy to add domain-expert agents. Weakness: conversation history grows rapidly; quality depends on message-passing discipline.
-
-#### 3.3 CrewAI
-- Defines **Crew** (team of agents), **Agents** (role + backstory + tools), and **Tasks** (description + expected output + dependencies).
-- Supports sequential and hierarchical execution; tasks can pass their output as context to dependent tasks.
-- Workflow templates are *Crew + Task YAML configurations* that can be parameterised and re-instantiated for different inputs.
-- Strong alignment with the "workflow template" concept in this issue: a `BuildVerifyCrew` YAML is a reusable template instantiated per workspace.
-
-#### 3.4 OpenAI Assistants + Structured Outputs
-- Persistent thread context allows multi-turn tasks without re-injecting history.
-- `run_step` objects provide a built-in audit trail of each tool call and its output.
-- Templates are **Assistant instructions** (system prompt) combined with few-shot examples in the thread.
-- Limitation: tied to the OpenAI API; no local model support.
-
-#### 3.5 Copilot Coding Agent (GitHub Copilot)
-The example in this issue shows a subtask with:
-```json
-{
-  "name": "build-verify",
-  "agent_type": "task",
-  "description": "Build the plugin to verify changes",
-  "prompt": "...\nSteps:\n1. Run: sudo bash scripts/install-linux-deps.sh\n2. Run: git submodule update --init --recursive\n3. Run: cmake --preset linux-release\n4. Run: cmake --build --preset linux-build -j2\n\nReport whether the build succeeded or failed..."
-}
+### 3. Current agentloop Machinery and the Transformation Points
+
+agentloop already has the primitives for this transformation. The pipeline is:
+
+```
+Generic intent: "verify-build"
+       │
+       ▼
+[1] analyzeWorkspace()          ← detects build system, extracts lifecycle commands
+       │  WorkspaceInfo { language="cmake", buildCommand="cmake --preset…", … }
+       ▼
+[2] generatePlan() / planner    ← LLM reasons: intent + workspace → concrete PlanStep[]
+       │  PlanStep { description="Run cmake --preset linux-release …",
+       │             toolsNeeded=["shell"], agentProfile="build-verify" }
+       ▼
+[3] executePlan() / orchestrator ← activates per-step agent profile, runs subagent
+       │  build-verify profile → shell tool, low temperature, build-verify skill
+       ▼
+[4] StepResult { output="…compiler output…", status="success"|"failed" }
 ```
 
-Key observations:
-- The template name (`build-verify`) is **stable and reusable** across tasks.
-- The concrete steps (cmake preset names, script paths) are **workspace-specific** and were derived from workspace knowledge.
-- Instantiation happens once, at workspace-setup time — not re-derived on every task.
-- This is equivalent to agentloop's `agentProfile` + `skill` combination, but the steps are baked into the prompt string rather than generated by a planner.
+#### Step [1] — `analyzeWorkspace()` in `src/workspace.ts`
 
----
+This is the **workspace probe**. It inspects the repository root for indicator files
+(`CMakeLists.txt`, `Cargo.toml`, `package.json`, `build.gradle`, `pom.xml`, etc.) and extracts
+the concrete lifecycle commands (`buildCommand`, `testCommand`, `lintCommand`). The result is a
+`WorkspaceInfo` object — the source of truth for what commands the workspace actually uses.
 
-### 4. Template Taxonomy for Coding Agents
+This is the right place for build-system detection: once, per session, before planning.
+
+#### Step [2] — `generatePlan()` / `buildPlannerTask()` in `src/subagents/planner.ts`
+
+The planner LLM receives the generic intent plus the `WorkspaceInfo` context (including
+detected lifecycle commands) and reasons about what concrete steps to produce. This is the
+**intent-to-steps transformation**:
+
+```
+Task: verify-build
+Workspace: language=cmake, packageManager=cmake,
+  build="cmake --preset linux-release && cmake --build --preset linux-build",
+  test="ctest --preset default"
+Available tools: shell, file-read, file-list
+```
 
-Based on the above analysis, coding workflow templates fall into three categories:
+The LLM returns a plan with concrete step descriptions drawn from the workspace context. No
+static configuration is needed — the planner derives the steps dynamically from what
+`analyzeWorkspace()` found.
 
-#### Category A — Build Lifecycle Templates
-Fixed structure, workspace-specific commands:
+The planner can also annotate each step with an `agentProfile`, directing the orchestrator to
+activate a specialised agent (e.g. `build-verify`) for that step.
 
-| Template | Shape | Workspace-specific parts |
-|---|---|---|
-| `build-verify` | configure → compile → report | preset name, script paths, parallelism flag |
-| `clean-build` | clean → configure → compile | build directory, preset/profile |
-| `release-package` | build → test → package → sign | packaging format, signing key |
+#### Step [3] — `executePlan()` / `runStep()` in `src/orchestrator.ts`
 
-#### Category B — Quality Gate Templates
-Fixed checklist, tool-specific commands:
+The orchestrator executes each step as a `runSubagent` call. When a step carries an
+`agentProfile` annotation, `activateProfile()` loads the profile (tools, model, temperature,
+skills). The agent then has both the **concrete step description** (from the planner) and the
+**domain guidance** (from its skill) to execute reliably.
 
-| Template | Shape | Workspace-specific parts |
-|---|---|---|
-| `test-and-fix` | run tests → parse failures → locate code → fix → re-run | test runner command, test output format |
-| `lint-and-format` | run linter → parse output → apply fixes → re-verify | linter binary, fix flags |
-| `security-scan` | run scanner → parse findings → generate report | scanner CLI, severity threshold |
+---
 
-#### Category C — Development Workflow Templates
-Higher-level patterns:
+### 4. Which Interaction Patterns from the Baseline Research Are Essential
 
-| Template | Shape | Notes |
-|---|---|---|
-| `feature-branch` | branch → implement → test → PR | Uses git tools + planner |
-| `dependency-update` | audit → update → test → commit | Integrates vulnerability check |
-| `hotfix` | branch from tag → apply fix → test → backport | Requires git-log, cherry-pick |
+The baseline branch (`copilot/research-agent-fws`) identified eight gaps in agentloop. For
+intent-to-action transformation, four are relevant, listed from most to least essential:
 
 ---
 
-### 5. How agentloop Can Implement Workflow Templates
+#### 4.1 Plan-Execute-Verify Loop (Baseline Issue 3) — **Critical**
 
-agentloop's existing primitives map cleanly onto the template concept:
+**Why it matters for "verify-build"**: The word "verify" in the intent means the agent must
+_confirm_ that the build succeeded — not merely that the build process ran without throwing an
+exception. Today, `executePlan()` marks a step as `status: "success"` as soon as the subagent
+returns without throwing. A build that silently failed (zero exit code but wrong output, a
+`make` that skipped targets, a test that passed vacuously) is indistinguishable from a correct
+build.
 
-#### 5.1 Templates as Agent Profiles + Skills (recommended)
+**The missing piece**: A `VerificationAgent` (proposed in Issue 3) would run after each step and
+produce a structured `VerificationResult { passed, reasoning, issues[] }`. For a build step,
+the verifier would check: "Does the output contain evidence of a successful compilation? Are
+there error messages? Is the binary present?"
 
-A workflow template = **agent profile** (what tools, model, iteration budget) + **skill** (domain knowledge, step sequence, error heuristics).
+**Dynamic replanning on failure**: When the verifier flags the build as failed, the system
+calls `refinePlan()` with the verifier's feedback (e.g., "missing dependency X"). The
+orchestrator replaces the remaining steps with a revised plan that installs the dependency and
+retries the build. This is the essential self-correction loop for "verify-build".
 
-Example — `build-verify` profile (`src/agents/builtin/build-verify.agent.json`):
-```json
-{
-  "name": "build-verify",
-  "description": "Build verification agent — compiles the workspace and reports success or failure",
-  "temperature": 0.1,
-  "skills": ["build-verify"],
-  "tools": ["shell", "file-read", "file-list"],
-  "maxIterations": 10
-}
+**Interaction pattern** (from Issue 3):
 ```
+executePlan()
+  └─ for each step:
+       ├─ runStep()          ← executes the build command
+       ├─ verifyStep()       ← checks build output for success/failure
+       │    ├─ pass → next step
+       │    └─ fail → refinePlan(feedback) → re-execute
+       └─ checkpoint.save()
+```
+
+**Without this pattern**, a "verify-build" intent can only execute the build — it cannot
+actually verify the outcome.
 
-The paired `build-verify` skill (`src/skills/builtin/build-verify.skill.md`) injects:
-- Step sequence (identify build system → install deps → configure → compile → report)
-- Error triage heuristics (linker errors, missing headers, stale cache)
-- Parallelism flags per build tool
+---
 
-The planner can then annotate a step with `"agentProfile": "build-verify"` and the orchestrator will activate the matching profile for that step — automatically binding the right skill, tool subset, and temperature.
+#### 4.2 Dynamic Task Decomposition (Baseline Issue 4) — **Important**
 
-#### 5.2 Templates as Planner Context (workspace-aware instantiation)
+**Why it matters**: The "verify-build" intent may require sub-steps that cannot be known at
+planning time. For example:
+- The planner produces a step "run the build"
+- During execution, the build fails with "submodules not initialised"
+- The agent needs to inject a sub-step "git submodule update --init --recursive" _before_
+  retrying the build
 
-The planner prompt includes `workspaceInfo` fields including the detected lifecycle commands (`buildCommand`, `testCommand`, `lintCommand`). This allows the planner to produce **concrete, workspace-specific steps** in one shot:
+This would be addressed by **Dynamic Task Decomposition** (Issue 4, section 4): a complex step
+could call a `decompose_task` tool at runtime to inject new sub-steps immediately after the
+current step. The orchestrator's `executePlan()` would maintain a mutable steps list and insert
+the new steps in-place.
 
+**Interaction pattern**:
 ```
-Workspace: language=cmake, packageManager=cmake,
-  build="cmake --preset linux-release && cmake --build --preset linux-build",
-  test="ctest --preset linux-test"
+executePlan()
+  ├─ steps = [... mutable list ...]
+  └─ step i: "Run build"
+       └─ subagent calls decompose_task({newSteps: [
+              { description: "Init submodules", … },
+              { description: "Re-run build", … }
+          ]})
+          → steps[i+1..] = [init-submodules, re-run-build, ...original-remaining-steps]
 ```
 
-The planner output then directly embeds the correct commands rather than using a generic placeholder.
+**Without this pattern**, intent-to-action transformation is only as good as the planner's
+initial plan. When the environment deviates from expectations (missing deps, wrong tool
+version, first-time setup required), the agent has no way to adapt mid-execution.
 
-#### 5.3 Template Instantiation: Agent vs Static
+---
 
-| Approach | When to use | Trade-offs |
-|---|---|---|
-| **Planner-time instantiation** (current) | Novel tasks, unknown workspaces | Flexible, adapts to workspace; requires LLM call |
-| **Profile+skill pre-configuration** (new) | Recurring workflows (CI, build-verify) | Fast, deterministic, version-controlled; less adaptive |
-| **Hybrid** (recommended) | Plan overall task, but use pre-defined profiles per step | Best of both worlds |
+#### 4.3 Hierarchical Delegation (Baseline Issue 4) — **Architectural**
 
-The hybrid approach is already supported: the planner annotates `agentProfile` on steps, and the orchestrator activates the profile. Adding skills that encode the step sequence means the profile-activated agent "knows" the right procedure without the planner having to enumerate every sub-step.
+**Why it matters**: At a higher level of organisation, a _coordinator agent_ can receive the
+"verify-build" intent and delegate workspace analysis and step instantiation to a child agent.
+This is the **Hierarchical pattern** from Issue 4.
+
+**Interaction pattern**:
+```
+Coordinator receives: "verify-build"
+  └─ calls delegate_subagent("workspace-analyst")
+       └─ workspace-analyst: reads workspace, returns WorkspaceInfo + recommended steps
+  └─ coordinator constructs a plan from the returned recommendations
+  └─ calls delegate_subagent("build-verify") with concrete step descriptions
+       └─ build-verify agent: executes build, returns structured result
+  └─ coordinator synthesises final report
+```
+
+This pattern separates concerns cleanly: the coordinator holds the intent, the workspace
+analyst provides grounding, and the build-verify agent executes. Today's planner partially
+plays the coordinator role, but it cannot delegate to a workspace analyst because it is a
+tool-free subagent that only outputs JSON. Hierarchical delegation would allow the planner to
+_actively probe_ the workspace via tool calls rather than relying on pre-computed
+`WorkspaceInfo`.
 
 ---
 
-### 6. Concrete Example: CMake Build-Verify Flow
-
-**Goal**: "Build the plugin to verify changes compile correctly"
-
-**Planner output** (with workspace context `build="cmake --preset linux-release && cmake --build --preset linux-build -j2"`):
-
-```json
-{
-  "steps": [
-    {
-      "description": "Install Linux build dependencies",
-      "toolsNeeded": ["shell"],
-      "estimatedComplexity": "low",
-      "agentProfile": "devops"
-    },
-    {
-      "description": "Update git submodules",
-      "toolsNeeded": ["shell"],
-      "estimatedComplexity": "low",
-      "agentProfile": "devops"
-    },
-    {
-      "description": "Build the project using cmake --preset linux-release && cmake --build --preset linux-build -j2 and report compiler output",
-      "toolsNeeded": ["shell"],
-      "estimatedComplexity": "medium",
-      "agentProfile": "build-verify"
-    }
-  ]
-}
-```
+#### 4.4 Toolbox Refiner (Baseline Issue 5) — **Supporting**
 
-The `build-verify` agent profile activates the `build-verify` skill, which provides the step sequence and error triage guidance. The concrete commands come from `workspaceInfo.buildCommand`, injected into the planner prompt.
+**Why it matters**: The build-verify agent only needs `shell`, `file-read`, and `file-list`.
+Exposing all 16+ registered tools dilutes the agent's focus and wastes context budget. The
+**Toolbox Refiner** (Issue 5) dynamically narrows the exposed tool set per invocation based on
+the step's declared `toolsNeeded` list and the task description.
+
+This is already partially addressed by the profile-based `tools[]` list in agent profiles.
+The Toolbox Refiner would make this dynamic (keyword or embedding matching) rather than
+requiring a manually-maintained allowlist per profile.
 
 ---
 
-### 7. Recommendations and Gaps Addressed
+### 5. The Role of Templates in the Dynamic System
+
+Templates (agent profiles + skills) play a supporting role — they are **not** the source of
+concrete steps. Their actual function is:
 
-| Gap | Solution implemented |
+| Template element | Role |
 |---|---|
-| Planner didn't know lifecycle commands | ✅ `buildPlannerTask` now includes `build`, `test`, `lint` commands from `WorkspaceInfo` |
-| Only Node/Python/Go workspace detection | ✅ Added CMake, Rust/Cargo, Gradle, Maven analyzers in `workspace.ts` |
-| No build-workflow agent profile | ✅ `build-verify.agent.json` and `test-runner.agent.json` added |
-| No build-workflow skill | ✅ `build-verify.skill.md` and `cmake-workflow.skill.md` added |
+| Agent profile (`tools`, `temperature`, `maxIterations`) | Shapes the execution environment for a step |
+| Skill (`promptFragment`) | Provides domain guidance to the agent running the step — what to look for, what errors mean, how to report |
+
+The **concrete steps** always come from the planner, which derives them from:
+1. The generic intent ("verify-build")
+2. The workspace context (`WorkspaceInfo` from `analyzeWorkspace()`)
+3. The available agent profiles (the planner can annotate `agentProfile` per step)
+
+A `build-verify` profile + skill gives the executing agent the knowledge to:
+- Identify which build system is in use (from the workspace `language` field)
+- Interpret compiler output (error triage heuristics in the skill)
+- Produce a structured success/failure report
+
+But the specific commands to run come from the workspace analysis, injected into the planner
+context at planning time.
+
+---
+
+### 6. Recommended Interaction Pattern: Full "verify-build" Flow
 
-### 8. Remaining Open Questions
+Combining the above, the complete agent-driven "verify-build" flow using agentloop components:
 
-1. **Template registry**: Should templates be discoverable at runtime (e.g. `list-templates` tool) so the planner can reference them by name? The current profile registry partially serves this role.
-2. **Workspace-once vs task-every-time**: For expensive workspace analysis (submodule init, dependency install), should a "workspace setup" template run once at session start and cache results? This aligns with CrewAI's `before_kickoff` hook concept.
-3. **Multi-repo / monorepo**: `analyzeWorkspace` currently detects one build system per root. Monorepos with mixed build systems (e.g. a CMake C++ library + a Node.js frontend) need a recursive scan.
-4. **Template versioning**: When the workspace changes (new preset, renamed script), how are baked templates kept in sync? A solution is to keep commands in `WorkspaceInfo` (auto-detected) rather than hard-coding them in profile prompts.
+```
+User: "verify-build"
+  │
+  ▼
+[A] analyzeWorkspace(rootPath)
+    → WorkspaceInfo { buildCommand="cmake --preset …", language="cmake", … }
+  │
+  ▼
+[B] generatePlan("verify-build", workspaceInfo, registry, profileRegistry)
+    → Plan {
+        steps: [
+          { description: "Run: cmake --preset …",
+            toolsNeeded: ["shell"], agentProfile: "build-verify" },
+          { description: "Report build result",
+            toolsNeeded: [], agentProfile: "build-verify" }
+        ]
+      }
+  │
+  ▼
+[C] executePlan(plan, registry, { verificationEnabled: true, task: "verify-build", workspaceInfo })
+    │
+    ├─ step 0: runStep() → shell("cmake --preset …") → output
+    │           verifyStep() → VerificationResult { passed, reasoning, issues }
+    │                └─ fail? → refinePlan(feedback) → re-execute
+    │
+    └─ step 1: runStep() → agent synthesises report from step 0 output
+                verifyStep() → confirm report contains success/failure conclusion
+  │
+  ▼
+ExecutionResult { stepResults, success, verificationResults }
+```
+
+The key properties of this flow:
+- **Generic intent, concrete execution**: "verify-build" is never mapped to cmake commands in
+  config — the planner derives them from workspace analysis.
+- **Self-correcting**: the PEV loop (Issue 3) catches silent failures and replans.
+- **Extensible**: adding support for a new build system requires only updating
+  `analyzeWorkspace()` — no profile or template changes needed.
+- **Composable**: the same flow applies to "run-tests", "lint", or any other lifecycle intent.
+
+---
+
+### 7. Gap Summary Relative to Baseline Research
+
+| Baseline Issue | Pattern | Essential for "verify-build"? | Current status |
+|---|---|---|---|
+| Issue 3 | Plan-Execute-Verify loop | ✅ Critical — without it, "verify" is just "run" | ❌ Not yet implemented |
+| Issue 3 | Dynamic replanning on verification failure | ✅ Critical — enables self-correction | ❌ Not yet implemented |
+| Issue 4 | Dynamic task decomposition | ✅ Important — handles mid-execution surprises | ❌ Not yet implemented |
+| Issue 4 | Hierarchical delegation | 🔶 Architectural — enables active workspace probing | ❌ Not yet implemented |
+| Issue 5 | Toolbox Refiner | 🔶 Supporting — reduces noise in build agent | ❌ Not yet implemented |
+| Issue 2 | Persistent memory | 🔶 Optional — cache workspace analysis across sessions | ❌ Not yet implemented |
+
+### 8. What Has Been Improved in This PR
+
+| Change | Effect |
+|---|---|
+| `analyzeWorkspace()` now detects CMake, Cargo, Gradle, Maven | Workspace analysis returns concrete lifecycle commands for more build systems |
+| `buildPlannerTask()` includes `buildCommand`/`testCommand`/`lintCommand` | Planner receives concrete command strings → produces workspace-specific plan steps without hardcoding |
+| `build-verify` and `test-runner` agent profiles | Execution environment for build/test steps — define which tools and temperature are appropriate |
+| `build-verify` skill | Domain guidance injected into the build agent — how to identify the build system, interpret output, triage errors |
+
+These improvements advance Step [1] (workspace analysis) and Step [2] (planner context) of the
+transformation pipeline. Verification and dynamic replanning, which would extend Step [3]
+(execution), require the Plan-Execute-Verify implementation from Issue 3 to be complete.
diff --git a/src/__tests__/builtin-skills.test.ts b/src/__tests__/builtin-skills.test.ts
index b2107d10..3c96455d 100644
--- a/src/__tests__/builtin-skills.test.ts
+++ b/src/__tests__/builtin-skills.test.ts
@@ -21,10 +21,9 @@ describe("built-in skill library", () => {
     "git-workflow",
     "security-auditor",
     "build-verify",
-    "cmake-workflow",
   ];
 
-  it("loads all 7 built-in skills", () => {
+  it("loads all 6 built-in skills", () => {
     const names = registry.list().map((s) => s.name);
     for (const name of BUILTIN_NAMES) {
       expect(names).toContain(name);
diff --git a/src/__tests__/fixtures/workspace-cargo/tests/.keep b/src/__tests__/fixtures/workspace-cargo/tests/.keep
new file mode 100644
index 00000000..e69de29b
diff --git a/src/__tests__/fixtures/workspace-cmake/tests/.keep b/src/__tests__/fixtures/workspace-cmake/tests/.keep
new file mode 100644
index 00000000..e69de29b
diff --git a/src/__tests__/fixtures/workspace-gradle-kotlin/src/test/.keep b/src/__tests__/fixtures/workspace-gradle-kotlin/src/test/.keep
new file mode 100644
index 00000000..e69de29b
diff --git a/src/__tests__/fixtures/workspace-gradle/src/test/.keep b/src/__tests__/fixtures/workspace-gradle/src/test/.keep
new file mode 100644
index 00000000..e69de29b
diff --git a/src/__tests__/fixtures/workspace-maven/src/test/.keep b/src/__tests__/fixtures/workspace-maven/src/test/.keep
new file mode 100644
index 00000000..e69de29b
diff --git a/src/skills/builtin/cmake-workflow.skill.md b/src/skills/builtin/cmake-workflow.skill.md
deleted file mode 100644
index 856f0366..00000000
--- a/src/skills/builtin/cmake-workflow.skill.md
+++ /dev/null
@@ -1,82 +0,0 @@
----
-name: cmake-workflow
-description: CMake-specific build, test, and packaging patterns including preset-based workflows
-version: 1.0.0
-slot: section
----
-
-## CMake Workflow Guidelines
-
-### Project layout conventions
-
-- Source lives in `src/`; headers in `include/`; tests in `tests/` or `test/`.
-- Out-of-source builds go in `build/` (excluded from version control via `.gitignore`).
-- `CMakeLists.txt` at the repository root is the entry point; each subdirectory may have its own `CMakeLists.txt`.
-
-### Preset-based workflow (preferred when `CMakePresets.json` exists)
-
-```bash
-# Configure
-cmake --preset <preset-name>          # e.g. linux-release, debug, ci
-
-# Build
-cmake --build --preset <build-preset> [--parallel $(nproc)]
-
-# Test
-ctest --preset <test-preset> [--output-on-failure]
-```
-
-List available presets:
-```bash
-cmake --list-presets          # configure presets
-cmake --build --list-presets  # build presets
-ctest --list-presets          # test presets
-```
-
-### Classic out-of-source workflow (no presets)
-
-```bash
-# Configure (Release build, Ninja generator recommended)
-cmake -S . -B build -G Ninja -DCMAKE_BUILD_TYPE=Release
-
-# Build (parallel)
-cmake --build build --parallel $(nproc)
-
-# Test
-cd build && ctest --output-on-failure
-```
-
-### Dependency management
-
-- **Submodules**: always run `git submodule update --init --recursive` before configuring.
-- **find_package**: ensure system libraries are installed (e.g. `sudo apt install libssl-dev`).
-- **FetchContent / CPM.cmake**: dependencies are downloaded during configure; verify internet access or a local cache is available.
-- **vcpkg / Conan**: run `vcpkg install` or `conan install .` before `cmake -S . -B build`.
-
-### Install-step dependencies pattern
-
-When a project ships a dependency-installation script (e.g. `scripts/install-linux-deps.sh`), run it *before* the CMake configure step:
-
-```bash
-sudo bash scripts/install-linux-deps.sh
-git submodule update --init --recursive
-cmake --preset <preset>
-cmake --build --preset <build-preset> --parallel $(nproc)
-```
-
-### Common CMake variables
-
-| Variable | Purpose |
-|---|---|
-| `CMAKE_BUILD_TYPE` | `Debug`, `Release`, `RelWithDebInfo`, `MinSizeRel` |
-| `CMAKE_INSTALL_PREFIX` | Install destination (default `/usr/local`) |
-| `CMAKE_TOOLCHAIN_FILE` | Cross-compile or vcpkg toolchain |
-| `BUILD_SHARED_LIBS` | `ON` to build shared libraries by default |
-| `CMAKE_EXPORT_COMPILE_COMMANDS` | `ON` to generate `compile_commands.json` for tooling |
-
-### Diagnosing build failures
-
-1. Check the **configure step** output first — missing dependencies abort here.
-2. Look for the **first** error in compiler output; subsequent errors are often cascading.
-3. Enable verbose output to see exact compiler flags: `cmake --build build --verbose` or `VERBOSE=1 make`.
-4. Use `--fresh` flag to force a clean reconfigure: `cmake --fresh --preset <preset>`.