From 07bedbbe29003a9630cdfbc8b8e12b7ae6afb3cb Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Mon, 30 Mar 2026 12:33:29 +0000
Subject: [PATCH 1/3] Initial plan
From f09c2d84dc49e3d3dcd1069a6c53722fc906ad54 Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Mon, 30 Mar 2026 12:45:51 +0000
Subject: [PATCH 2/3] =?UTF-8?q?feat:=20task=20decomposition=20research=20?=
=?UTF-8?q?=E2=80=94=20workspace=20detection,=20planner=20context,=20workf?=
=?UTF-8?q?low=20templates?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Agent-Logs-Url: https://github.com/huberp/agentloop/sessions/3a8fc459-0fd4-4ab9-8e11-a59e4d76960b
Co-authored-by: huberp <4027454+huberp@users.noreply.github.com>
---
issues/2.md | 208 ++++++++++++++++++
src/__tests__/builtin-agent-profiles.test.ts | 4 +-
src/__tests__/builtin-skills.test.ts | 4 +-
.../fixtures/workspace-cargo/Cargo.toml | 4 +
.../workspace-cmake-presets/CMakeLists.txt | 0
.../workspace-cmake-presets/CMakePresets.json | 1 +
.../fixtures/workspace-cmake/CMakeLists.txt | 0
.../workspace-gradle-kotlin/build.gradle.kts | 1 +
.../fixtures/workspace-gradle/build.gradle | 1 +
.../fixtures/workspace-maven/pom.xml | 1 +
src/__tests__/workspace.test.ts | 160 ++++++++++++++
src/agents/builtin/build-verify.agent.json | 12 +
src/agents/builtin/test-runner.agent.json | 12 +
src/skills/builtin/build-verify.skill.md | 43 ++++
src/skills/builtin/cmake-workflow.skill.md | 82 +++++++
src/subagents/planner.ts | 14 +-
src/workspace.ts | 124 ++++++++++-
17 files changed, 664 insertions(+), 7 deletions(-)
create mode 100644 issues/2.md
create mode 100644 src/__tests__/fixtures/workspace-cargo/Cargo.toml
create mode 100644 src/__tests__/fixtures/workspace-cmake-presets/CMakeLists.txt
create mode 100644 src/__tests__/fixtures/workspace-cmake-presets/CMakePresets.json
create mode 100644 src/__tests__/fixtures/workspace-cmake/CMakeLists.txt
create mode 100644 src/__tests__/fixtures/workspace-gradle-kotlin/build.gradle.kts
create mode 100644 src/__tests__/fixtures/workspace-gradle/build.gradle
create mode 100644 src/__tests__/fixtures/workspace-maven/pom.xml
create mode 100644 src/agents/builtin/build-verify.agent.json
create mode 100644 src/agents/builtin/test-runner.agent.json
create mode 100644 src/skills/builtin/build-verify.skill.md
create mode 100644 src/skills/builtin/cmake-workflow.skill.md
diff --git a/issues/2.md b/issues/2.md
new file mode 100644
index 00000000..402e5406
--- /dev/null
+++ b/issues/2.md
@@ -0,0 +1,208 @@
+## Research: Task Planner and Task Decomposition for Coding Agents
+
+### 1. Problem Statement
+
+Modern AI coding agents need to tackle tasks that span multiple steps, require diverse tools, and benefit from specialised domain knowledge at each step. The key challenge is **task decomposition**: how should a high-level goal (e.g. "add input validation to all POST handlers") be broken into concrete, executable steps that an agent can carry out reliably?
+
+A related challenge is **workflow templates**: many coding workflows (build, test, lint, release) have a fixed *shape* but vary in their concrete commands depending on the workspace. Can these patterns be captured once and reused across projects?
+
+---
+
+### 2. Baseline: The `plan-and-run` Loop in agentloop
+
+agentloop already ships a layered planning architecture:
+
+| Component | Location | Role |
+|---|---|---|
+| `generatePlan` | `src/subagents/planner.ts` | LLM-powered decomposition of a goal into `PlanStep[]` |
+| `refinePlan` | `src/subagents/planner.ts` | Corrects a plan that references unknown tools |
+| `validatePlan` | `src/subagents/planner.ts` | Checks that all tool names in the plan are registered |
+| `executePlan` | `src/orchestrator.ts` | Runs steps in sequence; supports `retry/skip/abort` on failure and checkpoint/resume |
+| `plan` tool | `src/tools/plan.ts` | Exposes plan generation to the agent loop as a callable tool |
+| `plan-and-run` tool | `src/tools/plan-and-run.ts` | Combines generation + execution in one tool call |
+
+The planner runs as a **tool-free subagent**: it receives workspace context and a list of available tools, then returns a JSON plan. The orchestrator dispatches each step to a `runSubagent` call (or `SubagentManager` for complex steps) with an iteration budget derived from `estimatedComplexity`.
+
+**Key insight already present in agentloop**: each `PlanStep` carries an optional `agentProfile` field. The orchestrator activates the named profile (model, temperature, tool subset) for that step. This enables per-step specialisation without a separate orchestration framework.
+
+---
+
+### 3. What Other Frameworks Do
+
+#### 3.1 LangGraph (LangChain)
+- Models agent behaviour as a **directed graph** of nodes (LLM calls, tool calls) with conditional edges.
+- Supports cycles (retry loops), parallel fan-out/fan-in, and human-in-the-loop interrupts.
+- Templates are *graph patterns* stored as reusable subgraphs.
+- Complexity: full graph authoring required for every new workflow shape.
+
+#### 3.2 AutoGen (Microsoft)
+- Multi-agent conversation: a **Planner** agent, a **Coder** agent, and an **Executor** agent exchange messages until the task is done.
+- Task decomposition happens in natural language — the Planner emits step descriptions that the Coder implements.
+- Workflow templates are **system prompts** for each role, often provided in a configuration YAML.
+- Strength: easy to add domain-expert agents. Weakness: conversation history grows rapidly; quality depends on message-passing discipline.
+
+#### 3.3 CrewAI
+- Defines **Crew** (team of agents), **Agents** (role + backstory + tools), and **Tasks** (description + expected output + dependencies).
+- Supports sequential and hierarchical execution; tasks can pass their output as context to dependent tasks.
+- Workflow templates are *Crew + Task YAML configurations* that can be parameterised and re-instantiated for different inputs.
+- Strong alignment with the "workflow template" concept in this issue: a `BuildVerifyCrew` YAML is a reusable template instantiated per workspace.
+
+#### 3.4 OpenAI Assistants + Structured Outputs
+- Persistent thread context allows multi-turn tasks without re-injecting history.
+- `run_step` objects provide a built-in audit trail of each tool call and its output.
+- Templates are **Assistant instructions** (system prompt) combined with few-shot examples in the thread.
+- Limitation: tied to the OpenAI API; no local model support.
+
+#### 3.5 Copilot Coding Agent (GitHub Copilot)
+The example in this issue shows a subtask with:
+```json
+{
+ "name": "build-verify",
+ "agent_type": "task",
+ "description": "Build the plugin to verify changes",
+ "prompt": "...\nSteps:\n1. Run: sudo bash scripts/install-linux-deps.sh\n2. Run: git submodule update --init --recursive\n3. Run: cmake --preset linux-release\n4. Run: cmake --build --preset linux-build -j2\n\nReport whether the build succeeded or failed..."
+}
+```
+
+Key observations:
+- The template name (`build-verify`) is **stable and reusable** across tasks.
+- The concrete steps (cmake preset names, script paths) are **workspace-specific** and were derived from workspace knowledge.
+- Instantiation happens once, at workspace-setup time; the steps are not re-derived on every task.
+- This is equivalent to agentloop's `agentProfile` + `skill` combination, but the steps are baked into the prompt string rather than generated by a planner.
+
+---
+
+### 4. Template Taxonomy for Coding Agents
+
+Based on the above analysis, coding workflow templates fall into three categories:
+
+#### Category A — Build Lifecycle Templates
+Fixed structure, workspace-specific commands:
+
+| Template | Shape | Workspace-specific parts |
+|---|---|---|
+| `build-verify` | configure → compile → report | preset name, script paths, parallelism flag |
+| `clean-build` | clean → configure → compile | build directory, preset/profile |
+| `release-package` | build → test → package → sign | packaging format, signing key |
+
+#### Category B — Quality Gate Templates
+Fixed checklist, tool-specific commands:
+
+| Template | Shape | Workspace-specific parts |
+|---|---|---|
+| `test-and-fix` | run tests → parse failures → locate code → fix → re-run | test runner command, test output format |
+| `lint-and-format` | run linter → parse output → apply fixes → re-verify | linter binary, fix flags |
+| `security-scan` | run scanner → parse findings → generate report | scanner CLI, severity threshold |
+
+#### Category C — Development Workflow Templates
+Higher-level patterns:
+
+| Template | Shape | Notes |
+|---|---|---|
+| `feature-branch` | branch → implement → test → PR | Uses git tools + planner |
+| `dependency-update` | audit → update → test → commit | Integrates vulnerability check |
+| `hotfix` | branch from tag → apply fix → test → backport | Requires git-log, cherry-pick |
+
+---
+
+### 5. How agentloop Can Implement Workflow Templates
+
+agentloop's existing primitives map cleanly onto the template concept:
+
+#### 5.1 Templates as Agent Profiles + Skills (recommended)
+
+A workflow template = **agent profile** (what tools, model, iteration budget) + **skill** (domain knowledge, step sequence, error heuristics).
+
+Example — `build-verify` profile (`src/agents/builtin/build-verify.agent.json`):
+```json
+{
+ "name": "build-verify",
+ "description": "Build verification agent — compiles the workspace and reports success or failure",
+ "temperature": 0.1,
+ "skills": ["build-verify"],
+ "tools": ["shell", "file-read", "file-list"],
+ "maxIterations": 10
+}
+```
+
+The paired `build-verify` skill (`src/skills/builtin/build-verify.skill.md`) injects:
+- Step sequence (identify build system → install deps → configure → compile → report)
+- Error triage heuristics (linker errors, missing headers, stale cache)
+- Parallelism flags per build tool
+
+The planner can then annotate a step with `"agentProfile": "build-verify"` and the orchestrator will activate the matching profile for that step — automatically binding the right skill, tool subset, and temperature.
+
+#### 5.2 Templates as Planner Context (workspace-aware instantiation)
+
+The planner prompt includes `workspaceInfo` fields, among them the detected lifecycle commands (`buildCommand`, `testCommand`, `lintCommand`). This allows the planner to produce **concrete, workspace-specific steps** in one shot:
+
+```
+Workspace: language=cmake, packageManager=cmake,
+ build="cmake --preset linux-release && cmake --build --preset linux-build",
+ test="ctest --preset linux-test"
+```
+
+The planner output then directly embeds the correct commands rather than using a generic placeholder.
+
+#### 5.3 Template Instantiation: Agent vs Static
+
+| Approach | When to use | Trade-offs |
+|---|---|---|
+| **Planner-time instantiation** (current) | Novel tasks, unknown workspaces | Flexible, adapts to workspace; requires LLM call |
+| **Profile+skill pre-configuration** (new) | Recurring workflows (CI, build-verify) | Fast, deterministic, version-controlled; less adaptive |
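+| **Hybrid** (recommended) | Multi-step tasks that contain recurring workflows: the planner decomposes the task and assigns a pre-defined profile to each step | Adaptive at the task level, deterministic within each step; still requires a planner LLM call |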
+
+The hybrid approach is already supported: the planner annotates `agentProfile` on steps, and the orchestrator activates the profile. Adding skills that encode the step sequence means the profile-activated agent "knows" the right procedure without the planner having to enumerate every sub-step.
+
+---
+
+### 6. Concrete Example: CMake Build-Verify Flow
+
+**Goal**: "Build the plugin to verify changes compile correctly"
+
+**Planner output** (with workspace context `build="cmake --preset linux-release && cmake --build --preset linux-build -j2"`):
+
+```json
+{
+ "steps": [
+ {
+ "description": "Install Linux build dependencies",
+ "toolsNeeded": ["shell"],
+ "estimatedComplexity": "low",
+ "agentProfile": "devops"
+ },
+ {
+ "description": "Update git submodules",
+ "toolsNeeded": ["shell"],
+ "estimatedComplexity": "low",
+ "agentProfile": "devops"
+ },
+ {
+ "description": "Build the project using cmake --preset linux-release && cmake --build --preset linux-build -j2 and report compiler output",
+ "toolsNeeded": ["shell"],
+ "estimatedComplexity": "medium",
+ "agentProfile": "build-verify"
+ }
+ ]
+}
+```
+
+The `build-verify` agent profile activates the `build-verify` skill, which provides the step sequence and error triage guidance. The concrete commands come from `workspaceInfo.buildCommand`, injected into the planner prompt.
+
+---
+
+### 7. Recommendations and Gaps Addressed
+
+| Gap | Solution implemented |
+|---|---|
+| Planner didn't know lifecycle commands | ✅ `buildPlannerTask` now includes `build`, `test`, `lint` commands from `WorkspaceInfo` |
+| Only Node/Python/Go workspace detection | ✅ Added CMake, Rust/Cargo, Gradle, Maven analyzers in `workspace.ts` |
+| No build-workflow agent profile | ✅ `build-verify.agent.json` and `test-runner.agent.json` added |
+| No build-workflow skill | ✅ `build-verify.skill.md` and `cmake-workflow.skill.md` added |
+
+### 8. Remaining Open Questions
+
+1. **Template registry**: Should templates be discoverable at runtime (e.g. `list-templates` tool) so the planner can reference them by name? The current profile registry partially serves this role.
+2. **Workspace-once vs task-every-time**: For expensive workspace analysis (submodule init, dependency install), should a "workspace setup" template run once at session start and cache results? This aligns with CrewAI's `before_kickoff` hook concept.
+3. **Multi-repo / monorepo**: `analyzeWorkspace` currently detects one build system per root. Monorepos with mixed build systems (e.g. a CMake C++ library + a Node.js frontend) need a recursive scan.
+4. **Template versioning**: When the workspace changes (new preset, renamed script), how are baked templates kept in sync? One option is to keep commands in `WorkspaceInfo` (auto-detected) rather than hard-coding them in profile prompts.
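
Review note on section 5.3 of `issues/2.md`: the hybrid approach can be sketched as a small resolution step that turns a plan step plus an optional profile into per-step runtime settings. This is a minimal sketch, not agentloop's orchestrator code. The `PlanStep`, `AgentProfile`, and `StepRuntime` shapes, the `resolveStepRuntime` name, the fallback temperature, and the budget numbers are all assumptions made for illustration.

```typescript
// Hypothetical shapes: the real PlanStep and profile types live in agentloop
// and may differ from this sketch.
interface PlanStep {
  description: string;
  toolsNeeded: string[];
  estimatedComplexity: "low" | "medium" | "high";
  agentProfile?: string;
}

interface AgentProfile {
  name: string;
  temperature: number;
  skills: string[];
  tools: string[];
  maxIterations: number;
}

interface StepRuntime {
  tools: string[];
  skills: string[];
  temperature: number;
  maxIterations: number;
}

// Assumed fallback budgets per complexity level; illustrative numbers only.
const DEFAULT_BUDGET: Record<PlanStep["estimatedComplexity"], number> = {
  low: 5,
  medium: 10,
  high: 20,
};

// Use the named profile when it is registered; otherwise fall back to the
// tools the planner listed and a complexity-derived iteration budget.
function resolveStepRuntime(
  step: PlanStep,
  profiles: Map<string, AgentProfile>,
): StepRuntime {
  const profile = step.agentProfile ? profiles.get(step.agentProfile) : undefined;
  if (!profile) {
    return {
      tools: step.toolsNeeded,
      skills: [],
      temperature: 0.2, // assumed default, not taken from agentloop
      maxIterations: DEFAULT_BUDGET[step.estimatedComplexity],
    };
  }
  return {
    tools: profile.tools,
    skills: profile.skills,
    temperature: profile.temperature,
    maxIterations: profile.maxIterations,
  };
}
```

With the `build-verify` profile from this patch registered in the map, the third step of the section 6 plan would resolve to the profile's tool subset and its `build-verify` skill, while a step naming an unregistered profile would keep the planner's `toolsNeeded`.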
diff --git a/src/__tests__/builtin-agent-profiles.test.ts b/src/__tests__/builtin-agent-profiles.test.ts
index ed5fa0d7..1078a492 100644
--- a/src/__tests__/builtin-agent-profiles.test.ts
+++ b/src/__tests__/builtin-agent-profiles.test.ts
@@ -24,8 +24,8 @@ beforeAll(async () => {
});
describe("builtin agent profiles", () => {
- it("loads exactly 5 builtin profiles", () => {
- expect(registry.list()).toHaveLength(5);
+ it("loads exactly 7 builtin profiles", () => {
+ expect(registry.list()).toHaveLength(7);
});
it("coder profile has name === 'coder' and model === 'gpt-4o'", () => {
diff --git a/src/__tests__/builtin-skills.test.ts b/src/__tests__/builtin-skills.test.ts
index 2d6e48d7..b2107d10 100644
--- a/src/__tests__/builtin-skills.test.ts
+++ b/src/__tests__/builtin-skills.test.ts
@@ -20,9 +20,11 @@ describe("built-in skill library", () => {
"test-writer",
"git-workflow",
"security-auditor",
+ "build-verify",
+ "cmake-workflow",
];
- it("loads all 5 built-in skills", () => {
+ it("loads all 7 built-in skills", () => {
const names = registry.list().map((s) => s.name);
for (const name of BUILTIN_NAMES) {
expect(names).toContain(name);
diff --git a/src/__tests__/fixtures/workspace-cargo/Cargo.toml b/src/__tests__/fixtures/workspace-cargo/Cargo.toml
new file mode 100644
index 00000000..965b5937
--- /dev/null
+++ b/src/__tests__/fixtures/workspace-cargo/Cargo.toml
@@ -0,0 +1,4 @@
+[package]
+name = "my-app"
+version = "0.1.0"
+edition = "2021"
diff --git a/src/__tests__/fixtures/workspace-cmake-presets/CMakeLists.txt b/src/__tests__/fixtures/workspace-cmake-presets/CMakeLists.txt
new file mode 100644
index 00000000..e69de29b
diff --git a/src/__tests__/fixtures/workspace-cmake-presets/CMakePresets.json b/src/__tests__/fixtures/workspace-cmake-presets/CMakePresets.json
new file mode 100644
index 00000000..62e6c2f0
--- /dev/null
+++ b/src/__tests__/fixtures/workspace-cmake-presets/CMakePresets.json
@@ -0,0 +1 @@
+{"version":3,"cmakeMinimumRequired":{"major":3,"minor":21},"configurePresets":[{"name":"default","binaryDir":"build"}],"buildPresets":[{"name":"default","configurePreset":"default"}],"testPresets":[{"name":"default","configurePreset":"default"}]}
diff --git a/src/__tests__/fixtures/workspace-cmake/CMakeLists.txt b/src/__tests__/fixtures/workspace-cmake/CMakeLists.txt
new file mode 100644
index 00000000..e69de29b
diff --git a/src/__tests__/fixtures/workspace-gradle-kotlin/build.gradle.kts b/src/__tests__/fixtures/workspace-gradle-kotlin/build.gradle.kts
new file mode 100644
index 00000000..5b1dae2a
--- /dev/null
+++ b/src/__tests__/fixtures/workspace-gradle-kotlin/build.gradle.kts
@@ -0,0 +1 @@
+plugins { kotlin("jvm") version "1.9.0" }
diff --git a/src/__tests__/fixtures/workspace-gradle/build.gradle b/src/__tests__/fixtures/workspace-gradle/build.gradle
new file mode 100644
index 00000000..b95276ac
--- /dev/null
+++ b/src/__tests__/fixtures/workspace-gradle/build.gradle
@@ -0,0 +1 @@
+plugins { id("java") }
diff --git a/src/__tests__/fixtures/workspace-maven/pom.xml b/src/__tests__/fixtures/workspace-maven/pom.xml
new file mode 100644
index 00000000..12ac61cc
--- /dev/null
+++ b/src/__tests__/fixtures/workspace-maven/pom.xml
@@ -0,0 +1 @@
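+<project><modelVersion>4.0.0</modelVersion><groupId>com.example</groupId><artifactId>my-app</artifactId><version>1.0</version></project>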
diff --git a/src/__tests__/workspace.test.ts b/src/__tests__/workspace.test.ts
index 718b6169..df50e200 100644
--- a/src/__tests__/workspace.test.ts
+++ b/src/__tests__/workspace.test.ts
@@ -118,3 +118,163 @@ describe("analyzeWorkspace — git detection", () => {
expect(info.gitInitialized).toBe(true);
});
});
+
+describe("analyzeWorkspace — Rust/Cargo project", () => {
+ const root = path.join(fixturesDir, "workspace-cargo");
+
+ let info: WorkspaceInfo;
+ beforeAll(async () => {
+ info = await analyzeWorkspace(root);
+ });
+
+ it("detects language as 'rust'", () => {
+ expect(info.language).toBe("rust");
+ });
+
+ it("uses 'cargo' as the package manager", () => {
+ expect(info.packageManager).toBe("cargo");
+ });
+
+ it("defaults the build command to 'cargo build'", () => {
+ expect(info.buildCommand).toBe("cargo build");
+ });
+
+ it("defaults the test command to 'cargo test'", () => {
+ expect(info.testCommand).toBe("cargo test");
+ });
+
+ it("defaults the lint command to 'cargo clippy'", () => {
+ expect(info.lintCommand).toBe("cargo clippy");
+ });
+
+ it("reports hasTests as true when a tests/ directory exists", () => {
+ expect(info.hasTests).toBe(true);
+ });
+});
+
+describe("analyzeWorkspace — CMake project (no presets)", () => {
+ const root = path.join(fixturesDir, "workspace-cmake");
+
+ let info: WorkspaceInfo;
+ beforeAll(async () => {
+ info = await analyzeWorkspace(root);
+ });
+
+ it("detects language as 'cmake'", () => {
+ expect(info.language).toBe("cmake");
+ });
+
+ it("uses 'cmake' as the package manager", () => {
+ expect(info.packageManager).toBe("cmake");
+ });
+
+ it("uses classic out-of-source build command when no presets file is present", () => {
+ expect(info.buildCommand).toBe("cmake -S . -B build && cmake --build build");
+ });
+
+ it("defaults the test command to ctest", () => {
+ expect(info.testCommand).toBe("ctest --output-on-failure");
+ });
+
+ it("reports hasTests as true when a tests/ directory exists", () => {
+ expect(info.hasTests).toBe(true);
+ });
+});
+
+describe("analyzeWorkspace — CMake project (with CMakePresets.json)", () => {
+ const root = path.join(fixturesDir, "workspace-cmake-presets");
+
+ let info: WorkspaceInfo;
+ beforeAll(async () => {
+ info = await analyzeWorkspace(root);
+ });
+
+ it("detects language as 'cmake'", () => {
+ expect(info.language).toBe("cmake");
+ });
+
+ it("uses preset-based build command when CMakePresets.json is present", () => {
+ expect(info.buildCommand).toBe(
+ "cmake --preset default && cmake --build --preset default"
+ );
+ });
+
+ it("uses preset-based test command when CMakePresets.json is present", () => {
+ expect(info.testCommand).toBe("ctest --preset default");
+ });
+});
+
+describe("analyzeWorkspace — Gradle (Java) project", () => {
+ const root = path.join(fixturesDir, "workspace-gradle");
+
+ let info: WorkspaceInfo;
+ beforeAll(async () => {
+ info = await analyzeWorkspace(root);
+ });
+
+ it("detects language as 'java'", () => {
+ expect(info.language).toBe("java");
+ });
+
+ it("uses 'gradle' as the package manager", () => {
+ expect(info.packageManager).toBe("gradle");
+ });
+
+ it("uses 'gradle build' as the build command (no gradlew wrapper)", () => {
+ expect(info.buildCommand).toBe("gradle build");
+ });
+
+ it("uses 'gradle test' as the test command", () => {
+ expect(info.testCommand).toBe("gradle test");
+ });
+
+ it("reports hasTests as true when src/test exists", () => {
+ expect(info.hasTests).toBe(true);
+ });
+});
+
+describe("analyzeWorkspace — Gradle (Kotlin DSL) project", () => {
+ const root = path.join(fixturesDir, "workspace-gradle-kotlin");
+
+ let info: WorkspaceInfo;
+ beforeAll(async () => {
+ info = await analyzeWorkspace(root);
+ });
+
+ it("detects language as 'kotlin'", () => {
+ expect(info.language).toBe("kotlin");
+ });
+
+ it("uses 'gradle' as the package manager", () => {
+ expect(info.packageManager).toBe("gradle");
+ });
+});
+
+describe("analyzeWorkspace — Maven project", () => {
+ const root = path.join(fixturesDir, "workspace-maven");
+
+ let info: WorkspaceInfo;
+ beforeAll(async () => {
+ info = await analyzeWorkspace(root);
+ });
+
+ it("detects language as 'java'", () => {
+ expect(info.language).toBe("java");
+ });
+
+ it("uses 'maven' as the package manager", () => {
+ expect(info.packageManager).toBe("maven");
+ });
+
+ it("uses 'mvn package -DskipTests' as the build command (no wrapper)", () => {
+ expect(info.buildCommand).toBe("mvn package -DskipTests");
+ });
+
+ it("uses 'mvn test' as the test command", () => {
+ expect(info.testCommand).toBe("mvn test");
+ });
+
+ it("reports hasTests as true when src/test exists", () => {
+ expect(info.hasTests).toBe(true);
+ });
+});
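
Review note on the fixtures exercised above: each one is identified by a single marker file in the workspace root. The dispatch inside `analyzeWorkspace` is not part of this patch excerpt, so the following is only a sketch of one way to map marker files to build systems. The `MARKERS` table, its ordering, and the `detectBuildSystem` name are hypothetical.

```typescript
// Hypothetical marker table; agentloop's real detection order is not shown
// in this patch and may differ.
type BuildSystem = "cargo" | "cmake" | "gradle" | "maven";

const MARKERS: ReadonlyArray<readonly [string, BuildSystem]> = [
  ["Cargo.toml", "cargo"],
  ["CMakeLists.txt", "cmake"],
  ["build.gradle", "gradle"],
  ["build.gradle.kts", "gradle"],
  ["pom.xml", "maven"],
];

// Takes the file names found in the workspace root and returns the first
// build system whose marker file is present, or undefined when none match.
function detectBuildSystem(rootEntries: readonly string[]): BuildSystem | undefined {
  const present = new Set(rootEntries);
  for (const [marker, system] of MARKERS) {
    if (present.has(marker)) return system;
  }
  return undefined;
}
```

Keeping detection as a pure function over a list of file names makes the priority order explicit and testable without touching the file system; the caller would pass in something like the result of `fs.readdir` on the root.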
diff --git a/src/agents/builtin/build-verify.agent.json b/src/agents/builtin/build-verify.agent.json
new file mode 100644
index 00000000..6fec92e5
--- /dev/null
+++ b/src/agents/builtin/build-verify.agent.json
@@ -0,0 +1,12 @@
+{
+ "name": "build-verify",
+ "description": "Build verification agent — compiles the workspace and reports success or failure with compiler diagnostics",
+ "version": "1.0.0",
+ "temperature": 0.1,
+ "skills": ["build-verify"],
+ "tools": ["shell", "file-read", "file-list"],
+ "maxIterations": 10,
+ "constraints": {
+ "requireConfirmation": []
+ }
+}
diff --git a/src/agents/builtin/test-runner.agent.json b/src/agents/builtin/test-runner.agent.json
new file mode 100644
index 00000000..64a992ea
--- /dev/null
+++ b/src/agents/builtin/test-runner.agent.json
@@ -0,0 +1,12 @@
+{
+ "name": "test-runner",
+ "description": "Test execution agent — runs the project test suite, reports failures, and suggests targeted fixes",
+ "version": "1.0.0",
+ "temperature": 0.2,
+ "skills": ["test-writer"],
+ "tools": ["shell", "file-read", "file-write", "file-edit", "file-list", "code-search"],
+ "maxIterations": 20,
+ "constraints": {
+ "requireConfirmation": []
+ }
+}
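
Review note on the two profiles above: both refer to skills by name (`build-verify` and `test-writer`), so a misspelled name would presumably only surface at activation time. A minimal consistency check could look like the sketch below. The `ProfileRef` shape and the `findMissingSkills` helper are hypothetical and do not correspond to an existing agentloop API.

```typescript
// Hypothetical minimal view of an agent profile: only the fields this
// check needs.
interface ProfileRef {
  name: string;
  skills: string[];
}

// Returns one "profile -> skill" entry for every skill reference that does
// not appear in the list of known skill names.
function findMissingSkills(
  profiles: readonly ProfileRef[],
  knownSkills: readonly string[],
): string[] {
  const known = new Set(knownSkills);
  const missing: string[] = [];
  for (const profile of profiles) {
    for (const skill of profile.skills) {
      if (!known.has(skill)) missing.push(`${profile.name} -> ${skill}`);
    }
  }
  return missing;
}
```

Run against the skill names listed in `builtin-skills.test.ts`, such a check would return an empty array for the two profiles added here.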
diff --git a/src/skills/builtin/build-verify.skill.md b/src/skills/builtin/build-verify.skill.md
new file mode 100644
index 00000000..699a76f5
--- /dev/null
+++ b/src/skills/builtin/build-verify.skill.md
@@ -0,0 +1,43 @@
+---
+name: build-verify
+description: Workflow guidance for verifying that a project compiles and links correctly
+version: 1.0.0
+slot: section
+---
+
+## Build Verification Workflow
+
+The goal of this workflow is to confirm the project compiles cleanly and to surface any errors with actionable context.
+
+### Step sequence
+
+1. **Identify the build system** — inspect the workspace root for `CMakeLists.txt`, `Cargo.toml`, `package.json`, `build.gradle`, or `pom.xml` to determine which build tool to invoke.
+2. **Install / update dependencies** — run the dependency installation step *before* building:
+ - CMake: `git submodule update --init --recursive` (if submodules present)
+ - Node: `npm ci` or `yarn install --frozen-lockfile`
+ - Rust: `cargo fetch`
+ - Gradle: `./gradlew dependencies` (optional)
+3. **Configure the build** (if required):
+ - CMake: `cmake -S . -B build [-DCMAKE_BUILD_TYPE=Release]` or `cmake --preset <preset>`
+ - Gradle: no separate configure step
+4. **Compile**:
+ - CMake: `cmake --build build [--parallel $(nproc)]` or `cmake --build --preset <preset>`
+ - Node: `npm run build`
+ - Rust: `cargo build [--release]`
+ - Gradle: `./gradlew assemble` (compile only, no tests)
+ - Maven: `mvn package -DskipTests`
+5. **Report** — emit a structured summary: overall status (success/failure), number of errors and warnings, and the first 20 lines of compiler output for failures.
+
+### Error triage heuristics
+
+- **Linker errors** (`undefined reference`, `unresolved symbol`): check `CMakeLists.txt` for missing `target_link_libraries` entries; for Gradle check `dependencies` block.
+- **Missing headers / imports**: confirm that all required packages are declared in the manifest and that dependency installation succeeded in step 2.
+- **Type / compilation errors** in generated code: regenerate protobuf, Thrift, or OpenAPI sources before building.
+- **Out-of-date build cache**: perform a clean build (`rm -rf build && cmake …` or `cargo clean && cargo build`) to rule out stale artifacts.
+
+### Parallel build flag
+
+When invoking multi-core builds, pass a parallelism flag to keep wall-clock time low:
+- CMake/Ninja: `--parallel $(nproc)` or `-j$(nproc)`
+- Maven: `-T 1C` (one thread per CPU core)
+- Gradle: `--parallel`
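
Review note on step 5 ("Report") of the skill above: the structured summary it describes could be produced by a helper along these lines. This is a sketch under stated assumptions. The `BuildSummary` shape and `summarizeBuildOutput` name are hypothetical, and the word-based matching of `error` and `warning` is a deliberately naive heuristic that will miscount on toolchains with other diagnostic formats.

```typescript
// Hypothetical report shape for the "Report" step.
interface BuildSummary {
  status: "success" | "failure";
  errors: number;
  warnings: number;
  excerpt: string[]; // first lines of output, only populated on failure
}

// Summarises raw build output. Counts lines mentioning "error" or "warning"
// as whole words (naive heuristic) and keeps the first `maxLines` non-empty
// lines when the build failed.
function summarizeBuildOutput(
  exitCode: number,
  output: string,
  maxLines = 20,
): BuildSummary {
  const lines = output.split(/\r?\n/).filter((line) => line.trim() !== "");
  const errors = lines.filter((line) => /\berror\b/i.test(line)).length;
  const warnings = lines.filter((line) => /\bwarning\b/i.test(line)).length;
  const failed = exitCode !== 0;
  return {
    status: failed ? "failure" : "success",
    errors,
    warnings,
    excerpt: failed ? lines.slice(0, maxLines) : [],
  };
}
```

The status is taken from the exit code rather than from the error count, since some tools print the word "error" in passing output and others fail without printing it.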
diff --git a/src/skills/builtin/cmake-workflow.skill.md b/src/skills/builtin/cmake-workflow.skill.md
new file mode 100644
index 00000000..856f0366
--- /dev/null
+++ b/src/skills/builtin/cmake-workflow.skill.md
@@ -0,0 +1,82 @@
+---
+name: cmake-workflow
+description: CMake-specific build, test, and packaging patterns including preset-based workflows
+version: 1.0.0
+slot: section
+---
+
+## CMake Workflow Guidelines
+
+### Project layout conventions
+
+- Source lives in `src/`; headers in `include/`; tests in `tests/` or `test/`.
+- Out-of-source builds go in `build/` (excluded from version control via `.gitignore`).
+- `CMakeLists.txt` at the repository root is the entry point; each subdirectory may have its own `CMakeLists.txt`.
+
+### Preset-based workflow (preferred when `CMakePresets.json` exists)
+
+```bash
+# Configure
+cmake --preset <preset> # e.g. linux-release, debug, ci
+
+# Build
+cmake --build --preset <preset> [--parallel $(nproc)]
+
+# Test
+ctest --preset <preset> [--output-on-failure]
+```
+
+List available presets:
+```bash
+cmake --list-presets # configure presets
+cmake --build --list-presets # build presets
+ctest --list-presets # test presets
+```
+
+### Classic out-of-source workflow (no presets)
+
+```bash
+# Configure (Release build, Ninja generator recommended)
+cmake -S . -B build -G Ninja -DCMAKE_BUILD_TYPE=Release
+
+# Build (parallel)
+cmake --build build --parallel $(nproc)
+
+# Test
+cd build && ctest --output-on-failure
+```
+
+### Dependency management
+
+- **Submodules**: always run `git submodule update --init --recursive` before configuring.
+- **find_package**: ensure system libraries are installed (e.g. `sudo apt install libssl-dev`).
+- **FetchContent / CPM.cmake**: dependencies are downloaded during configure; verify internet access or a local cache is available.
+- **vcpkg / Conan**: run `vcpkg install` or `conan install .` before `cmake -S . -B build`.
+
+### Install-step dependencies pattern
+
+When a project ships a dependency-installation script (e.g. `scripts/install-linux-deps.sh`), run it *before* the CMake configure step:
+
+```bash
+sudo bash scripts/install-linux-deps.sh
+git submodule update --init --recursive
+cmake --preset <preset>
+cmake --build --preset <preset> --parallel $(nproc)
+```
+
+### Common CMake variables
+
+| Variable | Purpose |
+|---|---|
+| `CMAKE_BUILD_TYPE` | `Debug`, `Release`, `RelWithDebInfo`, `MinSizeRel` |
+| `CMAKE_INSTALL_PREFIX` | Install destination (default `/usr/local`) |
+| `CMAKE_TOOLCHAIN_FILE` | Cross-compile or vcpkg toolchain |
+| `BUILD_SHARED_LIBS` | `ON` to build shared libraries by default |
+| `CMAKE_EXPORT_COMPILE_COMMANDS` | `ON` to generate `compile_commands.json` for tooling |
+
+### Diagnosing build failures
+
+1. Check the **configure step** output first — missing dependencies abort here.
+2. Look for the **first** error in compiler output; subsequent errors are often cascading.
+3. Enable verbose output to see exact compiler flags: `cmake --build build --verbose` or `VERBOSE=1 make`.
+4. Use the `--fresh` flag (CMake 3.24 or later) to force a clean reconfigure: `cmake --fresh --preset <preset>`.
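
Review note on the preset-based workflow above: the commands need real preset names, and `analyzeCMake` in this patch assumes a preset called `default`. One alternative is to read the names from `CMakePresets.json`. The sketch below is an assumption-laden illustration, not part of the patch: `commandsFromPresets` and `firstVisible` are hypothetical helpers, only a small subset of the presets schema is modelled, and picking the first non-hidden preset is an arbitrary policy.

```typescript
// Minimal subset of the CMakePresets.json schema used by this sketch.
interface Preset {
  name: string;
  hidden?: boolean;
}

interface CMakePresetsFile {
  configurePresets?: Preset[];
  buildPresets?: Preset[];
  testPresets?: Preset[];
}

interface PresetCommands {
  buildCommand?: string;
  testCommand?: string;
}

// Returns the name of the first preset that is not marked hidden, if any.
function firstVisible(presets: Preset[] | undefined): string | undefined {
  return presets?.find((preset) => !preset.hidden)?.name;
}

// Hypothetical helper: derives build and test commands from the contents
// of a CMakePresets.json file. Throws if the text is not valid JSON.
function commandsFromPresets(json: string): PresetCommands {
  const file = JSON.parse(json) as CMakePresetsFile;
  const configure = firstVisible(file.configurePresets);
  const build = firstVisible(file.buildPresets);
  const test = firstVisible(file.testPresets);

  const result: PresetCommands = {};
  if (configure && build) {
    result.buildCommand =
      `cmake --preset ${configure} && cmake --build --preset ${build}`;
  }
  if (test) result.testCommand = `ctest --preset ${test}`;
  return result;
}
```

For the `workspace-cmake-presets` fixture this yields the same strings the tests expect, and for a file with `linux-release`, `linux-build`, and `linux-test` presets it yields the commands quoted in `issues/2.md`.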
diff --git a/src/subagents/planner.ts b/src/subagents/planner.ts
index 82c77be1..624c6c56 100644
--- a/src/subagents/planner.ts
+++ b/src/subagents/planner.ts
@@ -83,8 +83,18 @@ function buildPlannerTask(
let result =
`Task: ${task}\n` +
`Workspace: language=${workspaceInfo.language}, framework=${workspaceInfo.framework}, ` +
- `packageManager=${workspaceInfo.packageManager}, gitInitialized=${workspaceInfo.gitInitialized}\n` +
- `Available tools: ${toolList}`;
+ `packageManager=${workspaceInfo.packageManager}, gitInitialized=${workspaceInfo.gitInitialized}`;
+
+ // Include lifecycle commands so the planner can generate concrete, workspace-specific steps
+ const lifecycleLines: string[] = [];
+ if (workspaceInfo.buildCommand) lifecycleLines.push(`build="${workspaceInfo.buildCommand}"`);
+ if (workspaceInfo.testCommand) lifecycleLines.push(`test="${workspaceInfo.testCommand}"`);
+ if (workspaceInfo.lintCommand) lifecycleLines.push(`lint="${workspaceInfo.lintCommand}"`);
+ if (lifecycleLines.length > 0) {
+ result += `, ${lifecycleLines.join(", ")}`;
+ }
+
+ result += `\nAvailable tools: ${toolList}`;
if (availableProfiles && availableProfiles.length > 0) {
const profileList = availableProfiles.map((p) => `${p.name}: ${p.description}`).join("; ");
result += `\nAvailable agent profiles: ${profileList}`;
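
Review note on the hunk above: the lifecycle commands are interpolated between plain double quotes, so a command that itself contains a double quote would produce an ambiguous prompt line. One possible mitigation is to escape each value with `JSON.stringify`. The sketch below is not the patch's code. `LifecycleInfo` and `formatLifecycle` are hypothetical names, and whether escaped quotes read better to the planner model is an open assumption.

```typescript
// Hypothetical subset of WorkspaceInfo holding only the lifecycle commands.
interface LifecycleInfo {
  buildCommand?: string;
  testCommand?: string;
  lintCommand?: string;
}

// Formats the non-empty lifecycle commands as key="value" pairs.
// JSON.stringify adds the surrounding quotes and escapes embedded ones.
function formatLifecycle(info: LifecycleInfo): string {
  const pairs: Array<[string, string | undefined]> = [
    ["build", info.buildCommand],
    ["test", info.testCommand],
    ["lint", info.lintCommand],
  ];
  return pairs
    .filter((pair): pair is [string, string] => Boolean(pair[1]))
    .map(([key, command]) => `${key}=${JSON.stringify(command)}`)
    .join(", ");
}
```

For commands without special characters the output matches the `build="..."` format used in the patch, and empty commands (such as the CMake analyzer's empty `lintCommand`) are skipped in the same way.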
diff --git a/src/workspace.ts b/src/workspace.ts
index 84b19524..9ce29008 100644
--- a/src/workspace.ts
+++ b/src/workspace.ts
@@ -3,11 +3,11 @@ import * as path from "path";
/** Structured information about the project workspace. */
export interface WorkspaceInfo {
- /** Primary language detected: 'node', 'python', 'go', or 'unknown'. */
+ /** Primary language detected: 'node', 'python', 'go', 'rust', 'cmake', 'java', 'kotlin', or 'unknown'. */
language: string;
/** Framework detected from dependencies (e.g. 'react', 'django'), or 'none'. */
framework: string;
- /** Package manager inferred from lock files or language (e.g. 'npm', 'pip'). */
+ /** Package manager inferred from lock files or language (e.g. 'npm', 'pip', 'cargo', 'gradle'). */
packageManager: string;
/** True if a test directory or test script was found. */
hasTests: boolean;
@@ -168,6 +168,115 @@ async function analyzeGo(rootPath: string): Promise<Partial<WorkspaceInfo>> {
return info;
}
+/**
+ * Analyse a Rust/Cargo workspace.
+ * Applies Cargo's default commands and checks for a `tests/` directory; Cargo.toml itself is not parsed.
+ */
+async function analyzeCargo(rootPath: string): Promise<Partial<WorkspaceInfo>> {
+ const info: Partial<WorkspaceInfo> = {
+ language: "rust",
+ packageManager: "cargo",
+ testCommand: "cargo test",
+ lintCommand: "cargo clippy",
+ buildCommand: "cargo build",
+ };
+
+ // Override defaults with Makefile targets when available
+ const make = await parseMakefileTargets(rootPath);
+ if (make["test"]) info.testCommand = make["test"];
+ if (make["lint"]) info.lintCommand = make["lint"];
+ if (make["build"]) info.buildCommand = make["build"];
+
+ // Consider tests present if a tests/ or src/tests/ directory exists (inline #[cfg(test)] modules are not detected)
+ info.hasTests =
+ (await exists(path.join(rootPath, "tests"))) ||
+ (await exists(path.join(rootPath, "src", "tests")));
+
+ return info;
+}
+
+/**
+ * Analyse a CMake workspace.
+ * Picks the preset-based commands when a CMakePresets.json file is present and
+ * the classic out-of-source commands otherwise; CMakeLists.txt itself is not parsed.
+ */
+async function analyzeCMake(rootPath: string): Promise<Partial<WorkspaceInfo>> {
+ const info: Partial<WorkspaceInfo> = {
+ language: "cmake",
+ packageManager: "cmake",
+ testCommand: "ctest --test-dir build --output-on-failure",
+ lintCommand: "",
+ buildCommand: "cmake --build build",
+ };
+
+ // When CMakePresets.json is present, recommend the preset-based workflow
+ if (await exists(path.join(rootPath, "CMakePresets.json"))) {
+ info.buildCommand = "cmake --preset default && cmake --build --preset default";
+ info.testCommand = "ctest --preset default";
+ } else {
+ // Classic out-of-source build pattern
+ info.buildCommand = "cmake -S . -B build && cmake --build build";
+ }
+
+ // Override with Makefile targets when available (common for CMake super-builds)
+ const make = await parseMakefileTargets(rootPath);
+ if (make["test"]) info.testCommand = make["test"];
+ if (make["build"]) info.buildCommand = make["build"];
+
+ // Detect tests by presence of a CTestTestfile, tests/ directory, or test subdirectory
+ info.hasTests =
+ (await exists(path.join(rootPath, "CTestTestfile.cmake"))) ||
+ (await exists(path.join(rootPath, "tests"))) ||
+ (await exists(path.join(rootPath, "test")));
+
+ return info;
+}
+
+/**
+ * Analyse a Gradle (Java/Kotlin/Android) workspace.
+ */
+async function analyzeGradle(rootPath: string): Promise<Partial<WorkspaceInfo>> {
+ // Prefer ./gradlew wrapper when present
+ const gradleCmd = (await exists(path.join(rootPath, "gradlew"))) ? "./gradlew" : "gradle";
+
+ const info: Partial<WorkspaceInfo> = {
+ language: "java",
+ packageManager: "gradle",
+ testCommand: `${gradleCmd} test`,
+ lintCommand: `${gradleCmd} check`,
+ buildCommand: `${gradleCmd} build`,
+ };
+
+ // Check for Kotlin DSL (build.gradle.kts) to refine the language label
+ if (await exists(path.join(rootPath, "build.gradle.kts"))) {
+ info.language = "kotlin";
+ }
+
+ info.hasTests = (await exists(path.join(rootPath, "src", "test")));
+
+ return info;
+}
+
+/**
+ * Analyse a Maven (Java) workspace.
+ */
+async function analyzeMaven(rootPath: string): Promise<Partial<WorkspaceInfo>> {
+ // Prefer ./mvnw wrapper when present
+ const mvnCmd = (await exists(path.join(rootPath, "mvnw"))) ? "./mvnw" : "mvn";
+
+ const info: Partial<WorkspaceInfo> = {
+ language: "java",
+ packageManager: "maven",
+ testCommand: `${mvnCmd} test`,
+ lintCommand: `${mvnCmd} verify`,
+ buildCommand: `${mvnCmd} package -DskipTests`,
+ };
+
+ info.hasTests = (await exists(path.join(rootPath, "src", "test")));
+
+ return info;
+}
+
/**
* Analyse the workspace rooted at `rootPath` and return a `WorkspaceInfo`
* object. Language is detected via well-known indicator files; commands are
@@ -201,6 +310,17 @@ export async function analyzeWorkspace(rootPath: string): Promise
langInfo = await analyzePython(rootPath);
} else if (await exists(path.join(rootPath, "go.mod"))) {
langInfo = await analyzeGo(rootPath);
+ } else if (await exists(path.join(rootPath, "Cargo.toml"))) {
+ langInfo = await analyzeCargo(rootPath);
+ } else if (await exists(path.join(rootPath, "CMakeLists.txt"))) {
+ langInfo = await analyzeCMake(rootPath);
+ } else if (
+ (await exists(path.join(rootPath, "build.gradle"))) ||
+ (await exists(path.join(rootPath, "build.gradle.kts")))
+ ) {
+ langInfo = await analyzeGradle(rootPath);
+ } else if (await exists(path.join(rootPath, "pom.xml"))) {
+ langInfo = await analyzeMaven(rootPath);
}
return { ...base, ...langInfo };
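The `else if` chain above resolves a workspace to the first indicator file it finds, so the order of the branches decides which build system wins when several indicators coexist. A minimal table-driven sketch of that precedence for the four analyzers added in this patch follows; `DETECTORS`, `detectBuildSystem`, and `wrapperOr` are illustrative names that do not appear in `src/workspace.ts`, and the file list is passed in as an array so the example needs no filesystem access.

```typescript
// Sketch only: all names below are assumptions made for illustration.
type BuildSystem = "cargo" | "cmake" | "gradle" | "maven" | "unknown";

interface Detector {
  system: BuildSystem;
  indicators: string[];
}

// Order mirrors the else-if chain in analyzeWorkspace(): the first match wins.
const DETECTORS: Detector[] = [
  { system: "cargo", indicators: ["Cargo.toml"] },
  { system: "cmake", indicators: ["CMakeLists.txt"] },
  { system: "gradle", indicators: ["build.gradle", "build.gradle.kts"] },
  { system: "maven", indicators: ["pom.xml"] },
];

function hasFile(rootFiles: string[], name: string): boolean {
  return rootFiles.indexOf(name) !== -1;
}

// Return the first build system whose indicator file is present in the root.
function detectBuildSystem(rootFiles: string[]): BuildSystem {
  for (const detector of DETECTORS) {
    if (detector.indicators.some((file) => hasFile(rootFiles, file))) {
      return detector.system;
    }
  }
  return "unknown";
}

// Prefer a checked-in wrapper script (gradlew, mvnw) over the global binary.
function wrapperOr(rootFiles: string[], wrapper: string, fallback: string): string {
  return hasFile(rootFiles, wrapper) ? `./${wrapper}` : fallback;
}
```

With this ordering, a root that contains both `CMakeLists.txt` and `pom.xml` is reported as CMake. The earlier Node, Python, and Go checks from the unchanged part of the function are left out of the sketch.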
From 3e615c933263fb61eaf8f8d9917aeede284cd30c Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Mon, 30 Mar 2026 13:25:37 +0000
Subject: [PATCH 3/3] =?UTF-8?q?refactor:=20rewrite=20research=20doc=20to?=
=?UTF-8?q?=20focus=20on=20dynamic=20intent=E2=86=92action=20transformatio?=
=?UTF-8?q?n,=20remove=20cmake-specific=20skill?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Agent-Logs-Url: https://github.com/huberp/agentloop/sessions/d37e4f03-c06f-4113-a6da-f608dbd6b3f4
Co-authored-by: huberp <4027454+huberp@users.noreply.github.com>
---
issues/2.md | 400 +++++++++++-------
src/__tests__/builtin-skills.test.ts | 3 +-
.../fixtures/workspace-cargo/tests/.keep | 0
.../fixtures/workspace-cmake/tests/.keep | 0
.../workspace-gradle-kotlin/src/test/.keep | 0
.../fixtures/workspace-gradle/src/test/.keep | 0
.../fixtures/workspace-maven/src/test/.keep | 0
src/skills/builtin/cmake-workflow.skill.md | 82 ----
8 files changed, 246 insertions(+), 239 deletions(-)
create mode 100644 src/__tests__/fixtures/workspace-cargo/tests/.keep
create mode 100644 src/__tests__/fixtures/workspace-cmake/tests/.keep
create mode 100644 src/__tests__/fixtures/workspace-gradle-kotlin/src/test/.keep
create mode 100644 src/__tests__/fixtures/workspace-gradle/src/test/.keep
create mode 100644 src/__tests__/fixtures/workspace-maven/src/test/.keep
delete mode 100644 src/skills/builtin/cmake-workflow.skill.md
diff --git a/issues/2.md b/issues/2.md
index 402e5406..67177cac 100644
--- a/issues/2.md
+++ b/issues/2.md
@@ -1,208 +1,298 @@
-## Research: Task Planner and Task Decomposition for Coding Agents
+## Research: Intent-to-Action Transformation — How a Generic Workflow Step Becomes Concrete
-### 1. Problem Statement
+### 1. The Core Problem
-Modern AI coding agents need to tackle tasks that span multiple steps, require diverse tools, and benefit from specialised domain knowledge at each step. The key challenge is **task decomposition**: how should a high-level goal (e.g. "add input validation to all POST handlers") be broken into concrete, executable steps that an agent can carry out reliably?
+A coding agent receives a generic intent such as **"verify-build"**. This is a template name
+that means "compile the project and confirm whether the build succeeds or fails". But the
+_concrete steps_ vary entirely by workspace:
-A related challenge is **workflow templates**: many coding workflows (build, test, lint, release) have a fixed *shape* but vary in their concrete commands depending on the workspace. Can these patterns be captured once and reused across projects?
+- For a CMake project with presets: `cmake --preset linux-release && cmake --build --preset linux-build -j2`
+- For a Node.js project: `npm ci && npm run build`
+- For a Rust project: `cargo build`
+- For a Gradle project: `./gradlew assemble`
----
+The question is: **what is the correct point in the machinery to perform this transformation,
+and which components are responsible for deriving the concrete steps?**
-### 2. Baseline: The `plan-and-run` Loop in agentloop
+---
-agentloop already ships a layered planning architecture:
+### 2. What Must NOT Happen — No Hardcoded Instantiation
-| Component | Location | Role |
-|---|---|---|
-| `generatePlan` | `src/subagents/planner.ts` | LLM-powered decomposition of a goal into `PlanStep[]` |
-| `refinePlan` | `src/subagents/planner.ts` | Corrects a plan that references unknown tools |
-| `validatePlan` | `src/subagents/planner.ts` | Checks that all tool names in the plan are registered |
-| `executePlan` | `src/orchestrator.ts` | Runs steps in sequence; supports `retry/skip/abort` on failure and checkpoint/resume |
-| `plan` tool | `src/tools/plan.ts` | Exposes plan generation to the agent loop as a callable tool |
-| `plan-and-run` tool | `src/tools/plan-and-run.ts` | Combines generation + execution in one tool call |
+The transformation must not be done by pre-wiring cmake commands (or any other build-system
+commands) into static configuration files. A hardcoded solution:
-The planner runs as a **tool-free subagent**: it receives workspace context and a list of available tools, then returns a JSON plan. The orchestrator dispatches each step to a `runSubagent` call (or `SubagentManager` for complex steps) with an iteration budget derived from `estimatedComplexity`.
+- Cannot adapt when a project changes its build system or adds presets.
+- Does not scale across projects or workspaces.
+- Defeats the purpose of a coding agent that is supposed to _reason_ about its environment.
-**Key insight already present in agentloop**: each `PlanStep` carries an optional `agentProfile` field. The orchestrator activates the named profile (model, temperature, tool subset) for that step. This enables per-step specialisation without a separate orchestration framework.
+The transformation must be agent-driven and workspace-aware at runtime.
---
-### 3. What Other Frameworks Do
-
-#### 3.1 LangGraph (LangChain)
-- Models agent behaviour as a **directed graph** of nodes (LLM calls, tool calls) with conditional edges.
-- Supports cycles (retry loops), parallel fan-out/fan-in, and human-in-the-loop interrupts.
-- Templates are *graph patterns* stored as reusable subgraphs.
-- Complexity: full graph authoring required for every new workflow shape.
-
-#### 3.2 AutoGen (Microsoft)
-- Multi-agent conversation: a **Planner** agent, a **Coder** agent, and an **Executor** agent exchange messages until the task is done.
-- Task decomposition happens in natural language — the Planner emits step descriptions that the Coder implements.
-- Workflow templates are **system prompts** for each role, often provided in a configuration YAML.
-- Strength: easy to add domain-expert agents. Weakness: conversation history grows rapidly; quality depends on message-passing discipline.
-
-#### 3.3 CrewAI
-- Defines **Crew** (team of agents), **Agents** (role + backstory + tools), and **Tasks** (description + expected output + dependencies).
-- Supports sequential and hierarchical execution; tasks can pass their output as context to dependent tasks.
-- Workflow templates are *Crew + Task YAML configurations* that can be parameterised and re-instantiated for different inputs.
-- Strong alignment with the "workflow template" concept in this issue: a `BuildVerifyCrew` YAML is a reusable template instantiated per workspace.
-
-#### 3.4 OpenAI Assistants + Structured Outputs
-- Persistent thread context allows multi-turn tasks without re-injecting history.
-- `run_step` objects provide a built-in audit trail of each tool call and its output.
-- Templates are **Assistant instructions** (system prompt) combined with few-shot examples in the thread.
-- Limitation: tied to the OpenAI API; no local model support.
-
-#### 3.5 Copilot Coding Agent (GitHub Copilot)
-The example in this issue shows a subtask with:
-```json
-{
- "name": "build-verify",
- "agent_type": "task",
- "description": "Build the plugin to verify changes",
- "prompt": "...\nSteps:\n1. Run: sudo bash scripts/install-linux-deps.sh\n2. Run: git submodule update --init --recursive\n3. Run: cmake --preset linux-release\n4. Run: cmake --build --preset linux-build -j2\n\nReport whether the build succeeded or failed..."
-}
+### 3. Current agentloop Machinery and the Transformation Points
+
+agentloop already has the primitives for this transformation. The pipeline is:
+
+```
+Generic intent: "verify-build"
+ │
+ ▼
+[1] analyzeWorkspace() ← detects build system, extracts lifecycle commands
+ │ WorkspaceInfo { language="cmake", buildCommand="cmake --preset…", … }
+ ▼
+[2] generatePlan() / planner ← LLM reasons: intent + workspace → concrete PlanStep[]
+ │ PlanStep { description="Run cmake --preset linux-release …",
+ │ toolsNeeded=["shell"], agentProfile="build-verify" }
+ ▼
+[3] executePlan() / orchestrator ← activates per-step agent profile, runs subagent
+ │ build-verify profile → shell tool, low temperature, build-verify skill
+ ▼
+[4] StepResult { output="…compiler output…", status="success"|"failed" }
```
-Key observations:
-- The template name (`build-verify`) is **stable and reusable** across tasks.
-- The concrete steps (cmake preset names, script paths) are **workspace-specific** and were derived from workspace knowledge.
-- Instantiation happens once, at workspace-setup time — not re-derived on every task.
-- This is equivalent to agentloop's `agentProfile` + `skill` combination, but the steps are baked into the prompt string rather than generated by a planner.
+#### Step [1] — `analyzeWorkspace()` in `src/workspace.ts`
----
+This is the **workspace probe**. It inspects the repository root for indicator files
+(`CMakeLists.txt`, `Cargo.toml`, `package.json`, `build.gradle`, `pom.xml`, etc.) and extracts
+the concrete lifecycle commands (`buildCommand`, `testCommand`, `lintCommand`). The result is a
+`WorkspaceInfo` object — the source of truth for what commands the workspace actually uses.
-### 4. Template Taxonomy for Coding Agents
+This is the right place for build-system detection: once, per session, before planning.
+
+#### Step [2] — `generatePlan()` / `buildPlannerTask()` in `src/subagents/planner.ts`
+
+The planner LLM receives the generic intent plus the `WorkspaceInfo` context (including
+detected lifecycle commands) and reasons about what concrete steps to produce. This is the
+**intent-to-steps transformation**:
+
+```
+Task: verify-build
+Workspace: language=cmake, packageManager=cmake,
+ build="cmake --preset linux-release && cmake --build --preset linux-build",
+ test="ctest --preset default"
+Available tools: shell, file-read, file-list
+```
-Based on the above analysis, coding workflow templates fall into three categories:
+The LLM returns a plan with concrete step descriptions drawn from the workspace context. No
+static configuration is needed — the planner derives the steps dynamically from what
+`analyzeWorkspace()` found.
-#### Category A — Build Lifecycle Templates
-Fixed structure, workspace-specific commands:
+The planner can also annotate each step with an `agentProfile`, directing the orchestrator to
+activate a specialised agent (e.g. `build-verify`) for that step.
-| Template | Shape | Workspace-specific parts |
-|---|---|---|
-| `build-verify` | configure → compile → report | preset name, script paths, parallelism flag |
-| `clean-build` | clean → configure → compile | build directory, preset/profile |
-| `release-package` | build → test → package → sign | packaging format, signing key |
+#### Step [3] — `executePlan()` / `runStep()` in `src/orchestrator.ts`
-#### Category B — Quality Gate Templates
-Fixed checklist, tool-specific commands:
+The orchestrator executes each step as a `runSubagent` call. When a step carries an
+`agentProfile` annotation, `activateProfile()` loads the profile (tools, model, temperature,
+skills). The agent then has both the **concrete step description** (from the planner) and the
+**domain guidance** (from its skill) to execute reliably.
-| Template | Shape | Workspace-specific parts |
-|---|---|---|
-| `test-and-fix` | run tests → parse failures → locate code → fix → re-run | test runner command, test output format |
-| `lint-and-format` | run linter → parse output → apply fixes → re-verify | linter binary, fix flags |
-| `security-scan` | run scanner → parse findings → generate report | scanner CLI, severity threshold |
+---
-#### Category C — Development Workflow Templates
-Higher-level patterns:
+### 4. Which Interaction Patterns from the Baseline Research Are Essential
-| Template | Shape | Notes |
-|---|---|---|
-| `feature-branch` | branch → implement → test → PR | Uses git tools + planner |
-| `dependency-update` | audit → update → test → commit | Integrates vulnerability check |
-| `hotfix` | branch from tag → apply fix → test → backport | Requires git-log, cherry-pick |
+The baseline branch (`copilot/research-agent-fws`) identified eight gaps in agentloop. For
+intent-to-action transformation, three are directly essential (4.1 to 4.3) and one plays a supporting role (4.4):
---
-### 5. How agentloop Can Implement Workflow Templates
+#### 4.1 Plan-Execute-Verify Loop (Baseline Issue 3) — **Critical**
-agentloop's existing primitives map cleanly onto the template concept:
+**Why it matters for "verify-build"**: The word "verify" in the intent means the agent must
+_confirm_ that the build succeeded — not merely that the build process ran without throwing an
+exception. Today, `executePlan()` marks a step as `status: "success"` as soon as the subagent
+returns without throwing. A build that silently failed (zero exit code but wrong output, a
+`make` that skipped targets, a test that passed vacuously) is indistinguishable from a correct
+build.
-#### 5.1 Templates as Agent Profiles + Skills (recommended)
+**The missing piece**: A `VerificationAgent` (proposed in Issue 3) runs after each step and
+produces a structured `VerificationResult { passed, reasoning, issues[] }`. For a build step,
+the verifier checks: "Does the output contain evidence of a successful compilation? Are there
+error messages? Is the binary present?"
-A workflow template = **agent profile** (what tools, model, iteration budget) + **skill** (domain knowledge, step sequence, error heuristics).
+**Dynamic replanning on failure**: When the verifier flags the build as failed, the system
+calls `refinePlan()` with the verifier's feedback (e.g., "missing dependency X"). The
+orchestrator replaces the remaining steps with a revised plan that installs the dependency and
+retries the build. This is the essential self-correction loop for "verify-build".
-Example — `build-verify` profile (`src/agents/builtin/build-verify.agent.json`):
-```json
-{
- "name": "build-verify",
- "description": "Build verification agent — compiles the workspace and reports success or failure",
- "temperature": 0.1,
- "skills": ["build-verify"],
- "tools": ["shell", "file-read", "file-list"],
- "maxIterations": 10
-}
+**Interaction pattern** (from Issue 3):
```
+executePlan()
+ └─ for each step:
+ ├─ runStep() ← executes the build command
+ ├─ verifyStep() ← checks build output for success/failure
+ │ ├─ pass → next step
+ │ └─ fail → refinePlan(feedback) → re-execute
+ └─ checkpoint.save()
+```
+
+**Without this pattern**, a "verify-build" intent can only execute the build — it cannot
+actually verify the outcome.
-The paired `build-verify` skill (`src/skills/builtin/build-verify.skill.md`) injects:
-- Step sequence (identify build system → install deps → configure → compile → report)
-- Error triage heuristics (linker errors, missing headers, stale cache)
-- Parallelism flags per build tool
+---
-The planner can then annotate a step with `"agentProfile": "build-verify"` and the orchestrator will activate the matching profile for that step — automatically binding the right skill, tool subset, and temperature.
+#### 4.2 Dynamic Task Decomposition (Baseline Issue 4) — **Important**
-#### 5.2 Templates as Planner Context (workspace-aware instantiation)
+**Why it matters**: The "verify-build" intent may require sub-steps that cannot be known at
+planning time. For example:
+- The planner produces a step "run the build"
+- During execution, the build fails with "submodules not initialised"
+- The agent needs to inject a sub-step "git submodule update --init --recursive" _before_
+ retrying the build
-The planner prompt includes `workspaceInfo` fields including the detected lifecycle commands (`buildCommand`, `testCommand`, `lintCommand`). This allows the planner to produce **concrete, workspace-specific steps** in one shot:
+This is addressed by **Dynamic Task Decomposition** (Issue 4, section 4): a complex step can
+call a `decompose_task` tool at runtime to inject new sub-steps immediately after the current
+step. The orchestrator's `executePlan()` maintains a mutable steps list and inserts the new
+steps in-place.
+**Interaction pattern**:
```
-Workspace: language=cmake, packageManager=cmake,
- build="cmake --preset linux-release && cmake --build --preset linux-build",
- test="ctest --preset linux-test"
+executePlan()
+ ├─ steps = [... mutable list ...]
+ └─ step i: "Run build"
+ └─ subagent calls decompose_task({newSteps: [
+ { description: "Init submodules", … },
+ { description: "Re-run build", … }
+ ]})
+ → steps[i+1..] = [init-submodules, re-run-build, ...original-remaining-steps]
```
-The planner output then directly embeds the correct commands rather than using a generic placeholder.
+**Without this pattern**, intent-to-action transformation is only as good as the planner's
+initial plan. When the environment deviates from expectations (missing deps, wrong tool
+version, first-time setup required), the agent has no way to adapt mid-execution.
-#### 5.3 Template Instantiation: Agent vs Static
+---
-| Approach | When to use | Trade-offs |
-|---|---|---|
-| **Planner-time instantiation** (current) | Novel tasks, unknown workspaces | Flexible, adapts to workspace; requires LLM call |
-| **Profile+skill pre-configuration** (new) | Recurring workflows (CI, build-verify) | Fast, deterministic, version-controlled; less adaptive |
-| **Hybrid** (recommended) | Plan overall task, but use pre-defined profiles per step | Best of both worlds |
+#### 4.3 Hierarchical Delegation (Baseline Issue 4) — **Architectural**
-The hybrid approach is already supported: the planner annotates `agentProfile` on steps, and the orchestrator activates the profile. Adding skills that encode the step sequence means the profile-activated agent "knows" the right procedure without the planner having to enumerate every sub-step.
+**Why it matters**: At a higher level of organisation, a _coordinator agent_ can receive the
+"verify-build" intent and delegate workspace analysis and step instantiation to a child agent.
+This is the **Hierarchical pattern** from Issue 4.
+
+**Interaction pattern**:
+```
+Coordinator receives: "verify-build"
+ └─ calls delegate_subagent("workspace-analyst")
+ └─ workspace-analyst: reads workspace, returns WorkspaceInfo + recommended steps
+ └─ coordinator constructs a plan from the returned recommendations
+ └─ calls delegate_subagent("build-verify") with concrete step descriptions
+ └─ build-verify agent: executes build, returns structured result
+ └─ coordinator synthesises final report
+```
+
+This pattern separates concerns cleanly: the coordinator holds the intent, the workspace
+analyst provides grounding, and the build-verify agent executes. Today's planner partially
+plays the coordinator role, but it cannot delegate to a workspace analyst because it is a
+tool-free subagent that only outputs JSON. Hierarchical delegation would allow the planner to
+_actively probe_ the workspace via tool calls rather than relying on pre-computed
+`WorkspaceInfo`.
---
-### 6. Concrete Example: CMake Build-Verify Flow
-
-**Goal**: "Build the plugin to verify changes compile correctly"
-
-**Planner output** (with workspace context `build="cmake --preset linux-release && cmake --build --preset linux-build -j2"`):
-
-```json
-{
- "steps": [
- {
- "description": "Install Linux build dependencies",
- "toolsNeeded": ["shell"],
- "estimatedComplexity": "low",
- "agentProfile": "devops"
- },
- {
- "description": "Update git submodules",
- "toolsNeeded": ["shell"],
- "estimatedComplexity": "low",
- "agentProfile": "devops"
- },
- {
- "description": "Build the project using cmake --preset linux-release && cmake --build --preset linux-build -j2 and report compiler output",
- "toolsNeeded": ["shell"],
- "estimatedComplexity": "medium",
- "agentProfile": "build-verify"
- }
- ]
-}
-```
+#### 4.4 Toolbox Refiner (Baseline Issue 5) — **Supporting**
-The `build-verify` agent profile activates the `build-verify` skill, which provides the step sequence and error triage guidance. The concrete commands come from `workspaceInfo.buildCommand`, injected into the planner prompt.
+**Why it matters**: The build-verify agent only needs `shell`, `file-read`, and `file-list`.
+Exposing all 16+ registered tools dilutes the agent's focus and wastes context budget. The
+**Toolbox Refiner** (Issue 5) dynamically narrows the exposed tool set per invocation based on
+the step's declared `toolsNeeded` list and the task description.
+
+This is already partially addressed by the `tools[]` list in agent profiles.
+The Toolbox Refiner would make the selection dynamic (keyword or embedding matching) rather
+than requiring a manually maintained allowlist per profile.
---
-### 7. Recommendations and Gaps Addressed
+### 5. The Role of Templates in the Dynamic System
+
+Templates (agent profiles + skills) play a supporting role — they are **not** the source of
+concrete steps. Their actual function is:
-| Gap | Solution implemented |
+| Template element | Role |
|---|---|
-| Planner didn't know lifecycle commands | ✅ `buildPlannerTask` now includes `build`, `test`, `lint` commands from `WorkspaceInfo` |
-| Only Node/Python/Go workspace detection | ✅ Added CMake, Rust/Cargo, Gradle, Maven analyzers in `workspace.ts` |
-| No build-workflow agent profile | ✅ `build-verify.agent.json` and `test-runner.agent.json` added |
-| No build-workflow skill | ✅ `build-verify.skill.md` and `cmake-workflow.skill.md` added |
+| Agent profile (`tools`, `temperature`, `maxIterations`) | Shapes the execution environment for a step |
+| Skill (`promptFragment`) | Provides domain guidance to the agent running the step — what to look for, what errors mean, how to report |
+
+The **concrete steps** always come from the planner, which derives them from:
+1. The generic intent ("verify-build")
+2. The workspace context (`WorkspaceInfo` from `analyzeWorkspace()`)
+3. The available agent profiles (the planner can annotate `agentProfile` per step)
+
+A `build-verify` profile + skill gives the executing agent the knowledge to:
+- Identify which build system is in use (from the workspace `language` field)
+- Interpret compiler output (error triage heuristics in the skill)
+- Produce a structured success/failure report
+
+But the specific commands to run come from the workspace analysis, injected into the planner
+context at planning time.
+
+---
+
+### 6. Recommended Interaction Pattern: Full "verify-build" Flow
-### 8. Remaining Open Questions
+Combining the above, the complete agent-driven "verify-build" flow using agentloop components:
-1. **Template registry**: Should templates be discoverable at runtime (e.g. `list-templates` tool) so the planner can reference them by name? The current profile registry partially serves this role.
-2. **Workspace-once vs task-every-time**: For expensive workspace analysis (submodule init, dependency install), should a "workspace setup" template run once at session start and cache results? This aligns with CrewAI's `before_kickoff` hook concept.
-3. **Multi-repo / monorepo**: `analyzeWorkspace` currently detects one build system per root. Monorepos with mixed build systems (e.g. a CMake C++ library + a Node.js frontend) need a recursive scan.
-4. **Template versioning**: When the workspace changes (new preset, renamed script), how are baked templates kept in sync? A solution is to keep commands in `WorkspaceInfo` (auto-detected) rather than hard-coding them in profile prompts.
+```
+User: "verify-build"
+ │
+ ▼
+[A] analyzeWorkspace(rootPath)
+ → WorkspaceInfo { buildCommand="cmake --preset …", language="cmake", … }
+ │
+ ▼
+[B] generatePlan("verify-build", workspaceInfo, registry, profileRegistry)
+ → Plan {
+ steps: [
+ { description: "Run: cmake --preset …",
+ toolsNeeded: ["shell"], agentProfile: "build-verify" },
+ { description: "Report build result",
+ toolsNeeded: [], agentProfile: "build-verify" }
+ ]
+ }
+ │
+ ▼
+[C] executePlan(plan, registry, { verificationEnabled: true, task: "verify-build", workspaceInfo })
+ │
+ ├─ step 0: runStep() → shell("cmake --preset …") → output
+ │ verifyStep() → VerificationResult { passed, reasoning, issues }
+ │ └─ fail? → refinePlan(feedback) → re-execute
+ │
+ └─ step 1: runStep() → agent synthesises report from step 0 output
+ verifyStep() → confirm report contains success/failure conclusion
+ │
+ ▼
+ExecutionResult { stepResults, success, verificationResults }
+```
+
+The key properties of this flow:
+- **Generic intent, concrete execution**: "verify-build" is never mapped to cmake commands in
+ config — the planner derives them from workspace analysis.
+- **Self-correcting**: the PEV loop (Issue 3) catches silent failures and replans.
+- **Extensible**: adding support for a new build system requires only updating
+ `analyzeWorkspace()` — no profile or template changes needed.
+- **Composable**: the same flow applies to "run-tests", "lint", or any other lifecycle intent.
+
+---
+
+### 7. Gap Summary Relative to Baseline Research
+
+| Baseline Issue | Pattern | Essential for "verify-build"? | Current status |
+|---|---|---|---|
+| Issue 3 | Plan-Execute-Verify loop | ✅ Critical — without it, "verify" is just "run" | ❌ Not yet implemented |
+| Issue 3 | Dynamic replanning on verification failure | ✅ Critical — enables self-correction | ❌ Not yet implemented |
+| Issue 4 | Dynamic task decomposition | ✅ Important — handles mid-execution surprises | ❌ Not yet implemented |
+| Issue 4 | Hierarchical delegation | 🔶 Architectural — enables active workspace probing | ❌ Not yet implemented |
+| Issue 5 | Toolbox Refiner | 🔶 Supporting — reduces noise in build agent | ❌ Not yet implemented |
+| Issue 2 | Persistent memory | 🔶 Optional — cache workspace analysis across sessions | ❌ Not yet implemented |
+
+### 8. What Has Been Improved in This PR
+
+| Change | Effect |
+|---|---|
+| `analyzeWorkspace()` now detects CMake, Cargo, Gradle, Maven | Workspace analysis returns concrete lifecycle commands for more build systems |
+| `buildPlannerTask()` includes `buildCommand`/`testCommand`/`lintCommand` | Planner receives concrete command strings → produces workspace-specific plan steps without hardcoding |
+| `build-verify` and `test-runner` agent profiles | Execution environment for build/test steps — define which tools and temperature are appropriate |
+| `build-verify` skill | Domain guidance injected into the build agent — how to identify the build system, interpret output, triage errors |
+
+These improvements advance Step [1] (workspace analysis) and Step [2] (planner context) of the
+transformation pipeline. Verification and dynamic replanning, which sit between Step [3]
+(execution) and Step [4] (result), still require the Plan-Execute-Verify loop from Issue 3.
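The Plan-Execute-Verify loop that section 4.1 of the research document marks as not yet implemented could look roughly like the sketch below. It is synchronous and takes the runner, verifier, and refiner as injected functions purely to stay self-contained. `executeWithVerification` and its parameter types are assumptions for illustration, not the actual `executePlan()` signature. Only the `VerificationResult { passed, reasoning, issues }` shape is taken from the document.

```typescript
// Shape taken from issues/2.md; everything else here is a hypothetical sketch.
interface VerificationResult {
  passed: boolean;
  reasoning: string;
  issues: string[];
}

interface Step {
  description: string;
}

interface StepOutcome {
  step: Step;
  output: string;
  verification: VerificationResult;
}

type RunStep = (step: Step) => string;
type VerifyStep = (step: Step, output: string) => VerificationResult;
type RefinePlan = (remaining: Step[], feedback: VerificationResult) => Step[];

function executeWithVerification(
  plan: Step[],
  run: RunStep,
  verify: VerifyStep,
  refine: RefinePlan,
  maxReplans: number = 2,
): { success: boolean; outcomes: StepOutcome[] } {
  const steps = plan.slice(); // mutable copy, so replanning never touches the caller's plan
  const outcomes: StepOutcome[] = [];
  let replans = 0;
  let i = 0;
  while (i < steps.length) {
    const step = steps[i];
    const output = run(step);
    const verification = verify(step, output);
    outcomes.push({ step, output, verification });
    if (verification.passed) {
      i++;
      continue;
    }
    // Bound the number of replans so a persistent failure cannot loop forever.
    if (replans >= maxReplans) {
      return { success: false, outcomes };
    }
    replans++;
    // Replace the failed step and everything after it with the revised plan.
    const revised = refine(steps.slice(i), verification);
    steps.splice(i, steps.length - i, ...revised);
  }
  return { success: true, outcomes };
}
```

A step only counts as successful once the verifier accepts its output, which is the difference between "run the build" and "verify the build" that the document stresses. The real orchestrator is asynchronous and checkpoints after each step; both aspects are omitted here.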
diff --git a/src/__tests__/builtin-skills.test.ts b/src/__tests__/builtin-skills.test.ts
index b2107d10..3c96455d 100644
--- a/src/__tests__/builtin-skills.test.ts
+++ b/src/__tests__/builtin-skills.test.ts
@@ -21,10 +21,9 @@ describe("built-in skill library", () => {
"git-workflow",
"security-auditor",
"build-verify",
- "cmake-workflow",
];
- it("loads all 7 built-in skills", () => {
+ it("loads all 6 built-in skills", () => {
const names = registry.list().map((s) => s.name);
for (const name of BUILTIN_NAMES) {
expect(names).toContain(name);
diff --git a/src/__tests__/fixtures/workspace-cargo/tests/.keep b/src/__tests__/fixtures/workspace-cargo/tests/.keep
new file mode 100644
index 00000000..e69de29b
diff --git a/src/__tests__/fixtures/workspace-cmake/tests/.keep b/src/__tests__/fixtures/workspace-cmake/tests/.keep
new file mode 100644
index 00000000..e69de29b
diff --git a/src/__tests__/fixtures/workspace-gradle-kotlin/src/test/.keep b/src/__tests__/fixtures/workspace-gradle-kotlin/src/test/.keep
new file mode 100644
index 00000000..e69de29b
diff --git a/src/__tests__/fixtures/workspace-gradle/src/test/.keep b/src/__tests__/fixtures/workspace-gradle/src/test/.keep
new file mode 100644
index 00000000..e69de29b
diff --git a/src/__tests__/fixtures/workspace-maven/src/test/.keep b/src/__tests__/fixtures/workspace-maven/src/test/.keep
new file mode 100644
index 00000000..e69de29b
diff --git a/src/skills/builtin/cmake-workflow.skill.md b/src/skills/builtin/cmake-workflow.skill.md
deleted file mode 100644
index 856f0366..00000000
--- a/src/skills/builtin/cmake-workflow.skill.md
+++ /dev/null
@@ -1,82 +0,0 @@
----
-name: cmake-workflow
-description: CMake-specific build, test, and packaging patterns including preset-based workflows
-version: 1.0.0
-slot: section
----
-
-## CMake Workflow Guidelines
-
-### Project layout conventions
-
-- Source lives in `src/`; headers in `include/`; tests in `tests/` or `test/`.
-- Out-of-source builds go in `build/` (excluded from version control via `.gitignore`).
-- `CMakeLists.txt` at the repository root is the entry point; each subdirectory may have its own `CMakeLists.txt`.
-
-### Preset-based workflow (preferred when `CMakePresets.json` exists)
-
-```bash
-# Configure
-cmake --preset <preset-name>          # e.g. linux-release, debug, ci
-
-# Build
-cmake --build --preset <preset-name> [--parallel $(nproc)]
-
-# Test
-ctest --preset <preset-name> [--output-on-failure]
-```
-
-List available presets:
-```bash
-cmake --list-presets # configure presets
-cmake --build --list-presets # build presets
-ctest --list-presets # test presets
-```
-
-### Classic out-of-source workflow (no presets)
-
-```bash
-# Configure (Release build, Ninja generator recommended)
-cmake -S . -B build -G Ninja -DCMAKE_BUILD_TYPE=Release
-
-# Build (parallel)
-cmake --build build --parallel $(nproc)
-
-# Test
-cd build && ctest --output-on-failure
-```
-
-### Dependency management
-
-- **Submodules**: always run `git submodule update --init --recursive` before configuring.
-- **find_package**: ensure system libraries are installed (e.g. `sudo apt install libssl-dev`).
-- **FetchContent / CPM.cmake**: dependencies are downloaded during configure; verify internet access or a local cache is available.
-- **vcpkg / Conan**: run `vcpkg install` or `conan install .` before `cmake -S . -B build`.
-
-### Install-step dependencies pattern
-
-When a project ships a dependency-installation script (e.g. `scripts/install-linux-deps.sh`), run it *before* the CMake configure step:
-
-```bash
-sudo bash scripts/install-linux-deps.sh
-git submodule update --init --recursive
-cmake --preset <preset-name>
-cmake --build --preset <preset-name> --parallel $(nproc)
-```
-
-### Common CMake variables
-
-| Variable | Purpose |
-|---|---|
-| `CMAKE_BUILD_TYPE` | `Debug`, `Release`, `RelWithDebInfo`, `MinSizeRel` |
-| `CMAKE_INSTALL_PREFIX` | Install destination (default `/usr/local`) |
-| `CMAKE_TOOLCHAIN_FILE` | Cross-compile or vcpkg toolchain |
-| `BUILD_SHARED_LIBS` | `ON` to build shared libraries by default |
-| `CMAKE_EXPORT_COMPILE_COMMANDS` | `ON` to generate `compile_commands.json` for tooling |
-
-### Diagnosing build failures
-
-1. Check the **configure step** output first — missing dependencies abort here.
-2. Look for the **first** error in compiler output; subsequent errors are often cascading.
-3. Enable verbose output to see exact compiler flags: `cmake --build build --verbose` or `VERBOSE=1 make`.
-4. Use `--fresh` flag to force a clean reconfigure: `cmake --fresh --preset <preset-name>`.