
RFC: self-improving skills via hindsight-skill-builder meta-skill #960

Open
nicoloboschi wants to merge 13 commits into main from rfc/self-improving-skills

Conversation

@nicoloboschi
Collaborator

Summary

  • Adds skills/hindsight-skill-builder/SKILL.md — a conversational meta-skill that builds Hindsight-backed self-improving skills through a 4-step, approval-gated flow. It lives alongside the existing hindsight-local / hindsight-cloud / etc. siblings and installs the same way.
  • Adds two end-to-end demos under hindsight-dev/demos/ proving the pattern works on stock Hindsight with no schema changes and no Git integration.

The pattern

A "self-improving skill" is a thin SKILL.md whose body is curl + fallback-to-asking. The curl fetches live instructions from a Hindsight mental model, which auto-refreshes after each consolidation cycle. The skill file itself never changes — the mental model content evolves from user feedback during normal conversation, via the existing retain → consolidate → refresh pipeline.

The meta-skill walks the agent through the setup conversationally with explicit user approval at every decision (Hindsight URL, bank name, missions, mental model spec, target path).

Two demos included

  • hindsight-dev/demos/hermes-marketing/ — V1, hand-crafted bank + missions + manual API calls, to prove the pattern mechanically.
  • hindsight-dev/demos/hermes-marketing-v2/ — V2, same end-to-end run but setup happens entirely through the new meta-skill. One user sentence + four approvals = full skill configured, then 7 turns of marketing-post feedback evolves the mental model from empty to a ~5000-char structured style guide. The SKILL.md file's MD5 is identical at start and end of the run — only the curled mental model content changes.
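
The MD5-invariance claim reduces to a plain hash comparison; a self-contained sketch (a temp file stands in for the real SKILL.md):

```shell
# Verify the skill file is byte-identical before and after a run.
f="$(mktemp)"
printf 'skill body\n' > "$f"

before=$(md5sum "$f" | awk '{print $1}')
# ... the 7 feedback turns would run here; they touch only the mental model ...
after=$(md5sum "$f" | awk '{print $1}')

[ "$before" = "$after" ] && echo "SKILL.md unchanged"
rm -f "$f"
```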

Known bugs the demos hit (not fixed in this PR)

Documented in each demo's README, ranked by severity:

  1. hindsight_api/config.py:18 calls load_dotenv(find_dotenv(usecwd=True), override=True), which walks upward from the daemon's cwd and silently overrides explicit env vars with whatever .env file it finds. Any sibling worktree with a dev .env breaks daemon startup. This cost ~30 min to diagnose. Fix: use override=False or scope the search to ~/.hindsight/.
  2. consolidator.py:_trigger_mental_model_refreshes tag gating — mental models with tags never auto-refresh when consolidated memories are untagged (the default for hermes auto-retain). The failure is silent. Fix: at minimum log a warning when an auto-refresh MM is skipped due to tag mismatch; better, reconsider the security model.
  3. hermes plugin bank_mission / bank_retain_mission dead fields — the plugin reads these from ~/.hermes/hindsight/config.json but never sends them to the Hindsight API. Fix: call update_bank_config on init.
  4. Port confusion: hindsight-docs/docs-integrations/hermes.md:107 says apiPort: 9077, the plugin source defaults to 8888, and the hermes profile registers as :9177. Three numbers in three places.

Test plan

  • Pull this branch, review skills/hindsight-skill-builder/SKILL.md
  • Read hindsight-dev/demos/hermes-marketing-v2/README.md for the full walkthrough
  • Reproduce the V2 demo locally by installing the skill into a harness (instructions in the comment below)
  • Review agent-authored missions/source_query quality against the V1 hand-crafted versions
  • Flag any rough edges in the meta-skill instructions, especially the approval-protocol tone

Not in scope for this PR

  • Push-model alternative (Hindsight rewriting SKILL.md directly instead of curl at runtime) — follow-up RFC
  • Skill registry / catalog
  • Fixes for the four bugs above
  • Harnesses other than hermes (pattern should work for claude-code/codex/cursor; untested)

…on dep

local-llm (llama-cpp-python) requires C++ compilation and is only needed
for the built-in llamacpp provider. Keep it as a separate opt-in:

    pip install 'hindsight-api-slim[local-llm]'

This also allows `pip install 'hindsight-all[local-llm]'` to pull in
built-in llamacpp support.
A conversational meta-skill that walks an agent through creating a new
Hindsight-backed self-improving skill: locate the Hindsight instance,
select or create a bank with tuned retain/observations/reflect missions,
create a binding mental model with refresh_after_consolidation, and write
the new SKILL.md into the harness skills directory. Human-in-the-loop
approval is required at every decision. Includes a worked example,
mission-writing principles, counter-examples, and inline warnings for the
known tag-gated auto-refresh and cwd-contaminated daemon startup gotchas.

Installed the same way as the existing hindsight-local / hindsight-cloud
siblings. Works on any harness that loads SKILL.md files.
Two runs of the same marketing-writer demo to prove the self-improving
skill pattern works on stock Hindsight today:

- hermes-marketing/   V1, manual setup (hand-crafted bank missions, manual
                      API calls, hermes authored the SKILL.md on request)
- hermes-marketing-v2/ V2, setup via the new hindsight-skill-builder
                      meta-skill (one user sentence + four approvals)

Both runs follow the same 9-turn marketing conversation and produce a
mental model that grows from empty to ~5000 chars of structured style
guide. The SKILL.md file itself never changes — MD5 verified. Rules
extracted across four feedback batches (tone/audience/length, signature
placement/metric, no-just/milestone-pattern/community-actions, date-format/
speaker-names/digits) all make it into the final mental model content.

README files in each dir document the bugs hit during the runs
(find_dotenv cwd contamination, tag-gated auto-refresh, dead bank
missions in hermes plugin config).
Step 1 now treats cloud, self-hosted, and embedded as first-class
deployment modes. The meta-skill:

- probes ALL candidates (env vars, ~/.hindsight/config, ~/.hermes,
  ~/.claude, ~/.hindsight/codex.json, well-known local ports, cloud
  default) instead of stopping at the first hit
- performs authenticated /health checks when an API key is available
- classifies each candidate as embedded / self-hosted / cloud from the
  hostname and auth status
- proposes every healthy candidate and asks the user to choose when
  more than one is available (never silently prefers)
- never prints the API key value

Steps 2 and 3 now thread an `$AUTH` header through every API call so
bank and mental-model operations work against cloud and authenticated
self-hosted instances, not just the embedded daemon.

Step 4 ships two SKILL.md templates:

- Template A for embedded / no-auth instances (plain curl)
- Template B for cloud / self-hosted with auth, which reads
  HINDSIGHT_API_KEY from the environment at runtime, fails loudly
  when the env var is missing, and treats 401/403 as an auth failure
  rather than an empty mental model
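
Template B's two guard rails can be sketched like this (the key value and the status-classification helper are illustrative, not the shipped template):

```shell
# Fail loudly when the key is missing; never treat 401/403 as "empty".
HINDSIGHT_API_KEY="${HINDSIGHT_API_KEY:-demo-key}"   # placeholder so the sketch runs
: "${HINDSIGHT_API_KEY:?HINDSIGHT_API_KEY is not set - refusing to run}"

classify_status() {
  case "$1" in
    401|403) echo "auth-failure" ;;   # wrong/missing key, not an empty mental model
    2??)     echo "ok" ;;
    *)       echo "error" ;;
  esac
}

classify_status 403   # auth-failure
classify_status 200   # ok
```

The `:?` expansion aborts with a message if the variable is unset, so a missing key can never be confused with missing guidance.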

Added an explicit rule that the generated SKILL.md must reference
$HINDSIGHT_API_KEY as a shell variable — never the literal key value —
and a counter-example documenting the hardcoded-key failure mode.

New gotchas section entries cover auth secrets, the cloud tenant path
segment, and mixed-mode deployments where the user has both a local
daemon and a cloud instance.

Smoke-tested: with HINDSIGHT_API_KEY set, the meta-skill probes
5 candidates, finds two healthy (localhost:9177 + cloud), classifies
each correctly, and asks the user to choose.
…body

Template A (embedded) and Template B (cloud/self-hosted) were
functionally identical except for the curl line. Replaced both with a
single SKILL.md template whose curl uses bash parameter expansion
(${HINDSIGHT_API_KEY:+...}) to auto-include the Authorization header
only when the env var is set. The same rendered file is correct for
embedded (no key), self-hosted (optional key), and cloud (required key)
— no branching at template-render time.

Error handling is now unified: the curl captures the HTTP status code
with -w '%{http_code}' and the skill body treats any non-2xx as a hard
failure (stop, report, do not fall through to empty-guidance handling).
401/403/404/5xx each get a specific user-facing message but the
skeleton is one path.
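
The expansion trick in isolation (URLs are placeholders; only the `${VAR:+...}` behavior is the point — the real template also captures the status code):

```shell
# ${HINDSIGHT_API_KEY:+...} expands to the header args only when the
# variable is set and non-empty, so one rendered curl line serves
# embedded (no key), self-hosted (optional key), and cloud (required).
build_curl_cmd() {
  echo curl -s -w '%{http_code}' \
    ${HINDSIGHT_API_KEY:+-H "Authorization: Bearer $HINDSIGHT_API_KEY"} \
    "$1"
}

HINDSIGHT_API_KEY=""
build_curl_cmd "http://127.0.0.1:9/mm"       # no auth header emitted

HINDSIGHT_API_KEY="tok123"
build_curl_cmd "https://example.invalid/mm"  # header included
```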

Net result: 448 lines → 411 lines, one workflow, one learning-loop
section, one thing to maintain.
@fabioscarsi
Contributor

Smart! Interesting!

The meta-skill and every generated skill now use the hindsight CLI,
not raw curl. The runtime of a generated skill is two lines:

    source ~/.hindsight/learning-skill.env
    hindsight mental-model get <bank> <mm_id> --output json

No more ${HINDSIGHT_API_KEY:+...} parameter expansion, no more manual
HTTP status code capture, no more 401/403/404 branching — the CLI
handles URL resolution, auth, tenant path, and error reporting. The
CLI exits non-zero with a clean message on failure, which the skill
body treats as a hard error (distinct from "empty guidance").

Key storage: ~/.hindsight/learning-skill.env, chmod 600, sourceable.
Single file per installation, written once by the meta-skill during
Step 1. Contains HINDSIGHT_API_URL and HINDSIGHT_API_KEY (empty for
embedded mode). The generated skills and the meta-skill itself both
source this file — it is the one and only source of truth for
Hindsight connection details. No config file chains, no env-var
fallback search at runtime.
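
A sketch of the write-once env file and the sourcing runtime (a temp file stands in for `~/.hindsight/learning-skill.env`; values are placeholders):

```shell
ENV_FILE="$(mktemp)"   # stands in for ~/.hindsight/learning-skill.env
cat > "$ENV_FILE" <<'EOF'
HINDSIGHT_API_URL=http://127.0.0.1:9177
HINDSIGHT_API_KEY=
EOF
chmod 600 "$ENV_FILE"

# Generated-skill runtime: source, then call the CLI (the CLI call is
# commented out here since no daemon runs in this sketch).
. "$ENV_FILE"
# hindsight mental-model get <bank> <mm_id> --output json
echo "url=$HINDSIGHT_API_URL key-set=${HINDSIGHT_API_KEY:+yes}"
rm -f "$ENV_FILE"
```

An empty `HINDSIGHT_API_KEY=` line marks embedded mode; the same file shape works for all deployments.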

Step 1 now has three sub-steps:
1. Probe + classify candidates (unchanged — URL discovery still works)
2. Verify `hindsight` CLI is installed; install it via get-cli script
   with the user's explicit approval if missing
3. Write the shared env file at ~/.hindsight/learning-skill.env with
   chmod 600, verify with `hindsight health`

Steps 2 and 3 are rewritten to use CLI commands:
- `hindsight bank list --output json`
- `hindsight bank config <id> --output json`
- `hindsight bank create <id> --name ...`
- `hindsight bank set-config <id> --retain-mission ... --observations-mission ... --reflect-mission ...`
- `hindsight mental-model create <bank> <name> <source_query> --id <id>`
- `hindsight mental-model get <bank> <id> --output json`

CLI gap (to be fixed upstream): `hindsight mental-model create` and
`hindsight mental-model update` do not currently support
refresh_after_consolidation or tags. Step 3 falls back to one raw
curl PATCH immediately after the CLI create to set the trigger and
empty tags. This is the ONLY raw curl call in the entire flow and is
explicitly marked as temporary — remove when the CLI is fixed.
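
A dry-run sketch of that single fallback call (the endpoint path is a placeholder, not taken from this PR; the auth header is omitted as in embedded mode, and the command is built as a string so the sketch runs without a daemon):

```shell
HINDSIGHT_API_URL="${HINDSIGHT_API_URL:-http://127.0.0.1:9177}"

# The trigger + empty tags the CLI cannot yet set:
payload='{"refresh_after_consolidation": true, "tags": []}'

# PLACEHOLDER path below is illustrative only.
cmd="curl -sf -X PATCH -H 'Content-Type: application/json' -d '$payload' $HINDSIGHT_API_URL/PLACEHOLDER/mental-model-path"
echo "$cmd"
```

Being a single, clearly marked call makes it easy to delete once the CLI gains the flags.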

Step 4 template is dramatically simpler: no HTTP code capture, no
auth header logic, just source + CLI call + JSON parse + error
handling + fallback.

Counter-examples updated: the "bad auth handling" entry now targets
"embedding the literal key in the skill file" (still applies) and a
new entry targets "reverting to raw curl for simplicity" (don't —
the CLI exists for a reason).

Gotchas updated: added "CLI required", "CLI gap: mental-model
trigger" documenting the PATCH fallback, "env file is single source
of truth", and "env file permissions".

Smoke test (end to end, clean state):
1. Deleted prior state, verified fresh daemon and empty bank list
2. Asked the meta-skill to install a customer-support skill
3. Meta-skill probed URLs, verified CLI present, proposed writing
   env file, asked for approval, wrote with chmod 600
4. Proposed a new customer-support bank with three quality missions
   (specific categories, negative constraints, conflict resolution,
   grouping axes); approved
5. Ran hindsight bank create + set-config; verified three missions
   persisted via hindsight bank config --output json
6. Proposed mental model spec with tags=none citing the refresh
   trigger gotcha; approved
7. Ran hindsight mental-model create, then curl PATCH fallback, then
   hindsight mental-model get to verify trigger and tags
8. Proposed description + target path + rendered SKILL.md body;
   approved; wrote the file
9. Fired one support request against the generated skill: it sourced
   the env file, called hindsight mental-model get, saw the empty MM,
   and correctly fell back to asking the user for support-specific
   direction (tone, refund stance, escalation level, length)

All six validation points passed. Agent-authored mission quality is
as good as or better than the hand-crafted versions, same as before.
Step 1 no longer searches for existing Hindsight instances across
~/.hindsight/config, harness config files, or well-known local ports.
It asks the user which of three deployment modes they want (cloud /
embedded / self-hosted API) and sets up only that mode. Explicit,
predictable, and gives the user control over which instance the new
skill will bind to.

The only auto-detection allowed:
1. The meta-skill's own state file at ~/.hindsight/learning-skill.env
   (memory of a previous run, not user config)
2. HINDSIGHT_API_KEY env var — ONLY for Cloud mode, as a convenience
   fallback. Self-hosted and embedded modes must have values entered
   by the user explicitly.
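
The mode-gated fallback reduces to a small rule, sketched here (the function name and values are illustrative):

```shell
# resolve_key: use an explicitly entered key when given; otherwise fall
# back to $HINDSIGHT_API_KEY, but ONLY in cloud mode.
resolve_key() {
  mode="$1" entered="$2"
  if [ -n "$entered" ]; then
    echo "$entered"
  elif [ "$mode" = "cloud" ] && [ -n "${HINDSIGHT_API_KEY:-}" ]; then
    echo "$HINDSIGHT_API_KEY"
  else
    echo ""   # self-hosted / embedded: the user must type it
  fi
}

HINDSIGHT_API_KEY="env-key"
resolve_key cloud ""        # falls back to env-key
resolve_key self-hosted ""  # empty: no env fallback outside cloud mode
```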

Mode 1 (Cloud): URL is fixed at https://api.hindsight.vectorize.io.
Prompts for API key (env fallback allowed here), verifies with
`hindsight health`.

Mode 2 (Embedded): spins up a dedicated hindsight-embed profile named
`learning-skill` (isolated from any existing hermes / claude-code /
codex profile), prompts for LLM provider + key, runs
`hindsight-embed profile create --env ...` and
`hindsight-embed -p learning-skill daemon start`, reads the allocated
port from `profile show --output json`, verifies with hindsight health.

Mode 3 (Self-hosted API): prompts for URL and optional API key,
explicitly forbids reading from env vars or config files. Verifies
with hindsight health.

Gotchas section updated:
- "No auto-discovery" replaces the old "mixed-mode deployments" entry
- "Embedded profile is dedicated" documents the learning-skill profile
  isolation
- Removed the "Hindsight URLs are not always localhost:8888" entry,
  which was specific to the probe-based flow

Smoke tested end-to-end:
1. Deleted ~/.hindsight/learning-skill.env and any learning-skill
   embedded profile for a fresh state
2. Asked meta-skill to install an api-docs skill; agent went straight
   to the 3-mode wizard prompt with no probing
3. Picked embedded + gemini; agent proposed port 9188 (after port
   probe), created the learning-skill profile, started the daemon
   (took ~2 minutes on first run due to uv package install), verified
   health
4. Approved env file write; ~/.hindsight/learning-skill.env has
   URL=http://127.0.0.1:9188 and empty key, chmod 600
5. Verified end state: dedicated daemon on :9188 running alongside
   hermes daemon on :9177 with no conflict, no existing config files
   were modified, hindsight health returns healthy

Ergonomic note for follow-up: ~/.hindsight/profiles/learning-skill.env
(hindsight-embed profile) and ~/.hindsight/learning-skill.env (shared
skill env) have very similar names and could be confused. Consider
renaming one in a follow-up commit.
…ssion

Previously the setup had two separate approval gates: one for the
bank + three missions, a second for the mental model. That's now a
single Step 2 with ONE combined approval block covering the bank
decision, both missions (when creating a new bank), and the full
mental model spec. After the user approves, the meta-skill executes
every CLI command and the trigger PATCH without further pauses.

Also drop reflect_mission entirely. The mental model's source_query
is enough to steer the reflect output for this pattern, and the
default reflect framing is fine. Bank config is now retain_mission +
observations_mission only. Fewer things to write, fewer things for
the user to review, same behavior.

- Step 2 renamed to "Design and create the bank + mental model (one
  approval)" with an explicit "do not split into two prompts" rule
- Step 3 ("Create the binding mental model") is gone; its content
  lives inside the new Step 2
- What was Step 4 ("Write the SKILL.md file") is renumbered to Step 3
- Cross-references throughout (gotchas, counter-examples) updated to
  point at Step 2 for the curl PATCH fallback and Step 3 for SKILL.md
- Human-in-the-loop protocol section rewritten to list the new
  decision points: wizard choice, mode setup, env file write,
  combined bank+MM (ONE gate), skill file
- Worked example updated to remove the marketing reflect_mission line
- `hindsight bank set-config` no longer passes --reflect-mission

Smoke tested against Hindsight Cloud (the test env happened to have
HINDSIGHT_API_KEY set in the shell, so the wizard's one auto-fallback
for cloud mode kicked in automatically). Verified end state:

- api-documentation bank exists with retain_mission and
  observations_mission set, reflect_mission is None
- api-documentation-writer mental model exists with
  refresh_after_consolidation: true and empty tags
- ~/.hermes/skills/api-documentation-writer/SKILL.md written

Note on smoke-test UX: hermes's built-in `clarify` tool times out
after 120s in -Q non-interactive mode (it's designed for human
input), so the "one approval" gate can't be fully exercised via
scripted turns. In a real interactive hermes session the combined
block would display and wait for user input. The meta-skill
instructions are correct; the non-interactive harness can't model
them cleanly, but the end state is identical either way.
…ew banks

This is the architectural fix for a critical flaw: previous versions
of the meta-skill created a new dedicated bank per skill, but harness
auto-retain hooks only write to ONE bank per session (the one
configured in the plugin). Any dedicated bank the meta-skill created
would never receive user feedback, leaving the mental model empty
forever and the learning loop broken.

The fix flips the architecture: the meta-skill now READS the harness
plugin's config to extract URL, API key, and bank id, and binds every
new mental model to that bank. Every self-improving skill on the
machine shares the harness plugin's bank, scoped only by its own
mental model source_query.

Step 1 — Read harness plugin config
- Detect installed plugin in this order: hermes, claude-code, codex,
  openclaw
- Read URL, bank_id, key from the plugin's config file
- For hermes local_embedded: derive URL from `hindsight-embed -p hermes
  profile show` or default localhost:9177
- For openclaw: secrets are vault-encrypted, fall back to asking the
  user manually after detecting the plugin directory
- If no plugin detected, stop and tell user to install one first
- Verify with `hindsight health`
- Write env file with URL, KEY, BANK_ID (chmod 600)
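
The detection order can be sketched as a first-match loop (only the hermes path is named in this PR; the helper and the demo filenames are illustrative):

```shell
# first_existing: print the first path that exists, in priority order.
first_existing() {
  for f in "$@"; do
    [ -f "$f" ] && { echo "$f"; return 0; }
  done
  return 1
}

# Demo with temp files standing in for plugin configs.
dir="$(mktemp -d)"
touch "$dir/codex.json" "$dir/openclaw.json"   # hermes / claude-code absent

first_existing \
  "$dir/hermes-config.json" \
  "$dir/claude-code.json" \
  "$dir/codex.json" \
  "$dir/openclaw.json"
# prints the codex path: the first that exists in priority order
rm -rf "$dir"
```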

Step 2 — Create the mental model only
- No bank create, no set-config, no missions (retain / observations /
  reflect are all gone)
- Single approval for the mental model spec
- Run `hindsight mental-model create "$HINDSIGHT_BANK_ID" ...`
- curl PATCH for refresh trigger (CLI gap, unchanged)
- Verify

Step 3 — Write the SKILL.md (renumbered from old Step 4)
- Generated skill uses `"$HINDSIGHT_BANK_ID"` from the sourced env file
- If the harness plugin's bank changes later, re-running the
  meta-skill regenerates the env file and ALL existing generated
  skills follow automatically
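
Sketch of that re-pointing behavior: the generated skill reads the bank id at runtime, so regenerating the env file moves every skill at once (ids are placeholders):

```shell
ENV="$(mktemp)"   # stands in for ~/.hindsight/learning-skill.env
printf 'HINDSIGHT_API_URL=http://127.0.0.1:9177\nHINDSIGHT_API_KEY=\nHINDSIGHT_BANK_ID=hermes\n' > "$ENV"

. "$ENV"
# hindsight mental-model get "$HINDSIGHT_BANK_ID" x-post-writer --output json
echo "bank=$HINDSIGHT_BANK_ID"

# A meta-skill re-run rewrites the env file; every generated skill
# picks up the new bank on its next invocation.
printf 'HINDSIGHT_API_URL=http://127.0.0.1:9177\nHINDSIGHT_API_KEY=\nHINDSIGHT_BANK_ID=new-bank\n' > "$ENV"
. "$ENV"
echo "bank=$HINDSIGHT_BANK_ID"
rm -f "$ENV"
```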

Removed:
- The wizard prompt (cloud / embedded / self-hosted) — the harness
  plugin already made that choice
- Wizard branches: Mode 1 / Mode 2 / Mode 3
- The "create a new bank" path entirely
- Bank mission design guidance (retain_mission, observations_mission)
- The dedicated `learning-skill` embedded profile (not needed, the
  harness's daemon is the source)
- "No auto-discovery" gotcha (replaced with "Harness plugin must be
  installed first")
- "Embedded profile is dedicated" gotcha (no longer applicable)

Updated:
- Worked example: dropped the retain/observations missions, the bank
  is whatever the plugin uses
- Counter-examples: dropped bad-retain-mission and
  bad-observations-mission (we don't author them anymore); added
  "bad bank creation" explaining why the previous architecture broke
- Gotchas: added "harness plugin must be installed first", "one bank
  per harness shared across skills", "openclaw secrets are
  vault-encrypted"
- Human-in-the-loop protocol: 5 decision points → 4 (no more wizard
  choice or mode setup)
- Worked example narrative: explains the previous tuned-bank
  prototype as a known-broken approach we abandoned

Smoke test (clean state):
1. Deleted ~/.hindsight/learning-skill.env, set hermes config to
   bank_id=hermes (a bank that exists on the daemon)
2. Started hermes daemon at :9177
3. Asked meta-skill to install an x-post-writer skill
4. Step 1: agent ran a python one-liner to detect harness plugin
   configs in order, found hermes config first, parsed JSON,
   recognized mode=local_embedded → URL=localhost:9177, extracted
   bank_id=hermes, verified with `hindsight health` → healthy,
   paused for approval
5. User approved → wrote env file with URL, KEY (empty), BANK_ID
   (hermes), chmod 600
6. Step 2: agent proposed MM spec with bank=hermes (locked from env
   file), id=x-post-writer, source_query, refresh+no tags. Approved.
   Ran `hindsight mental-model create hermes "X Post Writer
   Guidelines" "..." --id x-post-writer`, then curl PATCH for
   trigger, then verified with `hindsight mental-model get hermes
   x-post-writer --output json`
7. Verified end state: MM exists on the `hermes` bank,
   refresh_after_consolidation=true, tags=None, no new bank was
   created (still just `customer-support` and `hermes`)
8. Step 3: rendered SKILL.md uses `"$HINDSIGHT_BANK_ID"` from the
   sourced env file (not a hardcoded literal), so future harness
   bank changes propagate automatically

Net file size: 536 → 457 lines (-79). Less code, fewer concepts,
correct architecture.
The demos were useful as working evidence during development but they
don't belong on the rfc branch — they're a mix of raw JSON responses,
snapshots, and transcripts that are better kept locally (or in a
separate PR) than shipped as part of the meta-skill release.

The meta-skill itself (skills/hindsight-skill-builder/SKILL.md) is
unchanged by this commit.
Introduces a Python CLI (`scripts/hindsight-agent`) that provisions a
hermes profile as a self-learning agent in one command:

    hindsight-agent install marketing

The CLI runs a wizard (cloud / embedded / self-hosted), creates a
hermes profile cloned from the user's active profile, writes a
Hindsight plugin config.json inside the new profile, copies the
hindsight-skill-builder meta-skill + the template's seed skill into
the profile's skills dir, and drives hermes through the seed skill
to install each self-learning skill declared in agent.yaml.

Template layout (under `agents/<template>/`):
- agent.yaml: profile name, list of self-learning skills with fixed
  id / description / source_query per skill
- seed.md: SKILL.md body the CLI drops into the profile as
  `<template>-agent-seed`. Tells hermes exactly which self-learning
  skills to install with the skill-builder and forbids re-designing
  the parameters.
- skills/ (optional): static SKILL.md files to ship alongside.

V1 ships `agents/marketing/` with one self-learning skill
(marketing-post-writer) that evolves from user feedback via the
meta-skill we already built.

Two meta-skill fixes required to make this work with named hermes
profiles (both hit during the smoke test):

1. Step 1 now resolves the hermes plugin config via `$HERMES_HOME`
   instead of the hardcoded `~/.hermes/hindsight/config.json`. Hermes
   exports `HERMES_HOME=~/.hermes/profiles/<name>` when running under
   a named profile, and the meta-skill was reading the wrong file
   (defaulting to the main profile) until this was threaded through.

2. Step 2 now explicitly calls `hindsight bank create
   "$HINDSIGHT_BANK_ID" --name ...` before `mental-model create`. On
   a fresh profile the target bank doesn't exist yet (no retain has
   happened), so `mental-model create` used to fail and the agent
   would silently bind the skill to a different bank that did exist.
   Added a hard rule against that fallback.

3. Step 3 (skill file location) now resolves the target skills dir
   via `$HERMES_HOME/skills/` so a named profile's skill lands in
   its own skills dir, not the default profile's.

Smoke test (clean slate, run three times to shake out the bugs
above, all three passing on the final run):

- `printf 'smoke-marketing\n2\n3\ny\ny\n' | hindsight-agent install marketing`
- Wizard accepts the defaults (embedded / gemini / use env key)
- Creates hermes profile `smoke-marketing` cloned from active
- Writes ~/.hermes/profiles/smoke-marketing/hindsight/config.json
  with bank_id=smoke-marketing, llm_provider=gemini, real key
- Copies hindsight-skill-builder and marketing-agent-seed skills
- Runs `hermes --profile smoke-marketing chat -Q --yolo --skills
  marketing-agent-seed,hindsight-skill-builder -q "Set up this
  agent..."` which drives the skill-builder end-to-end
- End state:
  - Bank `smoke-marketing` created on the daemon (first time the
    `bank create` fallback mattered)
  - Mental model `marketing-post-writer` on bank `smoke-marketing`
    with refresh_after_consolidation=true and empty tags
  - Env file at ~/.hindsight/learning-skill.env with HINDSIGHT_BANK_ID
    pointing at smoke-marketing, chmod 600
  - SKILL.md at ~/.hermes/profiles/smoke-marketing/skills/
    marketing-post-writer/SKILL.md (profile-specific path, NOT the
    default profile's skills dir)
- Acceptance: `hermes --profile smoke-marketing chat -q "Write me a
  LinkedIn post announcing we just hired our first Head of Data"`
  triggered the marketing-post-writer skill, which sourced the env
  file, called `hindsight mental-model get`, saw the empty content,
  and correctly asked the user for tone/style before drafting.

Known PoC caveats:
- Non-idempotent: re-running install on an existing profile doesn't
  clean up properly. Delete and retry for now.
- No openclaw support yet — the CLI only knows how to drive hermes.
- Only one template (marketing) with one self-learning skill.
- The seed skill uses hermes's clarify tool in non-interactive mode,
  which times out after 120s per gate; the agent auto-decides and
  moves on. Adds ~60-120s to the run but doesn't affect correctness.