Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 10 additions & 0 deletions .claude/settings.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
{
"hooks": {
"PreToolUse": [
{
"matcher": "Bash",
"command": "scripts/hooks/pr-review-gate.sh"
}
]
}
}
14 changes: 8 additions & 6 deletions .github/workflows/amber-issue-handler.yml
Original file line number Diff line number Diff line change
Expand Up @@ -104,9 +104,10 @@ jobs:
---
🤖 [Ambient Session]($PLATFORM_HOST/projects/$AGENTIC_SESSION_NAMESPACE/sessions/$AGENTIC_SESSION_NAME)
6. Add the `ambient-code:managed` label to the PR.
7. Ensure CI passes. If it fails, investigate and fix.
8. Do not merge. Leave the PR open for human review.
9. When you comment on the PR, include this footer at the end:
7. Add the `ambient-code:self-reviewed` label to the PR.
8. Ensure CI passes. If it fails, investigate and fix.
9. Do not merge. Leave the PR open for human review.
10. When you comment on the PR, include this footer at the end:
Comment on lines +107 to +110
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

The new self-reviewed label step does not unlock the merge path this PR describes.

These prompt updates tell the agent to add ambient-code:self-reviewed, but .mergify.yml:1-44 still only queues PRs with approved reviews and never checks that label. So this change will not let self-reviewed PRs merge without human approval.

Also applies to: 303-306

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/amber-issue-handler.yml around lines 107 - 110, The
workflow adds the ambient-code:self-reviewed label but .mergify.yml still
requires human approvals, so update .mergify.yml to recognize that label as an
alternative merge condition: modify the relevant rule(s) (the
approval/required_conditions block in the top rule set) to allow merge when
label:ambient-code:self-reviewed is present (or add a new rule that permits
merging when that label exists), and mirror that change for the other affected
rule(s) referenced in lines 303-306 so the workflow text and mergify policy are
consistent.

_🤖 [Session]($PLATFORM_HOST/projects/$AGENTIC_SESSION_NAMESPACE/sessions/$AGENTIC_SESSION_NAME)_
repos: >-
[{"url": "https://github.com/${{ github.repository }}", "branch": "main"}]
Expand Down Expand Up @@ -299,9 +300,10 @@ jobs:
---
🤖 [Ambient Session]($PLATFORM_HOST/projects/$AGENTIC_SESSION_NAMESPACE/sessions/$AGENTIC_SESSION_NAME)
6. Add the `ambient-code:managed` label to the PR.
7. Ensure CI passes. If it fails, investigate and fix.
8. Do not merge. Leave the PR open for human review.
9. When you comment on the PR, include this footer at the end:
7. Add the `ambient-code:self-reviewed` label to the PR.
8. Ensure CI passes. If it fails, investigate and fix.
9. Do not merge. Leave the PR open for human review.
10. When you comment on the PR, include this footer at the end:
_🤖 [Session]($PLATFORM_HOST/projects/$AGENTIC_SESSION_NAMESPACE/sessions/$AGENTIC_SESSION_NAME)_
repos: >-
[{"url": "https://github.com/${{ github.repository }}", "branch": "main"}]
Expand Down
4 changes: 4 additions & 0 deletions BOOKMARKS.md
Original file line number Diff line number Diff line change
Expand Up @@ -43,6 +43,10 @@ Frontend technology stack decisions.

Runner SDK design and architecture.

### [ADR-0008: Automate Code Reviews](docs/internal/adr/0008-automate-code-reviews.md)

Automated inner-loop review replaces human code review. PRs with `ambient-code:self-reviewed` label merge via Mergify without human approval.

---

## Development Context
Expand Down
15 changes: 15 additions & 0 deletions CLAUDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -139,6 +139,21 @@ git push --no-verify # Skip pre-push hooks
- `tsc --noEmit` and `npm run build` are **not** included (slow; CI gates on them)
- Branch/push protection scripts remain in `scripts/git-hooks/` and are invoked by pre-commit

## PR Review Gate

Before running `gh pr create`, agents MUST self-review their changes:

1. Review the diff against conventions in this file and [BOOKMARKS.md](BOOKMARKS.md)
2. Verify the changes follow patterns documented in `.claude/patterns/` and `.claude/context/`
3. Check that no `panic()` calls exist in production Go code (use `fmt.Errorf`)
4. Check that no `any` types exist in frontend TypeScript (use proper types, `unknown`, or generics)
5. Ensure all new API endpoints have corresponding frontend proxy routes
6. Verify owner references on any new K8s child resources

A PreToolUse hook (`scripts/hooks/pr-review-gate.sh`) enforces mechanical checks (lint, format, secrets) and runs `coderabbit review --agent --base main` for AI-powered review (security, performance, K8s safety per `.coderabbit.yaml`). The hook will block `gh pr create` if any check fails. The self-review above covers what the hook cannot — architectural fit, convention adherence, and design quality.

When both the self-review and the hook pass, apply the `ambient-code:self-reviewed` label to the PR if the changes were authored and reviewed without human involvement.

## Testing

- **Frontend unit tests**: `cd components/frontend && npx vitest run --coverage` (466 tests, ~74% coverage). See `components/frontend/README.md`.
Expand Down
168 changes: 168 additions & 0 deletions scripts/hooks/pr-review-gate.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,168 @@
#!/usr/bin/env bash
# pr-review-gate.sh — PreToolUse hook for Bash tool calls.
# Gates `gh pr create` behind mechanical code quality checks and
# CodeRabbit AI review. Stateless: runs checks inline on every attempt.
#
# Exit codes (Claude Code PreToolUse convention):
# 0 = allow the tool call
# 2 = block the tool call (stderr shown to agent as reason)
set -euo pipefail

COMMAND=$(echo "$CLAUDE_TOOL_INPUT" | jq -r '.command // ""')

if ! echo "$COMMAND" | grep -qE '^\s*gh\s+pr\s+create\b'; then
exit 0
fi

echo "PR Review Gate: checking code quality before opening PR..." >&2

REPO_ROOT="$(git rev-parse --show-toplevel)"
BASE_BRANCH="main"
ERRORS=()

block_with_errors() {
echo "" >&2
echo "=================================================" >&2
echo "PR Review Gate: BLOCKED — fix these issues first" >&2
echo "=================================================" >&2
printf '%s\n' "${ERRORS[@]}" >&2
echo "" >&2
echo "Fix these issues and retry gh pr create." >&2
exit 2
}

# Compute diff once — reuse for file list and secret scan
FULL_DIFF=$(git diff "$BASE_BRANCH"...HEAD 2>/dev/null || git diff HEAD~1)
CHANGED_FILES=$(echo "$FULL_DIFF" | grep -E '^diff --git' | sed 's|^diff --git a/.* b/||' || true)

if [ -z "$CHANGED_FILES" ]; then
echo "PR Review Gate: no changed files detected, allowing" >&2
exit 0
fi

# ── Go checks ─────────────────────────────────────────────────────────
GO_FILES=$(echo "$CHANGED_FILES" | grep '\.go$' || true)
if [ -n "$GO_FILES" ]; then
GO_MODULES=$(echo "$GO_FILES" | while read -r f; do
dir=$(dirname "$f")
while [ "$dir" != "." ]; do
if [ -f "$REPO_ROOT/$dir/go.mod" ]; then
echo "$dir"
break
fi
dir=$(dirname "$dir")
done
done | sort -u)

for mod in $GO_MODULES; do
if command -v gofmt &>/dev/null; then
MOD_GO_FILES=$(echo "$GO_FILES" | grep "^$mod/" || true)
if [ -n "$MOD_GO_FILES" ]; then
UNFORMATTED=$(cd "$REPO_ROOT" && echo "$MOD_GO_FILES" | xargs gofmt -l 2>/dev/null || true)
if [ -n "$UNFORMATTED" ]; then
ERRORS+=("gofmt: unformatted files:" "$UNFORMATTED")
fi
fi
fi

if command -v go &>/dev/null; then
VET_OUTPUT=$(cd "$REPO_ROOT/$mod" && go vet ./... 2>&1) || \
ERRORS+=("go vet failed in ${mod}:" "$VET_OUTPUT")
fi
Comment on lines +58 to +71
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n 'command -v gofmt|command -v go\b|node_modules|command -v ruff' scripts/hooks/pr-review-gate.sh

Repository: ambient-code/platform

Length of output: 286


🏁 Script executed:

cat scripts/hooks/pr-review-gate.sh

Repository: ambient-code/platform

Length of output: 6873


Missing toolchains silently skip required checks, degrading the gate.

If go, gofmt, ruff, or frontend dependencies are absent, the hook skips those validations and allows gh pr create anyway (lines 58, 68, 90, 106). In a fresh environment without all toolchains installed, the advertised PR gate degrades to best-effort and code that would fail CI checks can be created locally. The script has no validation or initialization to ensure required tools are available—it silently skips checks if they're missing, unlike CodeRabbit which explicitly logs when unavailable.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/hooks/pr-review-gate.sh` around lines 58 - 71, The script currently
silently skips checks when toolchains are missing (checks that use "command -v
gofmt" and "command -v go"); update the hook to fail fast and surface missing
tool errors by testing for required tools up front (gofmt, go, ruff, frontend
deps) or immediately after each "command -v" check and appending a clear error
to ERRORS (e.g., include the tool name and context) and/or exiting non-zero;
specifically modify the blocks that reference GO_FILES, REPO_ROOT, mod,
UNFORMATTED, VET_OUTPUT, and ERRORS so that when "command -v <tool>" returns
false you add an ERRORS entry like "<tool> not found: required for <mod or repo
check>" (or abort) rather than silently skipping the validation.

done

# Batch panic() check across all non-test Go files
PROD_GO_FILES=$(echo "$GO_FILES" | grep -v '_test\.go$' || true)
if [ -n "$PROD_GO_FILES" ]; then
PANIC_HITS=$(cd "$REPO_ROOT" && echo "$PROD_GO_FILES" | xargs grep -Hn 'panic(' 2>/dev/null \
| grep -v '//.*panic' | grep -v 'nolint' || true)
if [ -n "$PANIC_HITS" ]; then
ERRORS+=("panic() in production code (use fmt.Errorf):" "$PANIC_HITS")
fi
fi
fi

# ── Frontend checks ──────────────────────────────────────────────────
TS_FILES=$(echo "$CHANGED_FILES" | grep -E '^components/frontend/.*\.(ts|tsx|js|jsx)$' || true)
if [ -n "$TS_FILES" ]; then
FRONTEND_DIR="$REPO_ROOT/components/frontend"

if command -v npx &>/dev/null && [ -d "$FRONTEND_DIR/node_modules" ]; then
RELATIVE_FILES=$(echo "$TS_FILES" | sed 's|^components/frontend/||')
ESLINT_OUTPUT=$(cd "$FRONTEND_DIR" && echo "$RELATIVE_FILES" | xargs npx eslint 2>&1) || \
ERRORS+=("eslint failed on frontend files:" "$ESLINT_OUTPUT")
fi

ANY_HITS=$(cd "$REPO_ROOT" && echo "$TS_FILES" | xargs grep -Hn ': any\b\|<any>\|as any\b' 2>/dev/null \
| grep -v '//.*any\|nolint\|eslint-disable' || true)
if [ -n "$ANY_HITS" ]; then
ERRORS+=("'any' type usage in frontend (use proper types, unknown, or generics):" "$ANY_HITS")
fi
fi

# ── Python checks ────────────────────────────────────────────────────
PY_FILES=$(echo "$CHANGED_FILES" | grep -E '^(components/runners/|scripts/).*\.py$' || true)
if [ -n "$PY_FILES" ]; then
if command -v ruff &>/dev/null; then
RUFF_CHECK=$(cd "$REPO_ROOT" && echo "$PY_FILES" | xargs ruff check 2>&1) || \
ERRORS+=("ruff check failed:" "$RUFF_CHECK")
RUFF_FMT=$(cd "$REPO_ROOT" && echo "$PY_FILES" | xargs ruff format --check 2>&1) || \
ERRORS+=("ruff format failed:" "$RUFF_FMT")
fi
fi

# ── Secret detection ─────────────────────────────────────────────────
SECRET_PATTERNS='(PRIVATE[_-]KEY|SECRET[_-]KEY|API[_-]KEY|PASSWORD|TOKEN)\s*[=:]\s*["'"'"'][^\s]+'
SECRET_HITS=$(echo "$FULL_DIFF" | grep '^\+' | grep -iE "$SECRET_PATTERNS" | head -5 || true)
if [ -n "$SECRET_HITS" ]; then
ERRORS+=("Possible secrets in diff:" "$SECRET_HITS")
fi

# Bail early if mechanical checks failed — don't waste CodeRabbit's time
if [ ${#ERRORS[@]} -gt 0 ]; then
block_with_errors
fi

# ── CodeRabbit AI review ────────────────────────────────────────────
# Uses .coderabbit.yaml config (security, performance, K8s safety checks).
# Skips gracefully if CLI or auth is unavailable.
if command -v coderabbit &>/dev/null; then
echo "PR Review Gate: running CodeRabbit review..." >&2

CR_OUTPUT=$(coderabbit review --agent --base "$BASE_BRANCH" 2>&1 || true)

# --agent emits NDJSON (one JSON object per line) plus non-JSON
# lines like "[error] stopping cli". Filter to valid JSON first.
CR_JSON=$(echo "$CR_OUTPUT" | grep -E '^\{' || true)

CR_ERROR_TYPE=$(echo "$CR_JSON" | jq -r 'select(.type == "error") | .errorType' 2>/dev/null || true)

if [ "$CR_ERROR_TYPE" = "rate_limit" ]; then
CR_WAIT=$(echo "$CR_JSON" | jq -r 'select(.type == "error") | .metadata.waitTime // "unknown"' 2>/dev/null || true)
echo "PR Review Gate: CodeRabbit rate-limited (wait: $CR_WAIT)" >&2
elif [ "$CR_ERROR_TYPE" = "auth" ] || echo "$CR_JSON" | jq -e 'select(.type == "error")' &>/dev/null; then
CR_MSG=$(echo "$CR_JSON" | jq -r 'select(.type == "error") | .message // "unknown error"' 2>/dev/null || true)
echo "PR Review Gate: CodeRabbit skipped ($CR_MSG)" >&2
Comment on lines +140 to +145
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

CR_JSON='{"type":"error","errorType":"internal","message":"boom"}'
CR_ERROR_TYPE=$(printf '%s\n' "$CR_JSON" | jq -r 'select(.type == "error") | .errorType')

if [ "$CR_ERROR_TYPE" = "rate_limit" ]; then
  echo rate_limit
elif [ "$CR_ERROR_TYPE" = "auth" ] || printf '%s\n' "$CR_JSON" | jq -e 'select(.type == "error")' >/dev/null; then
  echo skipped
else
  echo blocked
fi

Repository: ambient-code/platform

Length of output: 72


🏁 Script executed:

cat -n scripts/hooks/pr-review-gate.sh | sed -n '130,155p'

Repository: ambient-code/platform

Length of output: 1642


🏁 Script executed:

# Search for error handling context and PR objective mentions
grep -n -B5 -A5 "rate_limit\|auth\|error.*type\|degrade\|graceful" scripts/hooks/pr-review-gate.sh | head -80

Repository: ambient-code/platform

Length of output: 1489


Non-auth errors are silently treated as skipped instead of blocking.

Line 143's || jq -e 'select(.type == "error")' catches all error types, not just auth. The documented intent (line 128) is to skip gracefully only for CLI/auth issues, but internal/config/runtime errors bypass the gate entirely. The pre-execution test confirms: errorType="internal" results in "skipped".

Suggested fix
-    elif [ "$CR_ERROR_TYPE" = "auth" ] || echo "$CR_JSON" | jq -e 'select(.type == "error")' &>/dev/null; then
+    elif [ "$CR_ERROR_TYPE" = "auth" ]; then
         CR_MSG=$(echo "$CR_JSON" | jq -r 'select(.type == "error") | .message // "unknown error"' 2>/dev/null || true)
         echo "PR Review Gate: CodeRabbit skipped ($CR_MSG)" >&2
+    elif echo "$CR_JSON" | jq -e 'select(.type == "error")' &>/dev/null; then
+        CR_MSG=$(echo "$CR_JSON" | jq -r 'select(.type == "error") | .message // "unknown error"' 2>/dev/null || true)
+        ERRORS+=("CodeRabbit review failed:" "$CR_MSG")
     elif [ -n "$CR_JSON" ]; then
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if [ "$CR_ERROR_TYPE" = "rate_limit" ]; then
CR_WAIT=$(echo "$CR_JSON" | jq -r 'select(.type == "error") | .metadata.waitTime // "unknown"' 2>/dev/null || true)
echo "PR Review Gate: CodeRabbit rate-limited (wait: $CR_WAIT)" >&2
elif [ "$CR_ERROR_TYPE" = "auth" ] || echo "$CR_JSON" | jq -e 'select(.type == "error")' &>/dev/null; then
CR_MSG=$(echo "$CR_JSON" | jq -r 'select(.type == "error") | .message // "unknown error"' 2>/dev/null || true)
echo "PR Review Gate: CodeRabbit skipped ($CR_MSG)" >&2
if [ "$CR_ERROR_TYPE" = "rate_limit" ]; then
CR_WAIT=$(echo "$CR_JSON" | jq -r 'select(.type == "error") | .metadata.waitTime // "unknown"' 2>/dev/null || true)
echo "PR Review Gate: CodeRabbit rate-limited (wait: $CR_WAIT)" >&2
elif [ "$CR_ERROR_TYPE" = "auth" ]; then
CR_MSG=$(echo "$CR_JSON" | jq -r 'select(.type == "error") | .message // "unknown error"' 2>/dev/null || true)
echo "PR Review Gate: CodeRabbit skipped ($CR_MSG)" >&2
elif echo "$CR_JSON" | jq -e 'select(.type == "error")' &>/dev/null; then
CR_MSG=$(echo "$CR_JSON" | jq -r 'select(.type == "error") | .message // "unknown error"' 2>/dev/null || true)
ERRORS+=("CodeRabbit review failed:" "$CR_MSG")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/hooks/pr-review-gate.sh` around lines 140 - 145, The current elif
condition treats any .type=="error" as an auth/CLI skip because of the `|| echo
"$CR_JSON" | jq -e 'select(.type == "error")'` check; update the conditional so
only auth/CLI errors are skipped (e.g., rely solely on CR_ERROR_TYPE == "auth"
or explicitly test a specific metadata reason in CR_JSON like `jq -e
'select(.metadata.reason=="auth" or .metadata.reason=="cli")'`) and remove the
broad `.type == "error"` test; keep the CR_MSG extraction logic (CR_JSON/CR_MSG)
and ensure other error types (internal/config/runtime) fall through to the
rate-limit/error handling path instead of being treated as skipped.

elif [ -n "$CR_JSON" ]; then
BLOCKING=$(echo "$CR_JSON" | jq -r \
'select(.type == "finding" or .findings != null) |
(.findings[]? // .) | select(.severity == "error") |
" \(.file):\(.line) — \(.message)"' \
2>/dev/null || true)

if [ -n "$BLOCKING" ]; then
ERRORS+=("CodeRabbit found blocking issues:" "$BLOCKING")
else
echo "PR Review Gate: CodeRabbit review passed" >&2
fi
fi
else
echo "PR Review Gate: CodeRabbit CLI not found — skipping AI review" >&2
fi

if [ ${#ERRORS[@]} -gt 0 ]; then
block_with_errors
fi

echo "PR Review Gate: all checks passed" >&2
exit 0