ci: reduce PR feedback loop with targeted caching #1294

Open
jeremyeder wants to merge 6 commits into ambient-code:main from jeremyeder:feature/ci-improvements

Contributor

@jeremyeder jeremyeder commented Apr 11, 2026

Summary

  • E2E Docker layer caching: Replace plain docker build with docker/build-push-action@v7 using GHA cache. Reads layers from the components-build-deploy workflow's cache scopes (frontend-amd64, etc.) so E2E gets warm layers from the last main build. Expected savings: 3-5 min.
  • E2E kind binary caching: Cache the kind binary between runs with actions/cache@v4. Pin version in env var for cache key stability. ~15s saved per run.
  • Lint golangci-lint consolidation: Replace two sequential golangci-lint passes (default + test tags) with a single pass using --build-tags=test (superset). ~30s saved.
  • Unit tests pipx: Replace uncached pip install junit2html with pipx run junit2html (pre-installed on GHA runners). ~10s saved.

Current PR wall-clock P50: ~10.4m (E2E bottleneck)
Expected PR wall-clock P50: ~5-7m
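
The E2E layer-caching item in the summary can be sketched as a single build step. This is a minimal sketch under stated assumptions, not the PR's actual workflow: the step name, the build context path, and the image tag are hypothetical, and it assumes a docker/setup-buildx-action step ran earlier in the job, since the GHA cache backend needs a BuildKit builder. Only the frontend-amd64 scope and the e2e-{component} naming pattern come from the PR itself.

```yaml
# Hypothetical step: name, context path, and tag are illustrative assumptions.
- name: Build frontend image (cached)
  uses: docker/build-push-action@v7
  with:
    context: components/frontend   # assumed location of the frontend build context
    push: false                    # E2E only needs the image locally
    load: true                     # export the result to the local Docker daemon
    tags: frontend:e2e             # assumed tag
    # Read warm layers from the main-branch build first, then from earlier E2E runs.
    cache-from: |
      type=gha,scope=frontend-amd64
      type=gha,scope=e2e-frontend
    # Write only to the E2E-specific scope so the main build's cache is not overwritten.
    cache-to: type=gha,mode=max,scope=e2e-frontend
```

Keeping the write scope separate from the read scope is what the last test-plan item checks: E2E runs can consume the components-build-deploy cache without evicting or replacing its entries.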

Test plan

  • Verify E2E workflow runs successfully with cached builds on a PR that changes component code
  • Verify E2E workflow still pulls :latest for unchanged components
  • Verify lint workflow passes with single golangci-lint pass
  • Verify unit-tests backend job generates HTML report via pipx
  • Confirm GHA cache scopes don't conflict with components-build-deploy

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • Chores

    • Optimized CI/CD workflows to reduce PR feedback latency through improved caching mechanisms for build artifacts and binary dependencies
    • Consolidated the linting workflow into a single pass for improved efficiency
    • Enhanced unit test report generation performance
  • Documentation

    • Added CI improvements specification and implementation plan documenting workflow optimization strategies

Ambient Code Bot and others added 6 commits April 11, 2026 00:28
Targeted caching wins across all PR workflows plus E2E image reuse
via shared GHA BuildKit cache scopes. Goal: cut PR wall-clock from
~10.4m P50 to ~5-7m.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
4 tasks: E2E Docker layer caching (shared scopes from components-build),
kind binary caching, golangci-lint consolidation, junit2html via pipx.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The test build tag is a superset: linting with --build-tags=test covers
all production files plus test-tagged files. Reduces CI runtime by
eliminating redundant pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace monolithic shell script with individual docker/build-push-action
steps. Each component build now reads GHA cache from components-build-deploy
workflow (scope: {component}-amd64) and writes to e2e-specific scope
(scope: e2e-{component}) for future runs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add workflow-level KIND_VERSION env var and use actions/cache@v4 to
store the kind binary in ~/k8s-tools/kind. Cache key includes OS and
version for invalidation on upgrades. Avoids redundant downloads on
every workflow run.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
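
The kind caching commit above can be sketched roughly as follows. This is a sketch under assumptions rather than the committed workflow: the job name, step names, and the kind-cache step id are hypothetical, and the download URL follows kind's published release URL pattern. The KIND_VERSION value, the ~/k8s-tools/kind path, and the OS-plus-version cache key come from the PR.

```yaml
env:
  KIND_VERSION: "v0.27.0"

jobs:
  e2e:                              # hypothetical job name
    runs-on: ubuntu-latest
    steps:
      - name: Cache kind binary
        id: kind-cache              # hypothetical step id
        uses: actions/cache@v4
        with:
          path: ~/k8s-tools/kind
          # Bumping KIND_VERSION changes the key, which invalidates the old binary.
          key: kind-${{ runner.os }}-${{ env.KIND_VERSION }}

      - name: Download kind on cache miss
        if: steps.kind-cache.outputs.cache-hit != 'true'
        run: |
          mkdir -p ~/k8s-tools
          curl -fsSLo ~/k8s-tools/kind "https://kind.sigs.k8s.io/dl/${KIND_VERSION}/kind-linux-amd64"
          chmod +x ~/k8s-tools/kind

      - name: Add kind to PATH
        run: echo "$HOME/k8s-tools" >> "$GITHUB_PATH"
```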
Contributor

coderabbitai bot commented Apr 11, 2026

📝 Walkthrough

This PR implements CI workflow optimizations to reduce PR feedback latency by refactoring E2E image builds with docker/build-push-action and caching, consolidating golangci-lint passes, caching the kind binary, and replacing pip-based junit2html installation with pipx across three core workflows.

Changes

| Cohort / File(s) | Summary |
|---|---|
| E2E Workflow Refactoring: .github/workflows/e2e.yml | Replaces inline conditional docker build/pull logic with separate docker/build-push-action@v7 conditional steps for frontend, backend, operator, and ambient-runner images. Adds actions/cache@v4 for kind binary caching and configures KIND_VERSION at workflow level to replace fixed v0.27.0 download pin. |
| Linting Consolidation: .github/workflows/lint.yml | Consolidates two golangci-lint passes (default build context and --build-tags=test) into a single step with --build-tags=test argument, reducing duplicate linting invocations in go-backend job. |
| Test Report Simplification: .github/workflows/unit-tests.yml | Replaces pip install junit2html followed by invocation with direct pipx run junit2html for HTML test report generation, reducing installation overhead. |
| CI Improvements Documentation: docs/superpowers/plans/2026-04-11-ci-improvements.md, docs/superpowers/specs/2026-04-11-ci-improvements-design.md | Adds implementation plan and design specification documents detailing workflow optimizations, caching strategies, BuildKit integration, kind binary caching, linting consolidation, and expected PR feedback latency reductions across workflows. |
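
The unit-tests.yml change summarized above amounts to swapping an install-then-run pair for one pipx invocation. A minimal sketch, assuming hypothetical report paths and step name (the actual file names in the workflow are not shown in this PR conversation):

```yaml
# Hypothetical step: the XML input and HTML output paths are illustrative assumptions.
- name: Generate HTML test report
  if: always()                      # assumed: still produce a report when tests fail
  run: pipx run junit2html reports/junit.xml reports/junit.html
```

pipx run resolves the tool into a temporary environment on first use, so no separate pip install step is needed.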
🚥 Pre-merge checks | ✅ 6
✅ Passed checks (6 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | Title uses Conventional Commits format (ci: scope) and accurately summarizes the PR's main objective of reducing PR feedback latency through targeted caching improvements. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Performance And Algorithmic Complexity | ✅ Passed | PR contains only CI/CD workflow YAML and documentation changes with performance optimizations; no algorithmic complexity concerns introduced. |
| Security And Secret Handling | ✅ Passed | No blocking security violations: hardcoded secrets absent, environment variables contain only version strings, mock test credentials appropriate for CI, public registries used, proper variable quoting prevents injection. |
| Kubernetes Resource Safety | ✅ Passed | PR modifies only GitHub Actions workflows and documentation; Kubernetes Resource Safety check does not apply to CI/CD automation and docs-only changes. |

@jeremyeder jeremyeder enabled auto-merge (squash) April 11, 2026 05:02
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.github/workflows/lint.yml (1)

144-149: ⚠️ Potential issue | 🟠 Major

Backend lint skips production-only files with --build-tags=test.

components/backend/handlers/k8s_clients_for_request_prod.go has //go:build !test and won't be linted by the current invocation. Add a second pass without build-tag restriction to catch production-only code.

Proposed fix
      - name: Run golangci-lint (all build tags)
        uses: golangci/golangci-lint-action@v9
        with:
          version: latest
          working-directory: components/backend
          args: --timeout=5m --build-tags=test
+      - name: Run golangci-lint (default build tags)
+        uses: golangci/golangci-lint-action@v9
+        with:
+          version: latest
+          working-directory: components/backend
+          args: --timeout=5m
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/lint.yml around lines 144 - 149, CI currently runs
golangci-lint only with --build-tags=test, which skips production-only files
like components/backend/handlers/k8s_clients_for_request_prod.go (//go:build
!test); add a second golangci-lint step that also runs in the same
working-directory (components/backend) but without --build-tags (or with an
explicit empty/omitted build-tags arg) so production-only files are linted too,
keeping the existing step for test-tagged checks; target the existing action
block name "Run golangci-lint (all build tags)" or add a new step named e.g.
"Run golangci-lint (no build tags)" to make this explicit.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a3e90040-f7a9-491d-bf5b-3d9c35e7ab58

📥 Commits

Reviewing files that changed from the base of the PR and between 8a2310a and bcdfbc7.

📒 Files selected for processing (5)
  • .github/workflows/e2e.yml
  • .github/workflows/lint.yml
  • .github/workflows/unit-tests.yml
  • docs/superpowers/plans/2026-04-11-ci-improvements.md
  • docs/superpowers/specs/2026-04-11-ci-improvements-design.md

Comment on lines +12 to 17
env:
  KIND_VERSION: "v0.27.0"

concurrency:
  group: e2e-tests-${{ github.event.pull_request.number || github.sha }}
  cancel-in-progress: true

⚠️ Potential issue | 🟠 Major

Add explicit least-privilege permissions for this workflow.

The workflow introduces cache writes and multiple third-party actions but does not scope GITHUB_TOKEN permissions explicitly.

Suggested hardening baseline
 env:
   KIND_VERSION: "v0.27.0"
 
+permissions:
+  contents: read
+  actions: write
+
 concurrency:
   group: e2e-tests-${{ github.event.pull_request.number || github.sha }}
   cancel-in-progress: true

As per coding guidelines for .github/workflows/**/*.{yml,yaml}: pin action versions to SHA; verify secrets are not exposed and permissions are scoped.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/e2e.yml around lines 12 - 17, Add an explicit permissions
block at the top-level of this workflow (near the existing env/KIND_VERSION and
concurrency keys) that scopes GITHUB_TOKEN to only the least-privilege scopes
required for the job (do NOT rely on default write); for example set only the
specific permissions you need such as contents: read, actions: read, id-token:
write, or packages: write as appropriate for your cache/action usage, and remove
any broader write defaults. Also pin every third-party action used in the
workflow to an exact commit SHA (instead of floating tags) so the run is
hermetic and auditable.

Comment on lines +95 to +96
uses: docker/build-push-action@v7
with:

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Scripts executed (repository: ambient-code/platform):

  • rg -nP '^\s*uses:\s*[^@\s]+@v[0-9]+' .github/workflows/e2e.yml (lists tag-pinned actions in the changed workflow; length of output: 551)
  • head -20 .github/workflows/e2e.yml (length of output: 405)
  • sed -n '90,100p' .github/workflows/e2e.yml (length of output: 422)
  • sed -n '110,120p' .github/workflows/e2e.yml (length of output: 500)
  • sed -n '1,40p' .github/workflows/e2e.yml (length of output: 1194)
  • sed -n '170,180p' .github/workflows/e2e.yml (length of output: 335)
  • sed -n '230,245p' .github/workflows/e2e.yml (length of output: 584)
  • grep -n 'permissions:' .github/workflows/e2e.yml (length of output: 47)

Workflow actions must be pinned to commit SHAs and permissions scoped.

All action usages rely on mutable version tags (@v6, @v7, @v4, @v3) instead of immutable commit SHAs. This affects 12 action instances: lines 32, 37, 60, 78, 89, 95, 114, 133, 152, 173, 232, and 241.

Additionally, the workflow lacks an explicit top-level permissions: block to scope token access.

Replace version tags with full commit SHAs (e.g., uses: docker/build-push-action@<full-sha>) and add a top-level permissions: block specifying minimal required scopes per the guideline: .github/workflows/**/*.{yml,yaml} must pin action versions to SHA and scope permissions.
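
The shape of the requested change can be sketched as below. This is an illustration only: the job name is hypothetical, <full-commit-sha> is a placeholder that must be replaced with the real 40-character SHA of the audited release, and the permission set shown is an assumed starting point that may need additional keys depending on what the workflow actually does.

```yaml
# Top-level block: scope GITHUB_TOKEN explicitly instead of inheriting defaults.
permissions:
  contents: read                    # assumed minimum; add other keys only if a step needs them

jobs:
  e2e:                              # hypothetical job name
    runs-on: ubuntu-latest
    steps:
      # Placeholder SHA: substitute the full commit SHA of the release you audited.
      # The trailing comment keeps the human-readable tag next to the pin.
      - uses: docker/build-push-action@<full-commit-sha> # v7
```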

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/e2e.yml around lines 95 - 96, Replace all mutable action
tags with immutable commit SHAs for each action usage (e.g., replace
docker/build-push-action@v7 and the other @v6/@v4/@v3 occurrences with their
corresponding full commit SHAs) so every "uses:" entry is pinned; update the 12
action instances referenced (including docker/build-push-action and the other
actions in the file) to their exact SHA values. Also add a top-level
permissions: block to the workflow that scopes the GITHUB_TOKEN to the minimal
required permissions for this workflow (declare only the specific permission
keys needed, e.g., read/write for specific resources used), ensuring the token
is not granted broad defaults. Ensure changes touch the workflow root
(top-level) and every "uses:" line that currently has a version tag.
