Skip to content

feat(gitops-update): manifest-driven topology + anacleto cluster#212

Open
bedatty wants to merge 5 commits intodevelopfrom
feat/gitops-deployment-matrix-anacleto
Open

feat(gitops-update): manifest-driven topology + anacleto cluster#212
bedatty wants to merge 5 commits intodevelopfrom
feat/gitops-deployment-matrix-anacleto

Conversation

@bedatty
Copy link
Copy Markdown
Contributor

@bedatty bedatty commented Apr 13, 2026

Description

Introduces config/deployment-matrix.yaml as the single source of truth for which apps deploy to which Kubernetes clusters via the gitops-update.yml reusable workflow. Adds anacleto as the third deployment target alongside firmino and clotilde.

The workflow now:

  1. Sparse-checks-out the manifest at the same pinned ref as itself (no runtime drift).
  2. Resolves the cluster set from app_name by scanning clusters.<name>.apps.
  3. Applies deploy_in_<cluster> inputs as force-off overrides — they can subtract clusters from the resolved set but cannot add a cluster the manifest doesn't list. Prevents accidental cross-cluster spillover while still allowing emergency containment.

A new src/lint/deployment-matrix composite validates the manifest schema, app/cluster integrity, and duplicates on every PR that touches config/deployment-matrix.yaml (gated job in self-pr-validation.yml, follows the same Python-embedded pattern as composite-schema).

The manifest topology was inferred empirically from the midaz-firmino-gitops repo by cross-referencing folder presence with CI commit history (ci(<app>): update image tags), so only real workflow callers are included — manually-managed apps like underwriter, jd-mock-api, mock-btg-server, control-plane, platform-console, ledger, dockerhub-secret are excluded.

Cluster membership today (18 apps):

  • firmino: 12 apps (core platform + plugins + plugin-br-bank-transfer)
  • clotilde: 17 apps (core + plugins + plugin-br-bank-transfer + Lerian platform suite: backoffice-console, cs-platform, forge, lerian-map, tenant-manager)
  • anacleto: 12 apps (core + plugins + plugin-br-bank-transfer-jd)

Files affected:

  • .github/workflows/gitops-update.yml — manifest loader, resolve_clusters step, force-off semantics
  • config/deployment-matrix.yaml — new manifest (source of truth)
  • src/lint/deployment-matrix/{action.yml,README.md} — new composite validator
  • .github/workflows/self-pr-validation.yml — wires the new lint job
  • src/notify/pr-lint-reporter/{action.yml,README.md} — adds deployment-matrix result row to the PR comment
  • docs/gitops-update-workflow.md — new "Deployment Matrix" section, updated examples, migration guide

Type of Change

  • feat: New workflow or new input/output/step in an existing workflow
  • fix: Bug fix in a workflow
  • perf: Performance improvement
  • refactor: Internal restructuring with no behavior change
  • docs: Documentation only
  • ci: Changes to self-CI
  • chore: Dependency bumps, config updates, maintenance
  • test: Adding or updating tests
  • BREAKING CHANGE: Callers must update their configuration after this PR

Breaking Changes

None. This is fully backward-compatible:

  • Existing callers passing deploy_in_firmino: true, deploy_in_clotilde: true continue working — the inputs are now force-off (true is no-op, false subtracts).
  • An app not yet listed in the manifest logs a warning and exits cleanly (no failure) — the workflow is a graceful no-op for un-onboarded apps.
  • The cluster set for every existing app was preserved 1:1 from the empirical state of the GitOps repo.

Testing

  • YAML syntax validated locally (all 5 modified/created YAMLs pass yaml.safe_load)
  • Validator composite logic smoke-tested locally against the real manifest (18 apps, 3 clusters, 0 violations, 0 warnings) and an intentionally-broken manifest (correctly caught duplicate in registry, missing-from-registry typo, duplicate in cluster, orphan warning)
  • Cluster resolution logic smoke-tested against representative apps including the plugin-br-bank-transfer vs plugin-br-bank-transfer-jd edge case (caught a critical bug: jq contains([$app]) does substring matching on strings — switched to index($app) for exact equality)
  • Triggered a real workflow run on a caller repository using @develop or the beta tag
  • Verified all existing inputs still work with default values (force-off semantics preserves backward compat)
  • Confirmed no secrets or tokens are printed in logs
  • Checked that unrelated workflows are not affected

Caller repo / workflow run: Pending real release after merge.

Related Issues

Closes #

Summary by CodeRabbit

  • New Features
    • Deployment now driven by a centralized manifest to resolve target clusters per app.
    • Added support for the Anacleto cluster and deploy-in- force-off overrides.
  • CI
    • Added deployment-matrix validation job and a reusable lint action to the pipeline.
    • Lint reporting updated to include deployment-matrix results.
  • Documentation
    • Workflow docs and examples updated to reflect manifest-driven behavior and naming.

…to support

Introduce config/deployment-matrix.yaml as the single source of truth for
which apps deploy to which clusters. The workflow now reads the manifest
at the same pinned ref as itself (sparse checkout) and resolves the
cluster set from app_name. Adds anacleto as the third deployment target.

deploy_in_<cluster> inputs become force-off overrides — they subtract
clusters from the manifest-resolved set but cannot add a cluster the
manifest does not list. This prevents accidental cross-cluster spillover
while still allowing emergency containment.

Adds src/lint/deployment-matrix composite (Python embedded, follows the
composite-schema pattern) that validates schema, app/cluster integrity,
duplicates, and orphan apps. Wired into self-pr-validation as a gated
job that only runs when config/deployment-matrix.yaml changes.

The manifest topology was inferred empirically from the GitOps repo by
cross-referencing folder presence with CI commit history — only apps
that are real callers of this workflow are included (excludes apps
managed manually like underwriter, jd-mock-api, mock-btg-server,
control-plane, platform-console, ledger, dockerhub-secret).
@bedatty bedatty requested a review from a team as a code owner April 13, 2026 14:53
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 13, 2026

Walkthrough

Migrates GitOps deployment from caller-specified server flags to manifest-driven cluster resolution via config/deployment-matrix.yml; adds deploy_in_anacleto and deployment_matrix_file inputs, a deployment-matrix lint action and CI job, and changes workflow steps to resolve clusters and apply force-off suppression. Breaking: callers now require the manifest.

Changes

Cohort / File(s) Summary
Core Workflow Refactor
.github/workflows/gitops-update.yml, config/deployment-matrix.yml
Replaced server-flag logic with manifest-driven cluster resolution. Added deploy_in_anacleto and deployment_matrix_file inputs. New steps: resolve reusable-workflow SHA, sparse-checkout the manifest, resolve clusters matching app_name, apply force-off suppression, emit clusters/has_clusters, and derive SERVERS from resolved clusters. Workflow no longer computes deploy_firmino/deploy_clotilde; absence of clusters now exits successfully with empty sync targets. Breaking: requires manifest file and explicit cluster entries.
CI Workflow Updates
.github/workflows/self-pr-validation.yml
Pinned actions/checkout usages to a commit SHA. Added deployment-matrix lint job (runs when config/deployment-matrix.yml is in changed files). Updated lint-report job needs and report inputs to include deployment-matrix results/files.
Deployment Matrix Linter
src/lint/deployment-matrix/action.yml, src/lint/deployment-matrix/README.md
New composite action and README that validate the manifest schema and integrity: version: 1, apps.registry non-empty string list, clusters mapping with per-cluster apps lists, no unknown/referenced apps, and duplicate detection; errors fail the job, orphan apps produce warnings. Installer ensures YAML support.
Lint Reporter Changes
src/notify/pr-lint-reporter/action.yml, src/notify/pr-lint-reporter/README.md
Added inputs deployment-matrix-result and deployment-matrix-files, wired a "Deployment Matrix Lint" check into the generated checks array and file-count parsing.
Documentation
docs/gitops-update-workflow.md
Rewrote documentation to reflect manifest-driven cluster selection, new inputs (deploy_in_anacleto, deployment_matrix_file), redefined deploy_in_<cluster> as force-off only, updated naming/path examples to use <cluster>-<app_name>-<env>, and added deployment-matrix schema/behavior examples.
New Config
config/deployment-matrix.yml
Added canonical deployment matrix with apps.registry and three clusters (firmino, clotilde, anacleto) mapping apps to clusters and inline resolution behavior notes.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed Title accurately summarizes the main change: introducing manifest-driven topology and the anacleto cluster to gitops-update.
Description check ✅ Passed PR description covers all template sections: detailed summary of changes, type of change marked as feat, breaking changes explicitly stated as none with backward-compatibility explanation, testing checklist mostly completed with pending real-world validation, and related issues section included.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/gitops-deployment-matrix-anacleto

Comment @coderabbitai help to get the list of available commands and usage tips.

@lerian-studio lerian-studio added size/L PR changes 500–999 lines documentation Improvements or additions to documentation workflow Changes to one or more reusable workflow files labels Apr 13, 2026
@lerian-studio
Copy link
Copy Markdown

lerian-studio commented Apr 13, 2026

🛡️ CodeQL Analysis Results

Languages analyzed: actions

Found 2 issue(s): 2 Medium

Severity Rule File Message
🟡 Medium actions/untrusted-checkout/medium .github/workflows/gitops-update.yml:96 Potential unsafe checkout of untrusted pull request on privileged workflow.
🟡 Medium actions/untrusted-checkout/medium .github/workflows/gitops-update.yml:121 Potential unsafe checkout of untrusted pull request on privileged workflow.

🔍 View full scan logs | 🛡️ Security tab

@lerian-studio
Copy link
Copy Markdown

lerian-studio commented Apr 13, 2026

🔍 Lint Analysis

Check Files Scanned Status
YAML Lint 5 file(s) ✅ success
Action Lint 2 file(s) ✅ success
Pinned Actions 4 file(s) ✅ success
Markdown Link Check 3 file(s) ✅ success
Spelling Check 8 file(s) ✅ success
Shell Check 4 file(s) ✅ success
README Check 4 file(s) ✅ success
Composite Schema 2 file(s) ✅ success
Deployment Matrix 1 file(s) ✅ success

🔍 View full scan logs

…tution

Address PR #212 lint failures:

1. Pin all `actions/checkout@v6` occurrences in self-pr-validation.yml to
   the SHA already used in gitops-update.yml. Required by pinned-actions
   lint for external (non-LerianStudio) actions. Also clears pre-existing
   tech debt in this file that surfaced because the new deployment-matrix
   job touched it.

2. Replace `echo "$RESOLVED" | sed 's/^/  - /'` with a bash `while read`
   loop in the resolve_clusters step. Fixes shellcheck SC2001
   (prefer bash parameter expansion over sed for simple substitutions).
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Warning

CodeRabbit couldn't request changes on this pull request because it doesn't have sufficient GitHub permissions.

Please grant CodeRabbit Pull requests: Read and write permission and re-run the review.

👉 Steps to fix this

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docs/gitops-update-workflow.md (1)

199-209: ⚠️ Potential issue | 🟡 Minor

Finish the terminology migration from server to cluster.

These sections still document the path placeholders, sync naming, and examples in terms of <server> / “selected servers”, even though the workflow now resolves clusters from the deployment matrix. That leaves the docs internally inconsistent right where callers look for path and ArgoCD naming rules.

As per coding guidelines, docs/gitops-update-workflow.md must update naming/path conventions to use <cluster> in GitOps paths and ArgoCD app naming (not the old single-server <server> terminology).

Also applies to: 240-263

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/gitops-update-workflow.md` around lines 199 - 209, The docs still use
the old placeholder `<server>` and phrases like “selected servers” in the GitOps
path and ArgoCD naming examples; update all occurrences in
docs/gitops-update-workflow.md to use `<cluster>` and “selected clusters”
instead, e.g. change the path template to
gitops/environments/<cluster>/helmfile/applications/<env>/<app_name>/values.yaml
and adjust any ArgoCD app naming examples to reference cluster (also apply the
same replacements in the later section referenced as lines 240-263); ensure
wording and examples are consistent with the deployment matrix terminology and
keep the `<env>` values unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/gitops-update.yml:
- Around line 195-198: The command that sets RESOLVED currently silences errors
with "|| true", masking parse/query failures as an empty RESOLVED; remove the
"|| true" fallback so yq/jq errors propagate and fail the job, i.e. change the
RESOLVED assignment (the pipeline using yq -o=json '.' "$MANIFEST" | jq -r --arg
app "$APP_NAME" '.clusters | to_entries[] | select(.value.apps // [] |
index($app)) | .key' | sort -u) to not append "|| true"; ensure the rest of the
script still treats an actual empty RESOLVED as the legitimate “no matching
clusters” case and let the pipeline fail on real yq/jq errors.

In `@config/deployment-matrix.yaml`:
- Around line 1-109: The file was added with a .yaml extension which violates
the repo convention; rename deployment-matrix.yaml to deployment-matrix.yml and
update all references in this PR (workflow inputs, docs, and any helper scripts)
so the new path is used; ensure the renamed manifest still contains the same
top-level keys (version, apps.registry, clusters) and that any CI/workflow that
reads the file (gitops-update.yml or helper script) is updated to point to
config/deployment-matrix.yml before merging.

---

Outside diff comments:
In `@docs/gitops-update-workflow.md`:
- Around line 199-209: The docs still use the old placeholder `<server>` and
phrases like “selected servers” in the GitOps path and ArgoCD naming examples;
update all occurrences in docs/gitops-update-workflow.md to use `<cluster>` and
“selected clusters” instead, e.g. change the path template to
gitops/environments/<cluster>/helmfile/applications/<env>/<app_name>/values.yaml
and adjust any ArgoCD app naming examples to reference cluster (also apply the
same replacements in the later section referenced as lines 240-263); ensure
wording and examples are consistent with the deployment matrix terminology and
keep the `<env>` values unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 9dc319dc-e1d1-407d-8f29-4e3e18ff298f

📥 Commits

Reviewing files that changed from the base of the PR and between c2a818d and ab72e57.

📒 Files selected for processing (8)
  • .github/workflows/gitops-update.yml
  • .github/workflows/self-pr-validation.yml
  • config/deployment-matrix.yaml
  • docs/gitops-update-workflow.md
  • src/lint/deployment-matrix/README.md
  • src/lint/deployment-matrix/action.yml
  • src/notify/pr-lint-reporter/README.md
  • src/notify/pr-lint-reporter/action.yml

Resolves 6 medium-severity findings from github-advanced-security:

CODE INJECTION (4 findings — actions/code-injection/medium):
- Move `${{ github.workflow_ref }}` to step env: WORKFLOW_REF
  - Bonus: replace `echo | sed -E 's|.*@||'` with bash `${VAR##*@}`
  - Eliminates injection vectors at lines 106 + 108
- Move resolve_clusters outputs (has_clusters, clusters) to step env:
  HAS_CLUSTERS + RESOLVED_SERVERS in apply_tags step
- Move inputs.yaml_key_mappings + inputs.configmap_updates to step env:
  MAPPINGS + CONFIGMAP_MAPPINGS
- Replace `${{ env.IS_BETA/RC/PRODUCTION/SANDBOX }}` with direct
  `$IS_BETA/...` (already in job-level env, no need to re-interpolate)
- Replace `${{ github.ref }}` with `${GITHUB_REF}` (auto-set by runner)

UNTRUSTED CHECKOUT (2 findings — actions/untrusted-checkout/medium):
- Add `persist-credentials: false` to manifest sparse checkout (read-only,
  no credentials needed, never executes code from this checkout)
- Document trust model inline for the GitOps repo checkout (workflow_call
  is not triggered by untrusted PRs; inputs.gitops_repository comes from
  trusted internal callers; MANAGE_TOKEN is required for the subsequent
  commit/push step, so we cannot drop persist-credentials there)
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Warning

CodeRabbit couldn't request changes on this pull request because it doesn't have sufficient GitHub permissions.

Please grant CodeRabbit Pull requests: Read and write permission and re-run the review.

👉 Steps to fix this

Actionable comments posted: 1

♻️ Duplicate comments (1)
.github/workflows/gitops-update.yml (1)

195-198: ⚠️ Potential issue | 🟠 Major

Do not swallow manifest/query failures here.

The trailing || true still turns malformed YAML, missing tools, or a broken jq query into the same empty-resolution path as “app not registered”, so the workflow exits successfully instead of surfacing a real topology failure.

Suggested fix
           RESOLVED=$(yq -o=json '.' "$MANIFEST" \
             | jq -r --arg app "$APP_NAME" \
               '.clusters | to_entries[] | select(.value.apps // [] | index($app)) | .key' \
-            | sort -u || true)
+            | sort -u)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/gitops-update.yml around lines 195 - 198, The RESOLVED
assignment is swallowing failures because of the trailing "|| true"; update the
RESOLVED logic (the pipeline using yq/jq with MANIFEST and APP_NAME) to detect
and propagate real errors instead of converting them to an empty value: remove
the "|| true", capture and check the pipeline/command exit status (or enable
errexit for the step) and, on error, log the failing command and exit non‑zero
so malformed YAML, missing tools, or bad jq queries fail the workflow rather
than being treated as “app not registered.”
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/gitops-update.yml:
- Around line 99-119: The code currently parses github.workflow_ref (which
points to the caller) and may checkout the wrong ref; add a required workflow
input named workflow_ref (default callers should pass ${{ github.workflow_ref }}
or their known pinned ref) and replace the sed parsing logic with using that
input directly: stop extracting REF from github.workflow_ref in the Resolve
shared-workflows ref step and instead set ref="${{ inputs.workflow_ref }}" (or
assign it to GITHUB_OUTPUT via steps.shared_ref) so the Checkout deployment
matrix manifest step uses the passed workflow_ref; update the reusable workflow
inputs to include workflow_ref and adjust any references to
steps.shared_ref.outputs.ref or similar to source the new input.

---

Duplicate comments:
In @.github/workflows/gitops-update.yml:
- Around line 195-198: The RESOLVED assignment is swallowing failures because of
the trailing "|| true"; update the RESOLVED logic (the pipeline using yq/jq with
MANIFEST and APP_NAME) to detect and propagate real errors instead of converting
them to an empty value: remove the "|| true", capture and check the
pipeline/command exit status (or enable errexit for the step) and, on error, log
the failing command and exit non‑zero so malformed YAML, missing tools, or bad
jq queries fail the workflow rather than being treated as “app not registered.”
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 5a5313c9-d193-45b6-9ffd-2619d0b198e0

📥 Commits

Reviewing files that changed from the base of the PR and between ab72e57 and 23740d0.

📒 Files selected for processing (2)
  • .github/workflows/gitops-update.yml
  • .github/workflows/self-pr-validation.yml

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Warning

CodeRabbit couldn't request changes on this pull request because it doesn't have sufficient GitHub permissions.

Please grant CodeRabbit Pull requests: Read and write permission and re-run the review.

👉 Steps to fix this

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/gitops-update.yml:
- Around line 18-29: The deploy_in_* inputs (deploy_in_firmino,
deploy_in_clotilde, deploy_in_anacleto) were changed to "force-off only"
semantics which can silently break callers who set them to true expecting to add
clusters; update the code and docs to mitigate this by (1) documenting the
semantic change in the migration guide and calling out that apps must be listed
in the deployment matrix to be deployed, and (2) adding runtime warnings in the
deployment input-processing logic that detect when a caller passed
deploy_in_<cluster>: true but the app is not present in that cluster's manifest
(emit a clear warning mentioning the cluster and app name); implement the
warning near the existing validation that currently warns only when an app is in
no cluster so it reuses the same validation flow and message style.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2b5758c8-abf1-471d-ab63-9475e540737a

📥 Commits

Reviewing files that changed from the base of the PR and between 23740d0 and ebb48fd.

📒 Files selected for processing (1)
  • .github/workflows/gitops-update.yml

1. [CRITICAL] Replace `github.workflow_ref` with `github.job_workflow_sha`
   for manifest checkout. In reusable workflows, github.workflow_ref points
   to the CALLER's workflow file/ref, not the called reusable workflow —
   my previous design would have failed for every external caller.
   `job_workflow_sha` is the commit SHA of the running reusable workflow,
   which is exactly what we need. Bonus: SHA is more secure than textual
   ref, and removes the need for the `Resolve shared-workflows ref` step
   entirely (−18 lines).

2. [HIGH] Remove `|| true` from the RESOLVED pipeline. Silenced yq/jq
   failures would collapse into the "app not registered" warning path,
   hiding real manifest/query errors. Now fails fast on parse errors;
   empty RESOLVED from a successful query remains the legitimate
   "no matching clusters" case (handled explicitly below).

3. [MEDIUM] Rename config/deployment-matrix.yaml → .yml to match the
   repo convention (77 .yml files vs 2 .yaml). Updated all references:
   workflow input default, self-pr-validation gate, composite default,
   README docs, and the workflow doc.

4. [LOW] Add prominent migration callout to docs about deploy_in_*
   semantic change — apps must be in the manifest; inputs only subtract.

Declined: per-cluster warning when deploy_in_<cluster>: true but app
is absent from that cluster's manifest list. Inputs default to true, so
this would fire for every app missing from any cluster on every run —
noise without signal. Existing "app in zero clusters" warning already
covers the actionable case.
actionlint v1.7.x (pinned via raven-actions/actionlint@v2.1.2) does not
yet include `github.job_workflow_sha` in its GitHub context schema,
triggering a false-positive "property not defined" error on the previous
direct reference.

Replace the inline `${{ github.job_workflow_sha }}` expression with an
intermediate step that reads the equivalent auto-set env var
GITHUB_JOB_WORKFLOW_SHA and exports it as a step output. Functionally
identical (the runner populates both from the same source) but the
`steps.X.outputs.Y` expression is recognized by actionlint.

Also adds a defensive guard that fails fast if GITHUB_JOB_WORKFLOW_SHA
is empty — which would mean the workflow is being called outside a
reusable-workflow context, catching that misconfiguration loudly.
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Warning

CodeRabbit couldn't request changes on this pull request because it doesn't have sufficient GitHub permissions.

Please grant CodeRabbit Pull requests: Read and write permission and re-run the review.

👉 Steps to fix this

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
.github/workflows/gitops-update.yml (2)

7-33: ⚠️ Potential issue | 🟠 Major

Add the required dry_run input before merging.

This reusable workflow still mutates the GitOps repo and can trigger ArgoCD sync, but on.workflow_call.inputs does not expose a dry_run boolean. That leaves callers without the required preview path for the new manifest-driven rollout. As per coding guidelines, "Every reusable workflow must support workflow_call trigger, expose explicit inputs, and always include a dry_run input (type: boolean, default: false)."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/gitops-update.yml around lines 7 - 33, The workflow is
missing the required dry_run input under workflow_call.inputs; add a new boolean
input named dry_run with type: boolean and default: false alongside the existing
inputs (e.g., near gitops_repository, app_name, deploy_in_firmino, etc.) so
callers can opt into preview mode; ensure the input key is exactly dry_run and
its default is false to satisfy the reusable-workflow guideline and avoid
mutating the GitOps repo when callers request a dry run.

71-83: ⚠️ Potential issue | 🟠 Major

Declare an explicit workflow-level permissions: block.

permissions is still undeclared, so this workflow inherits repository defaults. For a shared reusable workflow that checks out repositories and downloads artifacts, that is broader than the repo baseline and should be pinned to least privilege explicitly. As per coding guidelines, "Always declare explicit permissions: block at workflow level; default to contents: read; only escalate per-job when needed."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/gitops-update.yml around lines 71 - 83, Add an explicit
workflow-level permissions block to the reusable workflow so it no longer
inherits repo defaults: declare permissions: contents: read (and any other
least-privilege scopes required) at the top-level of the workflow file that
contains the update_gitops job, then if a specific step or job (e.g.,
update_gitops job, steps: checkout or artifact download) needs extra scopes,
escalate only within that job via a per-job permissions override; ensure the
workflow-level block is present before jobs: and limits privileges to the
minimum.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/gitops-update.yml:
- Around line 215-220: The current "no-cluster" branches set has_clusters=false
in the resolve step (writing to GITHUB_OUTPUT) but downstream artifact
download/validation and update/commit steps still run and can fail before the
no-op is honored; update the workflow so any steps that download artifacts,
validate, or perform updates (the artifact/download/validate and commit steps)
are conditional on steps.resolve_clusters.outputs.has_clusters == 'true' (or
move an early conditional/if that exits the job immediately when RESOLVED is
empty or has_clusters == 'false'), and ensure the resolve step writes the output
key has_clusters consistently (GITHUB_OUTPUT) so the later if conditions (e.g.,
uses of steps.resolve_clusters.outputs.has_clusters) can gate those steps and
prevent them from running when an app is not in any cluster.

---

Outside diff comments:
In @.github/workflows/gitops-update.yml:
- Around line 7-33: The workflow is missing the required dry_run input under
workflow_call.inputs; add a new boolean input named dry_run with type: boolean
and default: false alongside the existing inputs (e.g., near gitops_repository,
app_name, deploy_in_firmino, etc.) so callers can opt into preview mode; ensure
the input key is exactly dry_run and its default is false to satisfy the
reusable-workflow guideline and avoid mutating the GitOps repo when callers
request a dry run.
- Around line 71-83: Add an explicit workflow-level permissions block to the
reusable workflow so it no longer inherits repo defaults: declare permissions:
contents: read (and any other least-privilege scopes required) at the top-level
of the workflow file that contains the update_gitops job, then if a specific
step or job (e.g., update_gitops job, steps: checkout or artifact download)
needs extra scopes, escalate only within that job via a per-job permissions
override; ensure the workflow-level block is present before jobs: and limits
privileges to the minimum.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 9dbe78ac-275b-42a0-a723-793548e8c828

📥 Commits

Reviewing files that changed from the base of the PR and between efa0a60 and e2c774c.

📒 Files selected for processing (1)
  • .github/workflows/gitops-update.yml

Comment on lines +215 to +220
if [[ -z "$RESOLVED" ]]; then
echo "::warning::App '$APP_NAME' is not registered in any cluster of the deployment matrix."
echo "If this is a new app, add it to $MANIFEST in shared-workflows."
echo "clusters=" >> "$GITHUB_OUTPUT"
echo "has_clusters=false" >> "$GITHUB_OUTPUT"
exit 0
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

The new zero-cluster path does not actually exit cleanly.

These branches set has_clusters=false, but nothing consumes that until Line 368. The artifact download/validation steps still run first, so an app that is missing from the matrix—or fully suppressed by the force-off inputs—can still fail before the intended no-op path is reached. Gate the artifact/update/commit steps on steps.resolve_clusters.outputs.has_clusters == 'true', or branch earlier so the job stops immediately.

Also applies to: 257-261, 366-372

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/gitops-update.yml around lines 215 - 220, The current
"no-cluster" branches set has_clusters=false in the resolve step (writing to
GITHUB_OUTPUT) but downstream artifact download/validation and update/commit
steps still run and can fail before the no-op is honored; update the workflow so
any steps that download artifacts, validate, or perform updates (the
artifact/download/validate and commit steps) are conditional on
steps.resolve_clusters.outputs.has_clusters == 'true' (or move an early
conditional/if that exits the job immediately when RESOLVED is empty or
has_clusters == 'false'), and ensure the resolve step writes the output key
has_clusters consistently (GITHUB_OUTPUT) so the later if conditions (e.g., uses
of steps.resolve_clusters.outputs.has_clusters) can gate those steps and prevent
them from running when an app is not in any cluster.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation size/L PR changes 500–999 lines workflow Changes to one or more reusable workflow files

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants