…to support
Introduce config/deployment-matrix.yaml as the single source of truth for which apps deploy to which clusters. The workflow now reads the manifest at the same pinned ref as itself (sparse checkout) and resolves the cluster set from app_name. Adds anacleto as the third deployment target.
deploy_in_<cluster> inputs become force-off overrides: they subtract clusters from the manifest-resolved set but cannot add a cluster the manifest does not list. This prevents accidental cross-cluster spillover while still allowing emergency containment.
Adds src/lint/deployment-matrix composite (Python embedded, follows the composite-schema pattern) that validates schema, app/cluster integrity, duplicates, and orphan apps. Wired into self-pr-validation as a gated job that only runs when config/deployment-matrix.yaml changes.
The manifest topology was inferred empirically from the GitOps repo by cross-referencing folder presence with CI commit history. Only apps that are real callers of this workflow are included; apps managed manually (underwriter, jd-mock-api, mock-btg-server, control-plane, platform-console, ledger, dockerhub-secret) are excluded.
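The force-off semantics described above can be sketched as follows. This is a minimal illustration, not the workflow's actual implementation (which runs in a shell step against the YAML manifest). The manifest shape (cluster name mapped to a list of apps), the `resolve_clusters` function name, and the `example-app` entry are all assumptions made for the example.

```python
from typing import Dict, List, Mapping

# Hypothetical manifest shape (an assumption; the real schema of
# config/deployment-matrix.yaml is not shown in this PR text):
# cluster name -> list of apps deployed to that cluster.
MANIFEST: Dict[str, List[str]] = {
    "firmino": ["plugin-br-pix-indirect-btg", "example-app"],
    "clotilde": ["plugin-br-pix-indirect-btg"],
    "anacleto": ["plugin-br-pix-indirect-btg"],
}


def resolve_clusters(
    manifest: Mapping[str, List[str]],
    app_name: str,
    deploy_in: Mapping[str, bool],
) -> List[str]:
    """Return the clusters an app deploys to.

    The manifest decides which clusters are eligible. The deploy_in flags
    (default True) can only remove clusters from that set, never add one.
    """
    listed = [cluster for cluster, apps in manifest.items() if app_name in apps]
    return [cluster for cluster in listed if deploy_in.get(cluster, True)]
```

With this sketch, passing `{"clotilde": False}` for `plugin-br-pix-indirect-btg` yields `firmino` and `anacleto`, while passing `{"anacleto": True}` for `example-app` still yields only `firmino`, because the manifest does not list that app under `anacleto`.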
…tution
Address PR #212 lint failures:
1. Pin all `actions/checkout@v6` occurrences in self-pr-validation.yml to the SHA already used in gitops-update.yml. Required by pinned-actions lint for external (non-LerianStudio) actions. Also clears pre-existing tech debt in this file that surfaced because the new deployment-matrix job touched it.
2. Replace `echo "$RESOLVED" | sed 's/^/ - /'` with a bash `while read` loop in the resolve_clusters step. Fixes shellcheck SC2001 (prefer bash parameter expansion over sed for simple substitutions).
Resolves 6 medium-severity findings from github-advanced-security:
CODE INJECTION (4 findings — actions/code-injection/medium):
- Move `${{ github.workflow_ref }}` to step env: WORKFLOW_REF
- Bonus: replace `echo | sed -E 's|.*@||'` with bash `${VAR##*@}`
- Eliminates injection vectors at lines 106 + 108
- Move resolve_clusters outputs (has_clusters, clusters) to step env:
  HAS_CLUSTERS + RESOLVED_SERVERS in apply_tags step
- Move inputs.yaml_key_mappings + inputs.configmap_updates to step env:
  MAPPINGS + CONFIGMAP_MAPPINGS
- Replace `${{ env.IS_BETA/RC/PRODUCTION/SANDBOX }}` with direct
  `$IS_BETA/...` (already in job-level env, no need to re-interpolate)
- Replace `${{ github.ref }}` with `${GITHUB_REF}` (auto-set by runner)
UNTRUSTED CHECKOUT (2 findings — actions/untrusted-checkout/medium):
- Add `persist-credentials: false` to manifest sparse checkout (read-only,
  no credentials needed, never executes code from this checkout)
- Document trust model inline for the GitOps repo checkout (workflow_call
  is not triggered by untrusted PRs; inputs.gitops_repository comes from
  trusted internal callers; MANAGE_TOKEN is required for the subsequent
  commit/push step, so we cannot drop persist-credentials there)
1. [CRITICAL] Replace `github.workflow_ref` with `github.job_workflow_sha` for manifest checkout. In reusable workflows, github.workflow_ref points to the CALLER's workflow file/ref, not the called reusable workflow; my previous design would have failed for every external caller. `job_workflow_sha` is the commit SHA of the running reusable workflow, which is exactly what we need. Bonus: SHA is more secure than textual ref, and removes the need for the `Resolve shared-workflows ref` step entirely (−18 lines).
2. [HIGH] Remove `|| true` from the RESOLVED pipeline. Silenced yq/jq failures would collapse into the "app not registered" warning path, hiding real manifest/query errors. Now fails fast on parse errors; empty RESOLVED from a successful query remains the legitimate "no matching clusters" case (handled explicitly below).
3. [MEDIUM] Rename config/deployment-matrix.yaml → .yml to match the repo convention (77 .yml files vs 2 .yaml). Updated all references: workflow input default, self-pr-validation gate, composite default, README docs, and the workflow doc.
4. [LOW] Add prominent migration callout to docs about deploy_in_* semantic change: apps must be in the manifest; inputs only subtract.
Declined: per-cluster warning when deploy_in_<cluster>: true but app is absent from that cluster's manifest list. Inputs default to true, so this would fire for every app missing from any cluster on every run, which is noise without signal. Existing "app in zero clusters" warning already covers the actionable case.
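The distinction drawn in item 2, between a query that fails and a query that succeeds with no output, can be illustrated with a short sketch. The actual step is a shell pipeline using yq/jq; the Python below is only an analogue under that assumption, and the names `resolve_lines` and `report` are hypothetical.

```python
import subprocess
import sys
from typing import List, Sequence


def resolve_lines(cmd: Sequence[str]) -> List[str]:
    """Run a query command and return its non-empty output lines.

    check=True raises CalledProcessError on a non-zero exit, which is the
    analogue of removing `|| true`: a broken query is never mistaken for
    an empty result.
    """
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def report(cmd: Sequence[str]) -> int:
    """Return an exit code: 1 for a failed query, 0 otherwise."""
    try:
        resolved = resolve_lines(cmd)
    except subprocess.CalledProcessError as err:
        # A real manifest/query error: surface it and fail the step.
        print(f"::error::query failed (exit {err.returncode}): {err.stderr.strip()}",
              file=sys.stderr)
        return 1
    if not resolved:
        # Successful query, no matches: the legitimate "not registered" case.
        print("::warning::app is not registered in any cluster")
        return 0
    for cluster in resolved:
        print(f" - {cluster}")
    return 0
```

The point of the design is that the warning path is reachable only after the query itself has succeeded.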
actionlint v1.7.x (pinned via raven-actions/actionlint@v2.1.2) does not
yet include `github.job_workflow_sha` in its GitHub context schema,
triggering a false-positive "property not defined" error on the previous
direct reference.
Replace the inline `${{ github.job_workflow_sha }}` expression with an
intermediate step that reads the equivalent auto-set env var
GITHUB_JOB_WORKFLOW_SHA and exports it as a step output. Functionally
identical (the runner populates both from the same source) but the
`steps.X.outputs.Y` expression is recognized by actionlint.
Also adds a defensive guard that fails fast if GITHUB_JOB_WORKFLOW_SHA
is empty — which would mean the workflow is being called outside a
reusable-workflow context, catching that misconfiguration loudly.
…ssuming auto env var GITHUB_JOB_WORKFLOW_SHA is not exposed automatically by the runner. The github.job_workflow_sha context must be mapped explicitly through the step's env: block like any other context value. Prior implementation relied on a nonexistent auto env var and failed with 'is this job really running as part of a reusable workflow?' on every execution. Validated against real run: https://github.com/LerianStudio/plugin-br-pix-indirect-btg/actions/runs/24458387402/job/71466177318
Drops the 'Resolve reusable workflow SHA' step entirely: github.job_workflow_sha is empty when evaluated inside a job of a reusable workflow invoked via jobs.X.uses (empirically confirmed on run 24461037331). Three prior attempts to source that SHA all failed for different reasons:
- parsing github.workflow_ref: points to the caller, not the reusable
- GITHUB_JOB_WORKFLOW_SHA env var: does not exist
- github.job_workflow_sha context: empty in this evaluation context
This commit is a TEMP workaround for end-to-end validation: manifest checkout is hardcoded to the feature branch. Before merging #212 this will be replaced with a proper 'deployment_matrix_ref' input (default 'main').
… external action
The LerianStudio/github-actions-argocd-sync action suppresses stderr via '> /dev/null 2>&1' on every CLI invocation. Any failure (auth, permission, network, malformed URL, expired token) is rendered indistinguishable from 'app does not exist' and skipped silently when skip-if-not-exists=true.
Replaces the external action with inline argocd CLI calls that surface the real error output. Preserves the skip-if-not-exists semantics (warn + exit 0 on app get failure), but syncs fail the job loudly.
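The control flow described in this commit (warn and exit 0 when `argocd app get` fails, fail loudly when `argocd app sync` fails, and never discard stderr) can be sketched as below. The workflow itself uses inline shell; this Python version is only an illustration, and the `sync_app` and `run_cli` names and the injectable `run` parameter are assumptions added so the logic can be exercised without a real Argo CD server.

```python
import subprocess
import sys
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], subprocess.CompletedProcess]


def run_cli(cmd: Sequence[str]) -> subprocess.CompletedProcess:
    """Run a CLI command, capturing stderr instead of discarding it."""
    return subprocess.run(cmd, capture_output=True, text=True)


def sync_app(app: str, skip_if_not_exists: bool = True, run: Runner = run_cli) -> int:
    """Return the exit code the job step should finish with."""
    get = run(["argocd", "app", "get", app])
    if get.returncode != 0:
        # Always show the real error so auth/network failures are visible.
        print(get.stderr, file=sys.stderr)
        if skip_if_not_exists:
            print(f"::warning::could not get app {app}; skipping sync")
            return 0
        return get.returncode
    sync = run(["argocd", "app", "sync", app])
    if sync.returncode != 0:
        # Sync failures are never skipped: surface stderr and fail the job.
        print(sync.stderr, file=sys.stderr)
        return sync.returncode
    return 0
```

Injecting the runner keeps the decision logic separate from the CLI, so the skip and fail paths can be checked with a stub.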
…to validate resolution
Temporary change for end-to-end testing of the manifest-driven gitops pipeline on PR #212. Expected behavior on next beta of plugin-br-pix-indirect-btg:
- resolve_clusters: {firmino, anacleto} (clotilde dropped)
- values.yaml updated only in firmino/dev and anacleto/dev
- argocd_sync fan-out: 2 jobs (firmino-*-dev, anacleto-*-dev)
Revert this commit before merging #212.
…d restore matrix
- Adds deployment_matrix_ref input (default 'main'). Callers on pinned tags get the latest manifest automatically; test runs can override via the input without editing the workflow.
- Drops the temporary hardcoded ref to the feature branch.
- Restores plugin-br-pix-indirect-btg in the clotilde cluster (removed temporarily during exclusion-validation test).
End-to-end validation completed against plugin-br-pix-indirect-btg:
- v1.5.2-beta.9: full fan-out to firmino + clotilde + anacleto, sync OK
- v1.5.2-beta.10: manifest exclusion respected (firmino + anacleto only)
…anges
- New 'deployment-matrix' label auto-applied by the labeler on PRs that touch config/deployment-matrix.yml.
- config/deployment-matrix.yml added to self-release.yml paths-ignore: since callers resolve the manifest from main at runtime (via the deployment_matrix_ref input with default 'main'), matrix-only changes propagate to all callers without requiring a new release tag.
- Mixed commits that touch the matrix plus workflow/action code still trigger a release as usual.
…ix-anacleto feat(gitops-update): manifest-driven topology + anacleto cluster
Walkthrough
This PR introduces a manifest-driven deployment matrix system to replace hard-coded cluster boolean flags in GitOps workflows. It adds configuration infrastructure (YAML schema, labels, linting, reporting) and updates the …
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~75 minutes
The primary driver is the substantial logic refactor in …
🚥 Pre-merge checks | ✅ 1 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
🛡️ CodeQL Analysis Results
Languages analyzed: …
Found 2 issue(s): 2 Medium
Warning
CodeRabbit couldn't request changes on this pull request because it doesn't have sufficient GitHub permissions.
Please grant CodeRabbit Pull requests: Read and write permission and re-run the review.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/gitops-update-workflow.md`:
- Line 37: Update the docs to document the inputs.deployment_matrix_ref
behavior: add deployment_matrix_ref to the "optional inputs" table, state its
default (when omitted the workflow will checkout the shared-workflows repo at
the same pinned ref as the workflow) and explain that passing
deployment_matrix_ref overrides that default to read the manifest from a
different ref; also clarify how this interacts with manifest-ref resolution in
.github/workflows/gitops-update.yml so callers understand when they must supply
deploy_in_* or deployment_matrix_ref.
In `@src/lint/deployment-matrix/action.yml`:
- Around line 12-19: The "Install dependencies" step currently falls back to
apt-get installing "python3-yaml" if python3 lacks yaml; update this step to
also detect pip3 (or pip) and use pip install pyyaml as a fallback when apt-get
isn't available: check for python3 -c "import yaml", then if not present try
apt-get install -y --no-install-recommends python3-yaml, and if that fails or
apt-get is not found use pip3 (or pip) to install pyyaml; adjust the shell block
under the step named "Install dependencies" to probe for pip executables and run
pip install pyyaml when appropriate.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7b8bab11-ef25-4088-a814-549b1a1ce796
📒 Files selected for processing (11)
- .github/labeler.yml
- .github/labels.yml
- .github/workflows/gitops-update.yml
- .github/workflows/self-pr-validation.yml
- .github/workflows/self-release.yml
- config/deployment-matrix.yml
- docs/gitops-update-workflow.md
- src/lint/deployment-matrix/README.md
- src/lint/deployment-matrix/action.yml
- src/notify/pr-lint-reporter/README.md
- src/notify/pr-lint-reporter/action.yml
…fault resolution
Addresses CodeRabbit feedback on PR #221. The workflow no longer checks out the manifest at the same ref as itself; it defaults to 'main' (via the deployment_matrix_ref input) so manifest updates propagate to every caller without bumping the pinned workflow tag.
- Lead paragraph: replace 'same pinned ref' description.
- Optional inputs table: add deployment_matrix_ref row.
- 'How it works' step 2: rewrite to reflect the new behavior and rationale.
GitHub Actions Shared Workflows
Description
Type of Change
- feat: New workflow or new input/output/step in an existing workflow
- fix: Bug fix in a workflow (incorrect behavior, broken step, wrong condition)
- perf: Performance improvement (e.g. caching, parallelism, reduced steps)
- refactor: Internal restructuring with no behavior change
- docs: Documentation only (README, docs/, inline comments)
- ci: Changes to self-CI (workflows under .github/workflows/ that run on this repo)
- chore: Dependency bumps, config updates, maintenance
- test: Adding or updating tests
- BREAKING CHANGE: Callers must update their configuration after this PR
Breaking Changes
None.
Testing
@develop or the beta tag
Caller repo / workflow run:
Related Issues
Closes #
Summary by CodeRabbit
New Features
…anacleto).
Documentation
Chores