Skip to content

fix: separate ServiceAccount for router workloads#433

Open
ambient-code[bot] wants to merge 4 commits intomainfrom
fix/351-separate-service-accounts
Open

fix: separate ServiceAccount for router workloads#433
ambient-code[bot] wants to merge 4 commits intomainfrom
fix/351-separate-service-accounts

Conversation

@ambient-code
Copy link
Copy Markdown
Contributor

@ambient-code ambient-code bot commented Apr 8, 2026

Summary

  • Creates a dedicated router-sa ServiceAccount with minimal RBAC permissions for router workloads in the operator deployment path only
  • The controller keeps its existing controller-manager SA with full RBAC (secrets CRUD, CRD access, leader election)
  • Note: Helm chart changes have been reverted per review feedback (Helm charts are deprecated). Only the operator Go code includes the separate ServiceAccount logic.

Fixes #351

Changes

Operator (only deployment path modified)

  • rbac.go: Added createRouterServiceAccount() and createRouterRole() with reconciliation logic
  • jumpstarter_controller.go: Updated createRouterDeployment() to use {name}-router-sa
  • Router Role grants minimal permissions: get/list/watch on configmaps only (no secrets access)
  • Router RoleBinding binds the router Role to the router ServiceAccount

Helm chart

  • Reverted all changes - Helm charts remain unchanged per review feedback

Review feedback addressed

  • Removed Helm chart modifications (commit 8bc4c6d) - Helm charts are deprecated
  • Removed unnecessary secrets permission from router Role (commit 8de3816) - router only reads ConfigMaps

Test plan

  • CI checks passing (all e2e tests, deploy-kind for both operator and helm paths)
  • Operator path creates separate router-sa with minimal RBAC
  • Helm path unchanged, still uses controller-manager SA

🤖 Generated with Claude Code

The controller and router previously shared the same `controller-manager`
ServiceAccount, giving the router unnecessary cluster-wide secrets CRUD
access. This creates a dedicated `router-sa` ServiceAccount with no RBAC
bindings and `automountServiceAccountToken: false`, following the
principle of least privilege.

Fixes #351

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 8, 2026

Important

Review skipped

Bot user detected.

To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3ae7064c-021c-40ea-842a-33dcbcf63ca4

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/351-separate-service-accounts

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@netlify
Copy link
Copy Markdown

netlify bot commented Apr 8, 2026

Deploy Preview for jumpstarter-docs ready!

Name Link
🔨 Latest commit 8de3816
🔍 Latest deploy log https://app.netlify.com/projects/jumpstarter-docs/deploys/69d68fb36efdb500083de8f7
😎 Deploy Preview https://deploy-preview-433--jumpstarter-docs.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

kind: Role
metadata:
labels:
app.kubernetes.io/name: jumpstarter-router
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do not modify any helm, it has been deprecated.

Helm charts are deprecated; only the operator deployment path
should be modified for the separate router ServiceAccount.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ambient-code
Copy link
Copy Markdown
Contributor Author

ambient-code bot commented Apr 8, 2026

Status Update

Review feedback addressed

  • @mangelajo's comment (Helm charts are deprecated): Reverted all 6 Helm chart template changes in commit 8bc4c6d. Only the operator deployment path (rbac.go, jumpstarter_controller.go, config/rbac/role.yaml) retains the separate router-sa ServiceAccount logic.

CI failures analysis

The following CI jobs failed:

  • deploy-kind (helm): Router pod failed to connect on port 8083 (timed out after 120s). This was likely caused by the Helm chart referencing the new router-sa ServiceAccount which may not have been created correctly. The revert of Helm changes should fix this.
  • deploy-kind (operator): Operator e2e test timed out at 600s waiting for a condition — needs re-run to confirm whether the operator-side changes work correctly after the Helm revert.
  • e2e-tests (ubuntu-24.04, ubuntu-24.04-arm): Router pod jumpstarter-router-0 timed out waiting for ready condition — same root cause as the Helm deploy failure.
  • e2e-compat-old-client: Likely cascading from the same router startup issue.

The Helm chart revert should resolve the helm-path failures. The operator-path changes (separate router-sa with automountServiceAccountToken: false) remain intact and should be validated on the next CI run.

The router process needs Kubernetes API access at startup to load its
configuration from a ConfigMap (via ctrl.GetConfigOrDie() and
LoadRouterConfiguration). Setting AutomountServiceAccountToken: false on
both the ServiceAccount and pod spec prevented the router from
authenticating, causing the pod to crash and never become ready (180s
timeout in CI).

Changes:
- Remove AutomountServiceAccountToken: false from router ServiceAccount
  and pod spec so the token is mounted
- Add a minimal router Role granting read-only access to configmaps and
  secrets (the only resources the router needs)
- Add a RoleBinding to bind the router Role to the router ServiceAccount

This maintains the security goal of separating the router SA from the
controller SA while granting only the minimum permissions needed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ambient-code
Copy link
Copy Markdown
Contributor Author

ambient-code bot commented Apr 8, 2026

Root cause of CI failure

The router pod (jumpstarter-router-0) was timing out because it could not access the Kubernetes API at startup.

The router's main.go calls ctrl.GetConfigOrDie() and then LoadRouterConfiguration(), which reads the jumpstarter-controller ConfigMap from the K8s API. With AutomountServiceAccountToken: false set on both the ServiceAccount and pod spec, no service account token was mounted into the pod, so the in-cluster config had no credentials and the router crashed immediately on startup.

Fix

  1. Removed AutomountServiceAccountToken: false from both the router ServiceAccount definition and the router pod spec, so the SA token is mounted and the router can authenticate to the K8s API.

  2. Added a minimal RBAC Role ({name}-router-role) granting read-only access to configmaps and secrets — the only resources the router needs to read its configuration and secret keys.

  3. Added a RoleBinding ({name}-router-rolebinding) binding the router Role to the {name}-router-sa ServiceAccount.

This maintains the security goal of separating the router SA from the controller SA (the router SA has far fewer permissions than the controller SA) while providing the minimum permissions the router actually needs.

@ambient-code
Copy link
Copy Markdown
Contributor Author

ambient-code bot commented Apr 8, 2026

CI Fix: Router ServiceAccount needs K8s API access

Root cause: The router process calls ctrl.GetConfigOrDie() and LoadRouterConfiguration() at startup, which reads the jumpstarter-controller ConfigMap from the Kubernetes API. Setting AutomountServiceAccountToken: false on both the ServiceAccount and pod spec prevented the router from authenticating to the K8s API, causing it to crash immediately (180s ready timeout in CI).

Fix (commit 93ce272):

  • Removed AutomountServiceAccountToken: false from both the router ServiceAccount and pod spec
  • Added a minimal Role granting read-only access to configmaps and secrets (the resources the router actually needs)
  • Added a RoleBinding to bind the new Role to the router ServiceAccount
  • Added reconciliation logic for the new Role and RoleBinding

This maintains the security separation goal — the router SA has far fewer permissions than the controller SA — while granting the minimum permissions the router actually needs to function.

Comment on lines +378 to +382
APIGroups: []string{""},
Resources: []string{"secrets"},
Verbs: []string{"get", "list", "watch"},
},
},
Copy link
Copy Markdown
Member

@mangelajo mangelajo Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does the router really need to get/list/watch secrets? why?

Resources: []string{"secrets"},
Verbs: []string{"get", "list", "watch"},
},
},
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — you're right, it doesn't. Looking at LoadRouterConfiguration in controller/internal/config/config.go, the router only calls client.Get() on a ConfigMap (jumpstarter-controller), and LoadGrpcConfiguration just parses the in-memory config struct without any K8s API calls. The router never reads secrets.

I'll remove the secrets permission from the router Role so it only has get/list/watch on configmaps.

The router only reads a ConfigMap via LoadRouterConfiguration() and does
not access any secrets. Remove the secrets PolicyRule from the router
Role per review feedback.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ambient-code
Copy link
Copy Markdown
Contributor Author

ambient-code bot commented Apr 9, 2026

Review Status Update

All review feedback has been addressed:

  1. Helm chart modifications reverted (per @mangelajo's feedback that Helm charts are deprecated)

    • Commit 8bc4c6d reverted all 6 Helm template changes
    • Helm deployment path remains unchanged and uses controller-manager SA
  2. Router secrets permission removed (per @mangelajo's question about why router needs secrets access)

    • Commit 8de3816 removed secrets PolicyRule from router Role
    • Router now only has get/list/watch on configmaps - the minimum it needs to read jumpstarter-controller ConfigMap
  3. All CI checks passing

    • e2e tests (ubuntu-24.04, ubuntu-24.04-arm): pass
    • deploy-kind (operator, helm): pass
    • e2e-compat tests: pass
    • All other checks: pass

The PR is ready for review. The operator deployment path now creates a separate router-sa ServiceAccount with minimal RBAC permissions (read-only access to configmaps), maintaining security separation between router and controller workloads.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Shared ServiceAccount across 4 workloads with cluster-wide secrets CRUD

1 participant