Add multi-architecture support for Docker images and workflows#841
Add multi-architecture support for Docker images and workflows#841stearz wants to merge 1 commit intokelos-dev:mainfrom
Conversation
There was a problem hiding this comment.
1 issue found across 9 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="Makefile">
<violation number="1" location="Makefile:108">
P2: `image-multiarch` hardcodes prebuilt arches (`amd64 arm64`) while `--platform` uses configurable `IMAGE_PLATFORMS`, which can produce missing per-arch binaries when platforms are overridden.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
|
This is a duplicate of #787 |
|
It would be cool if ypu merge one of them. 😉 |
|
Hopefully 6bd4892 addresses the requested changes |
|
will review it this weekend. |
.github/workflows/release.yaml
Outdated
| make push VERSION=latest | ||
| env: | ||
| VERSION: ${{ needs.determine-version.outputs.version }} | ||
| run: make manifest IMAGE_PLATFORMS=linux/amd64,linux/arm64 VERSION=latest |
There was a problem hiding this comment.
P1: On tag pushes, the matrix build publishes ${VERSION}-amd64 and ${VERSION}-arm64, but this step assembles :latest from latest-amd64 and latest-arm64. Those source tags are never built in this workflow run, so a fresh release can fail here or repoint :latest at stale per-arch images from an older run. Please build and push latest-$arch before this step, or assemble latest from the versioned arch tags created above.
Makefile
Outdated
| REGISTRY ?= ghcr.io/kelos-dev | ||
| VERSION ?= latest | ||
| IMAGE_DIRS ?= cmd/kelos-controller cmd/kelos-spawner cmd/kelos-token-refresher cmd/kelos-webhook-server cmd/ghproxy claude-code codex gemini opencode cursor | ||
| IMAGE_DIRS ?= cmd/kelos-controller cmd/kelos-spawner cmd/kelos-token-refresher claude-code codex gemini opencode cursor |
There was a problem hiding this comment.
P1: Removing cmd/ghproxy and cmd/kelos-webhook-server from the default IMAGE_DIRS breaks existing callers of make image. The e2e workflow still runs make image VERSION=e2e and then loads ghcr.io/kelos-dev/ghproxy:e2e, so CI will fail because that tag is no longer built. It also stops publishing versioned kelos-webhook-server images even though the chart still references that image when the webhook server is enabled.
There was a problem hiding this comment.
oops, forgot to get changes from upstream. Fixed in e6d86de
Makefile
Outdated
| done | ||
|
|
||
| .PHONY: manifest | ||
| manifest: ## Create and push multi-arch manifest from per-arch images (use WHAT, IMAGE_PLATFORMS). |
There was a problem hiding this comment.
Cleanup: after this refactor the release flow uses make image ... PUSH=true plus make manifest, and I could not find any remaining in-tree callers of make push. Keeping the old target around leaves a dead interface in the Makefile, so please remove push in the same cleanup.
gjkim42
left a comment
There was a problem hiding this comment.
The SOURCE_VERSION fix and make push removal look good. One build-breaking issue remains with the Dockerfiles that weren't updated.
| REGISTRY ?= ghcr.io/kelos-dev | ||
| VERSION ?= latest | ||
| IMAGE_DIRS ?= cmd/kelos-controller cmd/kelos-spawner cmd/kelos-token-refresher cmd/kelos-webhook-server cmd/ghproxy claude-code codex gemini opencode cursor | ||
| IMAGE_DIRS ?= cmd/kelos-controller cmd/kelos-spawner cmd/kelos-token-refresher cmd/ghproxy cmd/kelos-webhook-server claude-code codex gemini opencode cursor |
There was a problem hiding this comment.
Bug: cmd/ghproxy and cmd/kelos-webhook-server are back in IMAGE_DIRS, but their Dockerfiles weren't updated for multi-arch. The image build loop (line 89) renames binaries to bin/$name-linux-$arch, so bin/ghproxy no longer exists — only bin/ghproxy-linux-amd64. Their Dockerfiles still COPY bin/ghproxy . and COPY bin/kelos-webhook-server ., which will fail.
Needs the same ARG TARGETARCH + renamed COPY treatment as the other Dockerfiles:
ARG TARGETARCH
COPY bin/ghproxy-linux-${TARGETARCH} ghproxy| .PHONY: image | ||
| image: ## Build docker images (use WHAT to build specific image). | ||
| image: ## Build docker images (use WHAT, IMAGE_PLATFORMS, PUSH=true to customize). | ||
| @for dir in $(filter cmd/%,$(or $(WHAT),$(IMAGE_DIRS))); do \ |
There was a problem hiding this comment.
Nit: cmd/kelos-token-refresher matches cmd/% here, so it gets pre-built and renamed to bin/kelos-token-refresher-linux-$arch. But its Dockerfile is a multi-stage build that rebuilds from source inside Docker — these pre-built binaries are never used. Not a correctness issue (the Docker build still works), but it wastes CI time. Consider either updating its Dockerfile to match the others, or excluding it from this loop.
What type of PR is this?
/kind feature
What this PR does / why we need it:
adds support for arm64
Which issue(s) this PR is related to:
Fixes #840
Special notes for your reviewer:
none
Does this PR introduce a user-facing change?
yes, containers now also run on arm64 machines
Summary by cubic
Add multi-architecture Docker builds (linux/amd64, linux/arm64) and update release workflows to push per-arch images and create multi-arch manifests, so images run on arm64 too (Fixes #840).
Makefile adds
IMAGE_PLATFORMS,PUSH, and amanifesttarget; Dockerfiles now copy arch-specific binaries usingTARGETARCH, and tag releases publish alatestmulti-arch manifest.Written for commit e6d86de. Summary will update on new commits.