[FIX] Remove VERSION arg from Dockerfile base stages to fix CI cache#1864
[FIX] Remove VERSION arg from Dockerfile base stages to fix CI cache#1864chandrasekharan-zipstack wants to merge 3 commits intomainfrom
Conversation
The ARG VERSION + LABEL version="${VERSION}" in the base stage caused
every Docker layer's cache key to depend on the version tag. Since the
tag changes every build (rc.245, rc.246, ...), BuildKit could never
match cached layers — resulting in 0% cache hit rate for all Python
services in CI.
The frontend (55% cached) was unaffected because its Dockerfile has no
ARG VERSION in any stage.
Removed from: backend, prompt-service, platform-service, runner,
x2text-service, worker-unified, tool-sidecar.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
WalkthroughRemoved the build-time ARG Changes
Estimated code review effort🎯 1 (Trivial) | ⏱️ ~5 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Greptile SummaryThis PR fixes a Docker BuildKit cache invalidation bug affecting all 7 Python services in CI by moving Key changes:
Note:
Confidence Score: 5/5
Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
subgraph BEFORE["❌ BEFORE — VERSION in base stage (0% cache hit)"]
A1["FROM python:3.12-slim AS base\nARG VERSION=dev\nLABEL version=VERSION"] -->|"VERSION changes → cache miss"| B1["apt-get install\nsystem deps"]
B1 -->|"cache miss"| C1["COPY uv / pyproject.toml"]
C1 -->|"cache miss"| D1["uv sync\n(slow, external deps)"]
D1 -->|"cache miss"| E1["COPY app code\nuv sync (final)"]
E1 -->|"cache miss"| F1["ENTRYPOINT"]
end
subgraph AFTER["✅ AFTER — VERSION only at final layer (100% cache hit)"]
A2["FROM python:3.12-slim AS base\nLABEL maintainer=..."] -->|"VERSION irrelevant → cache HIT"| B2["apt-get install\nsystem deps"]
B2 -->|"cache HIT"| C2["COPY uv / pyproject.toml"]
C2 -->|"cache HIT"| D2["uv sync\n(slow, external deps)"]
D2 -->|"cache HIT"| E2["COPY app code\nuv sync (final)"]
E2 -->|"VERSION changes → cache miss\n(cheap single layer)"| V2["ARG VERSION\nENV UNSTRACT_APPS_VERSION"]
V2 --> F2["ENTRYPOINT"]
end
Prompt To Fix All With AIThis is a comment left during a code review.
Path: docker/dockerfiles/backend.Dockerfile
Line: 82-83
Comment:
**`UNSTRACT_APPS_VERSION` env var set but never consumed**
`UNSTRACT_APPS_VERSION` is now baked into every image as a runtime environment variable, but a search of the entire codebase shows zero consumers of this variable — no Python code, Helm chart, or CI script reads it.
Setting an `ENV` here does still create a Docker layer (unlike `LABEL`), which means it will be rebuilt every time `VERSION` changes. The cache-fix goal is fully achieved by simply omitting the `ARG`/`ENV` entirely at the end. If the intent is to make the version introspectable at runtime (e.g. for a `/health` or `/version` endpoint), that's a good goal — but the consuming code hasn't been added yet.
The same applies to all seven Dockerfiles in this PR (`platform.Dockerfile`, `prompt.Dockerfile`, `runner.Dockerfile`, `tool-sidecar.Dockerfile`, `worker-unified.Dockerfile`, `x2text.Dockerfile`).
If you want to keep it for future use, consider adding a comment explaining its purpose so reviewers don't treat it as dead configuration.
How can I resolve this? If you propose a fix, please make it concise.Last reviewed commit: "[FEAT] Expose build ..." |
|
Note Unit test generation is a public access feature. Expect some limitations and changes as we gather feedback and continue to improve it. Generating unit tests... This may take up to 20 minutes. |
|
✅ Created PR with unit tests: #1865 |
Fix double period in backend health check log and remove ellipsis from prompt-service init log. Used to test Docker layer cache reuse after app code changes (ext-dependencies and nuitka-compile stages should remain cached). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add ARG VERSION + ENV UNSTRACT_APPS_VERSION at the end of the production stage in all Python Dockerfiles. This makes the build version available at runtime (os.environ["UNSTRACT_APPS_VERSION"]) without affecting layer caching — ARG is scoped to the production stage and placed after all expensive build steps. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Test ResultsSummary
Runner Tests - Full Report
SDK1 Tests - Full Report
|
| ARG VERSION=dev | ||
| ENV UNSTRACT_APPS_VERSION=${VERSION} |
There was a problem hiding this comment.
UNSTRACT_APPS_VERSION env var set but never consumed
UNSTRACT_APPS_VERSION is now baked into every image as a runtime environment variable, but a search of the entire codebase shows zero consumers of this variable — no Python code, Helm chart, or CI script reads it.
Setting an ENV here does still create a Docker layer (unlike LABEL), which means it will be rebuilt every time VERSION changes. The cache-fix goal is fully achieved by simply omitting the ARG/ENV entirely at the end. If the intent is to make the version introspectable at runtime (e.g. for a /health or /version endpoint), that's a good goal — but the consuming code hasn't been added yet.
The same applies to all seven Dockerfiles in this PR (platform.Dockerfile, prompt.Dockerfile, runner.Dockerfile, tool-sidecar.Dockerfile, worker-unified.Dockerfile, x2text.Dockerfile).
If you want to keep it for future use, consider adding a comment explaining its purpose so reviewers don't treat it as dead configuration.
Prompt To Fix With AI
This is a comment left during a code review.
Path: docker/dockerfiles/backend.Dockerfile
Line: 82-83
Comment:
**`UNSTRACT_APPS_VERSION` env var set but never consumed**
`UNSTRACT_APPS_VERSION` is now baked into every image as a runtime environment variable, but a search of the entire codebase shows zero consumers of this variable — no Python code, Helm chart, or CI script reads it.
Setting an `ENV` here does still create a Docker layer (unlike `LABEL`), which means it will be rebuilt every time `VERSION` changes. The cache-fix goal is fully achieved by simply omitting the `ARG`/`ENV` entirely at the end. If the intent is to make the version introspectable at runtime (e.g. for a `/health` or `/version` endpoint), that's a good goal — but the consuming code hasn't been added yet.
The same applies to all seven Dockerfiles in this PR (`platform.Dockerfile`, `prompt.Dockerfile`, `runner.Dockerfile`, `tool-sidecar.Dockerfile`, `worker-unified.Dockerfile`, `x2text.Dockerfile`).
If you want to keep it for future use, consider adding a comment explaining its purpose so reviewers don't treat it as dead configuration.
How can I resolve this? If you propose a fix, please make it concise.
|



What
Remove
ARG VERSIONandLABEL version="${VERSION}"from thebasestage of all 7 Python Dockerfiles.Why
The
ARG VERSIONused inLABEL version="${VERSION}"made every Docker layer's cache key depend on the version tag. Since the tag changes every build (rc.245, rc.246, ...), BuildKit could never match cached layers — 0% cache hit rate for all Python services in CI.The frontend (55% cached) was unaffected because its Dockerfile has no
ARG VERSIONin any build stage.Evidence from build run #23239022262:
67550562986): cache manifest loads successfully but zero layers match67550562697): cache manifest loads and multiple layers hitCACHEDHow
Removed
ARG VERSION=devand theversion="${VERSION}"from theLABELinstruction in thebasestage of all Python Dockerfiles. The version label was unused — nothing in Helm charts, CI, or Kubernetes reads it. The image tag itself (e.g.,backend:rc.246) carries the version.Affected: backend, prompt-service, platform-service, runner, x2text-service, worker-unified, tool-sidecar.
Can this PR break any existing features. If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)
No. The
versionlabel was purely decorative metadata on the Docker image. No code, Helm chart, or deployment reads it. The image tag already carries the version. Local test confirmed: building withVERSION=cache-test-1thenVERSION=cache-test-2produces 100% cache hit (0.0s build).Database Migrations
Env Config
Relevant Docs
Related Issues or PRs
Dependencies Versions
Notes on Testing
Screenshots
N/A
Checklist
I have read and understood the Contribution Guidelines.