Add Newton-only Dockerfile for headless training without Isaac Sim #5150
Kripner wants to merge 1 commit into isaac-sim:develop
Add `docker/Dockerfile.newton` for building a lightweight container with Isaac Lab + Newton physics backend. No Omniverse or Isaac Sim required - suitable for headless RL training on HPC/SLURM clusters.
- CUDA 12.8 base image with Python 3.12
- PyTorch 2.10.0+cu128
- Full Isaac Lab install via `isaaclab.sh -i`
- Singularity/Apptainer compatible (NVIDIA binary placeholders, writable cache directories)
- Build-time import verification
Greptile Summary
This PR adds `docker/Dockerfile.newton`. The overall approach is sound and consistent with the motivation (kit-less training for clusters), but there are several issues relative to the conventions established by the other Dockerfiles in this repo.
Confidence Score: 2/5
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[nvidia/cuda:12.8.1-devel-ubuntu22.04] --> B[apt-get: Python 3.12 + build deps]
B --> C[python3.12 -m venv /opt/isaaclab-venv]
C --> D[pip install torch==2.10.0+cu128\ntorchvision==0.25.0+cu128]
D --> E[COPY . /workspace/isaaclab]
E --> F[isaaclab.sh -i\ninstalls all submodules + RL frameworks\nalso re-runs _ensure_cuda_torch — skipped if version matches]
F --> G[Singularity compat stubs\nnvidia-smi, localtime, etc.]
G --> H[Build-time verification\nimport isaaclab_newton, newton, warp, torch]
H --> I[WORKDIR /workspace/isaaclab]
style D fill:#ffd700,color:#000
style F fill:#ffd700,color:#000
style E fill:#ff8c8c,color:#000
Reviews (1): Last reviewed commit: "Add Newton-only Dockerfile for headless ..."
COPY . ${ISAACLAB_PATH}
RUN chmod +x ${ISAACLAB_PATH}/isaaclab.sh
Missing Windows line-ending fix for shell scripts
All other Dockerfiles in this repo (Dockerfile.base, Dockerfile.curobo) include a step to strip Windows-style \r carriage returns from .sh files immediately after the COPY step:
RUN find ${ISAACLAB_PATH} -type f -name "*.sh" -exec sed -i 's/\r$//' {} +
If this image is built on Windows (Docker Desktop / WSL), isaaclab.sh may contain \r\n line endings, causing the build to fail at the RUN ${ISAACLAB_PATH}/isaaclab.sh -i step with:
bash: /workspace/isaaclab/isaaclab.sh: /usr/bin/env bash^M: bad interpreter: No such file or directory
The fix should be added between the COPY and chmod steps, consistent with the other Dockerfiles.
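The effect of that `find ... sed` step can be reproduced outside Docker. The sketch below is illustrative only: the `strip_crlf` helper and the `demo.sh` file name are hypothetical, and it assumes GNU `sed` (as found in the Ubuntu base image), which understands `\r` and `-i` without a suffix.

```shell
# strip_crlf DIR: remove a trailing carriage return from every line of each
# *.sh file under DIR (same command as in the review comment; GNU sed assumed).
strip_crlf() {
    find "$1" -type f -name "*.sh" -exec sed -i 's/\r$//' {} +
}

# Demo in a throwaway directory with a script saved using CRLF line endings.
demo_dir="$(mktemp -d)"
printf '#!/usr/bin/env bash\r\necho hello\r\n' > "$demo_dir/demo.sh"

strip_crlf "$demo_dir"

# After the fix, no carriage returns remain and the script runs cleanly.
sh "$demo_dir/demo.sh"
```

Running the helper on files that already use LF endings leaves them unchanged, so the step is safe to keep in builds that never touch Windows.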
COPY . ${ISAACLAB_PATH}
RUN chmod +x ${ISAACLAB_PATH}/isaaclab.sh

# ---- Install Isaac Lab packages via isaaclab.sh ----
# This installs all extensions including Newton, RL frameworks, etc.
RUN ${ISAACLAB_PATH}/isaaclab.sh -i
COPY . before heavy install step defeats layer caching
COPY . ${ISAACLAB_PATH} copies the entire repository in a single step, directly before the expensive isaaclab.sh -i installation layer (~2 GB+). Any change to any file in the repo (including scripts, tests, or docs) will invalidate that layer and force a full re-install from scratch.
The other Dockerfiles (Dockerfile.base, Dockerfile.curobo) use a selective, two-stage copy to separate rarely-changing config from frequently-changing source code:
# Stage 1: copy only files that drive the install (rarely change)
COPY isaaclab.* environment.yml pyproject.toml ${ISAACLAB_PATH}/
COPY tools/ ${ISAACLAB_PATH}/tools/
COPY source/ ${ISAACLAB_PATH}/source/
RUN find ${ISAACLAB_PATH} -type f -name "*.sh" -exec sed -i 's/\r$//' {} +
RUN chmod +x ${ISAACLAB_PATH}/isaaclab.sh
RUN ${ISAACLAB_PATH}/isaaclab.sh -i
# Stage 2: copy the rest (frequently changes, but cheap to redo)
COPY . ${ISAACLAB_PATH}
This pattern ensures that iterating on scripts, tests, or documentation does not require re-downloading all Python dependencies.
# ---- Python virtual environment ----
RUN python3.12 -m venv /opt/isaaclab-venv
ENV PATH="/opt/isaaclab-venv/bin:$PATH"
VIRTUAL_ENV not exported alongside PATH
The venv is activated by prepending its bin/ to PATH, but the VIRTUAL_ENV environment variable is never set. Some Python tooling, including the command_install logging in isaaclab/cli/commands/install.py, checks VIRTUAL_ENV to confirm it is running inside a virtual environment (pip itself detects a venv from the interpreter's sys.prefix rather than from this variable). Without VIRTUAL_ENV, tools that inspect it will not detect the venv, which can lead to unexpected behaviour, for example packages being installed into the wrong location if PATH resolution is ever disrupted.
Suggested change (before):
# ---- Python virtual environment ----
RUN python3.12 -m venv /opt/isaaclab-venv
ENV PATH="/opt/isaaclab-venv/bin:$PATH"
Suggested change (after):
RUN python3.12 -m venv /opt/isaaclab-venv
ENV VIRTUAL_ENV=/opt/isaaclab-venv
ENV PATH="/opt/isaaclab-venv/bin:$PATH"
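To see why the PATH-only setup is invisible to tools that look at VIRTUAL_ENV, here is a minimal sketch of such a check. The `in_venv` and `describe_env` helpers are hypothetical and are not taken from Isaac Lab; they only mimic the kind of test a script might perform.

```shell
# in_venv: succeed only if VIRTUAL_ENV is set and non-empty.
in_venv() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

# describe_env: print what a VIRTUAL_ENV-based check would conclude.
describe_env() {
    if in_venv; then
        echo "venv detected: $VIRTUAL_ENV"
    else
        echo "no venv detected"
    fi
}

# PATH-only activation, as in the Dockerfile under review: nothing is detected.
( unset VIRTUAL_ENV; PATH="/opt/isaaclab-venv/bin:$PATH"; describe_env )

# With VIRTUAL_ENV set as well, the same check succeeds.
( VIRTUAL_ENV=/opt/isaaclab-venv; PATH="/opt/isaaclab-venv/bin:$PATH"; describe_env )
```

The first call prints `no venv detected` and the second prints `venv detected: /opt/isaaclab-venv`, even though `python` resolves to the same interpreter in both cases.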
apt -y autoremove && apt clean autoclean && \
rm -rf /var/lib/apt/lists/*
apt used instead of apt-get in non-interactive cleanup
apt is intended for interactive terminal use and may produce warnings (or behave differently) in non-interactive scripts. Docker best-practice is to use apt-get, which has a stable, scriptable CLI. All other Dockerfiles in this repo use apt-get exclusively.
Additionally, apt clean autoclean is not valid apt syntax — apt accepts one subcommand at a time, so autoclean is silently ignored here. The correct invocation uses apt-get:
Suggested change (before):
apt -y autoremove && apt clean autoclean && \
rm -rf /var/lib/apt/lists/*
Suggested change (after):
apt-get -y autoremove && apt-get clean && apt-get autoclean && \
# ---- Build-time verification ----
RUN python -c "\
import isaaclab; \
import isaaclab_newton; \
import newton; \
import warp; \
import torch; \
print(f'torch {torch.__version__}, CUDA available: {torch.cuda.is_available()}'); \
print('All Newton imports OK')"
Build-time CUDA check will normally report CUDA as unavailable
torch.cuda.is_available() needs a visible GPU and driver. A standard docker build does not expose GPUs to RUN steps: the NVIDIA driver libraries are injected by the container runtime only when a container is started with GPU access. In the usual case the print statement will therefore output:
torch 2.10.0+cu128, CUDA available: False
This does not fail the build (since there's no assertion), but it will mislead users scanning build logs into thinking the CUDA setup is incorrect. Consider removing the CUDA availability check from the build-time verification, or adding a comment clarifying that False is expected here and GPU availability is only confirmed at container runtime.
🤖 Isaac Lab Review Bot
Summary
This PR adds docker/Dockerfile.newton — a standalone, lightweight container for headless RL training using Isaac Lab's Newton physics backend without Isaac Sim/Omniverse. It targets CUDA 12.8, Python 3.12, PyTorch 2.10.0, and is designed for Singularity/Apptainer compatibility on HPC/SLURM clusters.
Design Assessment
Is a separate Dockerfile the right approach? Mostly yes, but with caveats.
The existing Dockerfile.base and Dockerfile.curobo are tightly coupled to Isaac Sim — they derive from nvcr.io/nvidia/isaac-sim and assume the Isaac Sim runtime is present. A Newton-only image without Isaac Sim requires a fundamentally different base (nvidia/cuda), different Python setup (venv instead of Isaac Sim's bundled Python), and no _isaac_sim symlink. This justifies a separate Dockerfile rather than a build-arg in Dockerfile.base.
However, the current PR adds the Dockerfile in isolation without integrating it into the existing Docker infrastructure (docker-compose.yaml, container.py tooling). This limits discoverability and means users cannot use the standard container.py build newton workflow.
Architecture Impact
- Low blast radius: Single new file, no modifications to existing code.
- Infrastructure gap: Not wired into `docker-compose.yaml` or `container.py`. Users who look for the Newton image through the existing Docker docs/tooling won't find it.
- Documentation gap: No updates to `docs/source/deployment/docker.rst` or the Newton installation docs to reference this Dockerfile.
- No CI validation: There's no workflow to build-test this Dockerfile, so it can silently break on dependency updates.
Implementation Verdict
Minor fixes needed — The Dockerfile itself is well-structured and functional, but has several issues that should be addressed before merging.
Test Coverage
No CI builds or tests for this Dockerfile. The existing docker/test/test_docker.py does not appear to cover Newton images. The build-time python -c "import ..." verification is a good smoke test but only validates import, not actual simulation.
CI Status
- ✅ `labeler`: passed
- ⏸️ `pre-commit`, `license-check`, `build`, `docs`, `check-links`: awaiting approval (first-time contributor; `author_association: NONE`)
- Mergeable: ✅ Yes
Findings
🔴 High: Copyright year mismatch
Line 1: 2022-2026 — all existing Dockerfiles on develop use 2022-2025. This will likely fail the license/copyright check CI once it's approved to run. Align with the existing convention or confirm the year range has been officially bumped repo-wide.
🟡 Medium: No docker-compose.yaml / container.py integration
The standard Isaac Lab Docker workflow uses python docker/container.py build <profile>. This Dockerfile exists outside that system — there's no .env.newton, no service definition in docker-compose.yaml, and container.py won't recognize newton as a profile. Consider adding a docker-compose.newton.yaml overlay or at minimum a newton service in the main compose file.
🟡 Medium: COPY . cache invalidation
Line 71: COPY . ${ISAACLAB_PATH} copies the entire repo in a single layer. Compare with Dockerfile.base which does a two-stage COPY — first copying only isaaclab.sh, pyproject.toml, tools/, and source/ (for dependency installation), then copying the rest at the very end. This means any file change in the repo (README, docs, scripts) invalidates the Isaac Lab install layer, forcing a full isaaclab.sh -i rebuild (~10+ minutes). Adopt the two-stage COPY pattern:
COPY isaaclab.* environment.yml pyproject.toml ${ISAACLAB_PATH}/
COPY tools/ ${ISAACLAB_PATH}/tools/
COPY source/ ${ISAACLAB_PATH}/source/
RUN chmod +x ${ISAACLAB_PATH}/isaaclab.sh
RUN ${ISAACLAB_PATH}/isaaclab.sh -i
# ... then later:
COPY . ${ISAACLAB_PATH}
🟡 Medium: Missing --install / -i flag consistency
Line 76: Uses isaaclab.sh -i which is correct and equivalent to --install. However, Dockerfile.base uses isaaclab.sh --install. For consistency across Dockerfiles, consider using the long form --install.
🟡 Medium: Missing find ... -exec sed -i 's/\r$//' for line endings
Dockerfile.base (line ~55) runs find ${ISAACLAB_PATH} -type f -name "*.sh" -exec sed -i 's/\r$//' {} + to fix Windows line endings before chmod +x. The Newton Dockerfile skips this. If a contributor on Windows builds this, isaaclab.sh may fail with \r in shebang. Add the same fix.
🟡 Medium: No --mount=type=cache for pip
Line 76: Dockerfile.base uses RUN --mount=type=cache,target=${DOCKER_USER_HOME}/.cache/pip for the install step. The Newton Dockerfile uses --no-cache-dir everywhere and has no pip cache mount. This is fine for single builds but significantly slower for iterative development. Consider adding the cache mount for the isaaclab.sh -i step.
🟢 Low: No ENTRYPOINT or CMD
The Dockerfile has no ENTRYPOINT or CMD. While this is fine for docker run ... bash -c "...", adding a sensible default like CMD ["bash"] (matching Dockerfile.base's compose entrypoint) improves usability.
🟢 Low: Missing apt-get clean in the apt layer
Line 49-51: Uses apt -y autoremove && apt clean autoclean but the convention in Dockerfile.base is apt-get -y autoremove && apt-get clean. Using apt directly triggers a warning about CLI instability. Minor, but worth aligning.
🟢 Low: libgl1-mesa-glx may be unnecessary
Line 45: For a headless training container that explicitly doesn't do rendering, libgl1-mesa-glx (OpenGL) and libusb-1.0-0 (USB device access) seem unnecessary. If Newton truly runs headless without any GL context, removing these reduces image size. Verify whether any transitive dependency (e.g., OpenCV imported by torchvision) actually needs them.
🟢 Low: Hardcoded PyTorch versions
Lines 62-65: PyTorch 2.10.0+cu128 and torchvision 0.25.0+cu128 are hardcoded. The install docs mention torch==2.10.0 (without +cu128 suffix) and use --index-url to resolve the CUDA variant. The +cu128 explicit suffix works but diverges from the documented pattern. Consider using ARG for version pinning to allow easy overrides.
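The override idea can be tried in plain shell before touching the Dockerfile. This is a rough sketch only: the variable names `TORCH_VERSION`, `TORCHVISION_VERSION`, and `CUDA_TAG` and the `torch_install_cmd` helper are hypothetical, and the index URL is an assumption based on PyTorch's public wheel index rather than something stated in the PR. The function prints the command instead of running it.

```shell
# Defaults mirror the versions pinned in the PR; override them via the environment.
TORCH_VERSION="${TORCH_VERSION:-2.10.0}"
TORCHVISION_VERSION="${TORCHVISION_VERSION:-0.25.0}"
CUDA_TAG="${CUDA_TAG:-cu128}"

# torch_install_cmd: print (do not run) the pip command for the selected versions.
# The CUDA variant is chosen by the index URL (assumed), not by a +cuXXX suffix.
torch_install_cmd() {
    echo "pip install torch==${TORCH_VERSION} torchvision==${TORCHVISION_VERSION} --index-url https://download.pytorch.org/whl/${CUDA_TAG}"
}

torch_install_cmd
```

In a Dockerfile the same defaults would be declared with `ARG` and overridden with `--build-arg`, so that a version bump does not require editing the install line itself.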
ℹ️ Info: TZ=$(date +%Z) bakes build-time timezone
Line 99: echo "export TZ=$(date +%Z)" captures the timezone of the build host, not the runtime host. This is inherited from Dockerfile.base so it's a pre-existing pattern, but worth noting that containers built in different timezones will have different TZ values baked in.
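The build-time capture comes from shell quoting: inside double quotes, `$(date +%Z)` is expanded when the `echo` runs, not when the resulting file is later sourced. The sketch below contrasts the two behaviours. The helper names are hypothetical, and whether deferring the lookup is actually desirable for this image is a separate question.

```shell
# write_baked FILE: the command substitution runs now, at write time,
# so FILE contains a fixed timezone abbreviation.
write_baked() {
    echo "export TZ=$(date +%Z)" > "$1"
}

# write_deferred FILE: the escaped \$ keeps the substitution as literal text,
# so it only runs later, when FILE is sourced.
write_deferred() {
    echo "export TZ=\$(date +%Z)" > "$1"
}

profile="$(mktemp)"
write_baked "$profile"
cat "$profile"      # a fixed abbreviation taken from the current host

write_deferred "$profile"
cat "$profile"      # the literal text: export TZ=$(date +%Z)
```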
Overall: This is a good, well-motivated addition. The Dockerfile is functional and the Singularity compatibility section shows the author understands the HPC use case. The main asks are: (1) fix the copyright year, (2) adopt the two-stage COPY for cache efficiency, (3) add the line-ending fix, and (4) consider docker-compose integration. Everything else is polish.
@@ -0,0 +1,104 @@
# Copyright (c) 2022-2026, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
🔴 Copyright year mismatch: All existing Dockerfiles on develop use 2022-2025. This file has 2022-2026. This will likely fail the license/copyright CI check once approved. Please align with the repo convention.
COPY . ${ISAACLAB_PATH}
RUN chmod +x ${ISAACLAB_PATH}/isaaclab.sh

# ---- Install Isaac Lab packages via isaaclab.sh ----
🟡 Cache invalidation: This copies the entire repo in one shot. Dockerfile.base uses a two-stage pattern — first copying only files needed for dependency installation (isaaclab.*, pyproject.toml, tools/, source/), running isaaclab.sh --install, then copying the rest. With the current single COPY, any change to any file (README, docs, scripts) busts the cache and forces a full isaaclab.sh -i rebuild.
Suggested:
COPY isaaclab.* environment.yml pyproject.toml ${ISAACLAB_PATH}/
COPY tools/ ${ISAACLAB_PATH}/tools/
COPY source/ ${ISAACLAB_PATH}/source/
RUN find ${ISAACLAB_PATH} -type f -name "*.sh" -exec sed -i 's/\r$//' {} +
RUN chmod +x ${ISAACLAB_PATH}/isaaclab.sh
RUN ${ISAACLAB_PATH}/isaaclab.sh -i
# Copy remaining files (scripts, docs, etc.)
COPY . ${ISAACLAB_PATH}

RUN chmod +x ${ISAACLAB_PATH}/isaaclab.sh

# ---- Install Isaac Lab packages via isaaclab.sh ----
# This installs all extensions including Newton, RL frameworks, etc.
🟡 Missing Windows line-ending fix: Dockerfile.base runs find ${ISAACLAB_PATH} -type f -name "*.sh" -exec sed -i 's/\r$//' {} + before chmod +x. Without this, Windows contributors who have core.autocrlf=true will get a broken isaaclab.sh (bash will choke on \r in the shebang). Add the same line-ending fix before the chmod.
RUN ${ISAACLAB_PATH}/isaaclab.sh -i

# ---- Singularity/Apptainer compatibility ----
RUN touch /bin/nvidia-smi && \
🟡 Missing pip cache mount: Dockerfile.base uses RUN --mount=type=cache,target=/root/.cache/pip for the install step. Consider:
RUN --mount=type=cache,target=/root/.cache/pip ${ISAACLAB_PATH}/isaaclab.sh -i
Also, for consistency with Dockerfile.base, prefer --install (long form) over -i.
libgl1-mesa-glx \
libusb-1.0-0 \
ncurses-term \
wget && \
🟢 Minor: apt -y autoremove && apt clean autoclean triggers a stability warning from apt. Use apt-get -y autoremove && apt-get clean && rm -rf /var/lib/apt/lists/* to match the Dockerfile.base convention. (The rm -rf step is already present; autoclean is redundant once apt-get clean has run.)
python3.12-venv \
build-essential \
cmake \
git \
🟢 Consider: For a headless training container, do you actually need libgl1-mesa-glx (OpenGL) and libusb-1.0-0 (USB)? These add image size for what's meant to be a lightweight container. If they're transitive deps (e.g., OpenCV via torchvision), keep them but add a comment explaining why.
Summary
- `docker/Dockerfile.newton` - a lightweight container for Isaac Lab + Newton physics (no Omniverse/Isaac Sim)
- `isaaclab.sh -i` install
Motivation
The Newton physics integration enables kit-less Isaac Lab, but there's no container for it yet. This Dockerfile gives users a one-command path to a working training environment: