Add Newton-only Dockerfile for headless training without Isaac Sim #5150

Open
Kripner wants to merge 1 commit into isaac-sim:develop from Kripner:add/newton-dockerfile

Kripner commented Apr 2, 2026

Summary

  • Add docker/Dockerfile.newton - a lightweight container for Isaac Lab + Newton physics (no Omniverse/Isaac Sim)
  • CUDA 12.8 base, Python 3.12, PyTorch 2.10.0, full isaaclab.sh -i install
  • Singularity/Apptainer compatible for HPC/SLURM clusters

Motivation

The Newton physics integration enables kit-less Isaac Lab, but there's no container for it yet. This Dockerfile gives users a one-command path to a working training environment:

docker build -t isaac-lab-newton -f docker/Dockerfile.newton .
docker run --rm --gpus all isaac-lab-newton bash -c \
  "python scripts/reinforcement_learning/rsl_rl/train.py \
   --task Isaac-Reach-Franka-v0 --num_envs 4096 --headless \
   env.sim.physics=newton"

Add `docker/Dockerfile.newton` for building a lightweight container
with Isaac Lab + Newton physics backend. No Omniverse or Isaac Sim
required - suitable for headless RL training on HPC/SLURM clusters.

- CUDA 12.8 base image with Python 3.12
- PyTorch 2.10.0+cu128
- Full Isaac Lab install via `isaaclab.sh -i`
- Singularity/Apptainer compatible (NVIDIA binary placeholders,
  writable cache directories)
- Build-time import verification

greptile-apps bot commented Apr 2, 2026

Greptile Summary

This PR adds docker/Dockerfile.newton, a new lightweight container image targeting headless HPC/SLURM training with Newton physics, without the full Omniverse/Isaac Sim stack. The Dockerfile uses a CUDA 12.8 Ubuntu 22.04 base, installs Python 3.12 from the deadsnakes PPA into a virtual environment, pre-installs PyTorch 2.10.0+cu128, and then delegates the full Isaac Lab installation to isaaclab.sh -i.

The overall approach is sound and consistent with the motivation (kit-less training for clusters), but there are several issues relative to the conventions established by Dockerfile.base and Dockerfile.curobo:

  • Missing Windows line-ending fix (P1): The find ... sed -i 's/\r$//' step present in all other Dockerfiles is absent. On Windows-built contexts, isaaclab.sh will have \r\n endings and the build will fail with a "bad interpreter" error.
  • COPY . before heavy install defeats layer caching (P2): All other Dockerfiles stage the copy to separate install-driving files from the rest of the source tree, so iterating on code doesn't re-download ~2 GB of dependencies. This Dockerfile copies everything upfront.
  • VIRTUAL_ENV not set (P2): The virtual environment is activated only via PATH, not via the VIRTUAL_ENV env var, which tooling (including the isaaclab CLI and pip) relies on to confirm venv context.
  • Non-standard apt usage in cleanup (P2): Line 51 uses apt (interactive) instead of apt-get (scripting), and apt clean autoclean is not valid apt syntax — autoclean is silently dropped.
  • Misleading build-time CUDA check (P2): torch.cuda.is_available() will always print False at build time since GPUs are not available during docker build, which will confuse users reading build logs.

Confidence Score: 2/5

  • Not safe to merge as-is: the missing Windows line-ending fix is a known build-breaking pattern in this repo and several Docker best-practice regressions exist relative to sibling Dockerfiles.
  • Score of 2 reflects one likely build-breaking issue (missing carriage-return strip for Windows builds) plus multiple meaningful deviations from the conventions established by the other Dockerfiles in the repo (COPY ordering, VIRTUAL_ENV, apt-get vs apt). The core approach and version choices are sound, but these issues need to be addressed before the file is ready to land.
  • docker/Dockerfile.newton — the only changed file; all issues are concentrated here.

Important Files Changed

| Filename | Overview |
| --- | --- |
| docker/Dockerfile.newton | New Newton-only Dockerfile for headless HPC training; has a P1 issue (missing Windows line-ending fix that could break builds) and several P2 concerns including COPY ordering that defeats layer caching, missing VIRTUAL_ENV export, non-standard apt usage, and a misleading build-time CUDA check. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[nvidia/cuda:12.8.1-devel-ubuntu22.04] --> B[apt-get: Python 3.12 + build deps]
    B --> C[python3.12 -m venv /opt/isaaclab-venv]
    C --> D[pip install torch==2.10.0+cu128\ntorchvision==0.25.0+cu128]
    D --> E[COPY . /workspace/isaaclab]
    E --> F[isaaclab.sh -i\ninstalls all submodules + RL frameworks\nalso re-runs _ensure_cuda_torch — skipped if version matches]
    F --> G[Singularity compat stubs\nnvidia-smi, localtime, etc.]
    G --> H[Build-time verification\nimport isaaclab_newton, newton, warp, torch]
    H --> I[WORKDIR /workspace/isaaclab]

    style D fill:#ffd700,color:#000
    style F fill:#ffd700,color:#000
    style E fill:#ff8c8c,color:#000

Reviews (1): Last reviewed commit: "Add Newton-only Dockerfile for headless ..."

Comment on lines +68 to +69
COPY . ${ISAACLAB_PATH}
RUN chmod +x ${ISAACLAB_PATH}/isaaclab.sh

P1 Missing Windows line-ending fix for shell scripts

All other Dockerfiles in this repo (Dockerfile.base, Dockerfile.curobo) include a step to strip Windows-style \r carriage returns from .sh files immediately after the COPY step:

RUN find ${ISAACLAB_PATH} -type f -name "*.sh" -exec sed -i 's/\r$//' {} +

If this image is built on Windows (Docker Desktop / WSL), isaaclab.sh may contain \r\n line endings, causing the build to fail at the RUN ${ISAACLAB_PATH}/isaaclab.sh -i step with:

bash: /workspace/isaaclab/isaaclab.sh: /usr/bin/env bash^M: bad interpreter: No such file or directory

The fix should be added between the COPY and chmod steps, consistent with the other Dockerfiles.
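
To see what the line-ending fix does, the following sketch reproduces the problem on a throwaway script and applies the same `find ... sed` command to a temporary directory. The file name `demo.sh` and the helper `count_cr` are made up for illustration, and the `sed -i` form assumes GNU sed, as shipped in the Ubuntu base image.

```shell
# Create a throwaway script with Windows (CRLF) line endings.
workdir="$(mktemp -d)"
printf '#!/usr/bin/env bash\r\necho ok\r\n' > "$workdir/demo.sh"

count_cr() {
    # Count carriage-return bytes in the file given as $1.
    tr -cd '\r' < "$1" | wc -c | tr -d ' '
}

before="$(count_cr "$workdir/demo.sh")"

# Same command the review quotes from the other Dockerfiles,
# pointed at the temp directory instead of ${ISAACLAB_PATH}.
find "$workdir" -type f -name "*.sh" -exec sed -i 's/\r$//' {} +

after="$(count_cr "$workdir/demo.sh")"
echo "carriage returns before: $before, after: $after"
```

With GNU sed the count should drop from 2 to 0, after which the shebang line no longer carries a trailing `\r`.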

Comment on lines +68 to +73
COPY . ${ISAACLAB_PATH}
RUN chmod +x ${ISAACLAB_PATH}/isaaclab.sh

# ---- Install Isaac Lab packages via isaaclab.sh ----
# This installs all extensions including Newton, RL frameworks, etc.
RUN ${ISAACLAB_PATH}/isaaclab.sh -i

P2 COPY . before heavy install step defeats layer caching

COPY . ${ISAACLAB_PATH} copies the entire repository in a single step, directly before the expensive isaaclab.sh -i installation layer (~2 GB+). Any change to any file in the repo (including scripts, tests, or docs) will invalidate that layer and force a full re-install from scratch.

The other Dockerfiles (Dockerfile.base, Dockerfile.curobo) use a selective, two-stage copy to separate rarely-changing config from frequently-changing source code:

# Stage 1: copy only files that drive the install (rarely change)
COPY isaaclab.* environment.yml pyproject.toml ${ISAACLAB_PATH}/
COPY tools/ ${ISAACLAB_PATH}/tools/
COPY source/ ${ISAACLAB_PATH}/source/

RUN find ${ISAACLAB_PATH} -type f -name "*.sh" -exec sed -i 's/\r$//' {} +
RUN chmod +x ${ISAACLAB_PATH}/isaaclab.sh
RUN ${ISAACLAB_PATH}/isaaclab.sh -i

# Stage 2: copy the rest (frequently changes, but cheap to redo)
COPY . ${ISAACLAB_PATH}

This pattern ensures that iterating on scripts, tests, or documentation does not require re-downloading all Python dependencies.

Comment on lines +54 to +56
# ---- Python virtual environment ----
RUN python3.12 -m venv /opt/isaaclab-venv
ENV PATH="/opt/isaaclab-venv/bin:$PATH"

P2 VIRTUAL_ENV not exported alongside PATH

The venv is activated by prepending its bin/ to PATH, but the VIRTUAL_ENV environment variable is never set. Several Python tools (including the command_install logging in isaaclab/cli/commands/install.py and pip itself) check VIRTUAL_ENV to confirm they are operating inside a virtual environment. Without it, tools that inspect this variable will not detect the venv, which can lead to unexpected behaviour (e.g., packages being installed into the wrong location if PATH resolution is ever disrupted).

Suggested change
  # ---- Python virtual environment ----
  RUN python3.12 -m venv /opt/isaaclab-venv
+ ENV VIRTUAL_ENV=/opt/isaaclab-venv
  ENV PATH="/opt/isaaclab-venv/bin:$PATH"
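
A quick way to tell the two activation signals apart inside a container is a small shell helper like the one below. This is only a sketch: `venv_report` and `VENV_DIR` are hypothetical names, and the default path is the one used in the Dockerfile under review.

```shell
# Hypothetical helper: report whether the venv is visible via VIRTUAL_ENV and via PATH.
VENV_DIR="${VENV_DIR:-/opt/isaaclab-venv}"   # venv path from the Dockerfile under review

venv_report() {
    # Signal 1: the VIRTUAL_ENV variable that some tools inspect.
    if [ -n "${VIRTUAL_ENV:-}" ]; then
        echo "VIRTUAL_ENV=${VIRTUAL_ENV}"
    else
        echo "VIRTUAL_ENV is unset"
    fi
    # Signal 2: whether the venv's bin/ directory is on PATH.
    case ":${PATH}:" in
        *":${VENV_DIR}/bin:"*) echo "PATH includes ${VENV_DIR}/bin" ;;
        *)                     echo "PATH lacks ${VENV_DIR}/bin" ;;
    esac
}
```

Running `venv_report` in an image built from the current Dockerfile would be expected to show the PATH entry but no VIRTUAL_ENV, which is the gap this comment describes.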

Comment on lines +51 to +52
apt -y autoremove && apt clean autoclean && \
rm -rf /var/lib/apt/lists/*

P2 apt used instead of apt-get in non-interactive cleanup

apt is intended for interactive terminal use and may produce warnings (or behave differently) in non-interactive scripts. Docker best-practice is to use apt-get, which has a stable, scriptable CLI. All other Dockerfiles in this repo use apt-get exclusively.

Additionally, apt clean autoclean is not valid apt syntax — apt accepts one subcommand at a time, so autoclean is silently ignored here. The correct invocation uses apt-get:

Suggested change
- apt -y autoremove && apt clean autoclean && \
+ apt-get -y autoremove && apt-get clean && apt-get autoclean && \
  rm -rf /var/lib/apt/lists/*

Comment on lines +89 to +97
# ---- Build-time verification ----
RUN python -c "\
import isaaclab; \
import isaaclab_newton; \
import newton; \
import warp; \
import torch; \
print(f'torch {torch.__version__}, CUDA available: {torch.cuda.is_available()}'); \
print('All Newton imports OK')"

P2 Build-time CUDA check will always report CUDA as unavailable

torch.cuda.is_available() needs a visible GPU and the NVIDIA driver libraries. docker build does not expose GPUs by default (--gpus all is a docker run option, and the driver libraries are injected by the NVIDIA container runtime only when a container starts). This means the print statement will normally output:

torch 2.10.0+cu128, CUDA available: False

This does not fail the build (since there's no assertion), but it will mislead users scanning build logs into thinking the CUDA setup is incorrect. Consider removing the CUDA availability check from the build-time verification, or adding a comment clarifying that False is expected here and GPU availability is only confirmed at container runtime.
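
One way to move the check to container runtime is sketched below. The function name `runtime_gpu_check` and the `DOCKER`/`IMAGE` overrides are hypothetical; the image tag and the `--gpus all` flag are the ones from the PR description.

```shell
# Hypothetical helper: verify CUDA visibility when the container runs, not when it is built.
# DOCKER and IMAGE can be overridden; defaults follow the PR description.
runtime_gpu_check() {
    "${DOCKER:-docker}" run --rm --gpus all "${IMAGE:-isaac-lab-newton}" \
        python -c "import torch, sys; sys.exit(0 if torch.cuda.is_available() else 1)"
}
```

On a GPU host, `runtime_gpu_check && echo "CUDA OK"` would exit non-zero if PyTorch cannot see a device, which gives a hard failure signal that the build-time print cannot provide.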

isaaclab-review-bot bot left a comment

🤖 Isaac Lab Review Bot

Summary

This PR adds docker/Dockerfile.newton — a standalone, lightweight container for headless RL training using Isaac Lab's Newton physics backend without Isaac Sim/Omniverse. It targets CUDA 12.8, Python 3.12, PyTorch 2.10.0, and is designed for Singularity/Apptainer compatibility on HPC/SLURM clusters.

Design Assessment

Is a separate Dockerfile the right approach? Mostly yes, but with caveats.

The existing Dockerfile.base and Dockerfile.curobo are tightly coupled to Isaac Sim — they derive from nvcr.io/nvidia/isaac-sim and assume the Isaac Sim runtime is present. A Newton-only image without Isaac Sim requires a fundamentally different base (nvidia/cuda), different Python setup (venv instead of Isaac Sim's bundled Python), and no _isaac_sim symlink. This justifies a separate Dockerfile rather than a build-arg in Dockerfile.base.

However, the current PR adds the Dockerfile in isolation without integrating it into the existing Docker infrastructure (docker-compose.yaml, container.py tooling). This limits discoverability and means users cannot use the standard container.py build newton workflow.

Architecture Impact

  • Low blast radius: Single new file, no modifications to existing code.
  • Infrastructure gap: Not wired into docker-compose.yaml or container.py. Users who discover the Newton image through the existing Docker docs/tooling won't find it.
  • Documentation gap: No updates to docs/source/deployment/docker.rst or the Newton installation docs to reference this Dockerfile.
  • No CI validation: There's no workflow to build-test this Dockerfile, so it can silently break on dependency updates.

Implementation Verdict

Minor fixes needed — The Dockerfile itself is well-structured and functional, but has several issues that should be addressed before merging.

Test Coverage

No CI builds or tests for this Dockerfile. The existing docker/test/test_docker.py does not appear to cover Newton images. The build-time python -c "import ..." verification is a good smoke test but only validates import, not actual simulation.

CI Status

  • labeler: passed
  • ⏸️ pre-commit, license-check, build, docs, check-links: awaiting approval (first-time contributor; author_association: NONE)
  • Mergeable: ✅ Yes

Findings

🔴 High: Copyright year mismatch

Line 1: 2022-2026 — all existing Dockerfiles on develop use 2022-2025. This will likely fail the license/copyright check CI once it's approved to run. Align with the existing convention or confirm the year range has been officially bumped repo-wide.

🟡 Medium: No docker-compose.yaml / container.py integration

The standard Isaac Lab Docker workflow uses python docker/container.py build <profile>. This Dockerfile exists outside that system — there's no .env.newton, no service definition in docker-compose.yaml, and container.py won't recognize newton as a profile. Consider adding a docker-compose.newton.yaml overlay or at minimum a newton service in the main compose file.

🟡 Medium: COPY . cache invalidation

Line 71: COPY . ${ISAACLAB_PATH} copies the entire repo in a single layer. Compare with Dockerfile.base which does a two-stage COPY — first copying only isaaclab.sh, pyproject.toml, tools/, and source/ (for dependency installation), then copying the rest at the very end. This means any file change in the repo (README, docs, scripts) invalidates the Isaac Lab install layer, forcing a full isaaclab.sh -i rebuild (~10+ minutes). Adopt the two-stage COPY pattern:

COPY isaaclab.* environment.yml pyproject.toml ${ISAACLAB_PATH}/
COPY tools/ ${ISAACLAB_PATH}/tools/
COPY source/ ${ISAACLAB_PATH}/source/
RUN chmod +x ${ISAACLAB_PATH}/isaaclab.sh
RUN ${ISAACLAB_PATH}/isaaclab.sh -i
# ... then later:
COPY . ${ISAACLAB_PATH}

🟡 Medium: Inconsistent -i / --install flag form

Line 76: Uses isaaclab.sh -i which is correct and equivalent to --install. However, Dockerfile.base uses isaaclab.sh --install. For consistency across Dockerfiles, consider using the long form --install.

🟡 Medium: Missing find ... -exec sed -i 's/\r$//' for line endings

Dockerfile.base (line ~55) runs find ${ISAACLAB_PATH} -type f -name "*.sh" -exec sed -i 's/\r$//' {} + to fix Windows line endings before chmod +x. The Newton Dockerfile skips this. If a contributor on Windows builds this, isaaclab.sh may fail with \r in shebang. Add the same fix.

🟡 Medium: No --mount=type=cache for pip

Line 76: Dockerfile.base uses RUN --mount=type=cache,target=${DOCKER_USER_HOME}/.cache/pip for the install step. The Newton Dockerfile uses --no-cache-dir everywhere and has no pip cache mount. This is fine for single builds but significantly slower for iterative development. Consider adding the cache mount for the isaaclab.sh -i step.

🟢 Low: No ENTRYPOINT or CMD

The Dockerfile has no ENTRYPOINT or CMD. While this is fine for docker run ... bash -c "...", adding a sensible default like CMD ["bash"] (matching Dockerfile.base's compose entrypoint) improves usability.

🟢 Low: apt used instead of apt-get in the apt layer

Lines 49-51: Uses apt -y autoremove && apt clean autoclean but the convention in Dockerfile.base is apt-get -y autoremove && apt-get clean. Using apt directly triggers a warning that its CLI is not stable for scripts. Minor, but worth aligning.

🟢 Low: libgl1-mesa-glx may be unnecessary

Line 45: For a headless training container that explicitly doesn't do rendering, libgl1-mesa-glx (OpenGL) and libusb-1.0-0 (USB device access) seem unnecessary. If Newton truly runs headless without any GL context, removing these reduces image size. Verify whether any transitive dependency (e.g., OpenCV imported by torchvision) actually needs them.

🟢 Low: Hardcoded PyTorch versions

Lines 62-65: PyTorch 2.10.0+cu128 and torchvision 0.25.0+cu128 are hardcoded. The install docs mention torch==2.10.0 (without +cu128 suffix) and use --index-url to resolve the CUDA variant. The +cu128 explicit suffix works but diverges from the documented pattern. Consider using ARG for version pinning to allow easy overrides.
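
As a rough illustration of overridable version pins, the sketch below uses shell variables with defaults in the way Dockerfile `ARG` values would be used. The variable names and the `torch_install_args` helper are hypothetical; the defaults are the versions named in this PR, and the index URL is the standard PyTorch wheel index for the cu128 variant.

```shell
# Hypothetical variable names; defaults mirror the versions pinned in this PR.
TORCH_VERSION="${TORCH_VERSION:-2.10.0}"
TORCHVISION_VERSION="${TORCHVISION_VERSION:-0.25.0}"
CUDA_TAG="${CUDA_TAG:-cu128}"

torch_install_args() {
    # Emit pip arguments; the CUDA variant is chosen by the index URL
    # rather than by a +cu128 suffix on the version.
    echo "install torch==${TORCH_VERSION} torchvision==${TORCHVISION_VERSION}" \
         "--index-url https://download.pytorch.org/whl/${CUDA_TAG}"
}

echo "pip $(torch_install_args)"
```

In a Dockerfile the same three values could be declared as `ARG`s and overridden with `docker build --build-arg`, so a version bump would not require editing the install line.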

ℹ️ Info: TZ=$(date +%Z) bakes build-time timezone

Line 99: echo "export TZ=$(date +%Z)" is expanded while the image is being built, so it records whatever timezone the build environment reports, not the timezone of the runtime host. This is inherited from Dockerfile.base so it's a pre-existing pattern, but worth noting that the TZ value is fixed at build time.
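
The build-time capture comes down to shell quoting, which the following sketch shows in isolation. The variable names `baked` and `deferred` are illustrative only.

```shell
# Double quotes: $(date +%Z) is expanded immediately, so a literal zone name is stored.
baked="export TZ=$(date +%Z)"

# Single quotes: the command substitution is stored as text and would only
# run later, when the resulting line is sourced by a shell.
deferred='export TZ=$(date +%Z)'

echo "baked:    $baked"
echo "deferred: $deferred"
```

The first form is what a `RUN echo "..."` line produces during the build; the second would defer evaluation to whenever the generated file is sourced at container runtime.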


Overall: This is a good, well-motivated addition. The Dockerfile is functional and the Singularity compatibility section shows the author understands the HPC use case. The main asks are: (1) fix the copyright year, (2) adopt the two-stage COPY for cache efficiency, (3) add the line-ending fix, and (4) consider docker-compose integration. Everything else is polish.

@@ -0,0 +1,104 @@
# Copyright (c) 2022-2026, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).

🔴 Copyright year mismatch: All existing Dockerfiles on develop use 2022-2025. This file has 2022-2026. This will likely fail the license/copyright CI check once approved. Please align with the repo convention.

COPY . ${ISAACLAB_PATH}
RUN chmod +x ${ISAACLAB_PATH}/isaaclab.sh

# ---- Install Isaac Lab packages via isaaclab.sh ----

🟡 Cache invalidation: This copies the entire repo in one shot. Dockerfile.base uses a two-stage pattern — first copying only files needed for dependency installation (isaaclab.*, pyproject.toml, tools/, source/), running isaaclab.sh --install, then copying the rest. With the current single COPY, any change to any file (README, docs, scripts) busts the cache and forces a full isaaclab.sh -i rebuild.

Suggested:

COPY isaaclab.* environment.yml pyproject.toml ${ISAACLAB_PATH}/
COPY tools/ ${ISAACLAB_PATH}/tools/
COPY source/ ${ISAACLAB_PATH}/source/
RUN find ${ISAACLAB_PATH} -type f -name "*.sh" -exec sed -i 's/\r$//' {} +
RUN chmod +x ${ISAACLAB_PATH}/isaaclab.sh
RUN ${ISAACLAB_PATH}/isaaclab.sh -i
# Copy remaining files (scripts, docs, etc.)
COPY . ${ISAACLAB_PATH}

RUN chmod +x ${ISAACLAB_PATH}/isaaclab.sh

# ---- Install Isaac Lab packages via isaaclab.sh ----
# This installs all extensions including Newton, RL frameworks, etc.

🟡 Missing Windows line-ending fix: Dockerfile.base runs find ${ISAACLAB_PATH} -type f -name "*.sh" -exec sed -i 's/\r$//' {} + before chmod +x. Without this, Windows contributors who have core.autocrlf=true will get a broken isaaclab.sh (bash will choke on \r in the shebang). Add the same line-ending fix before the chmod.

RUN ${ISAACLAB_PATH}/isaaclab.sh -i

# ---- Singularity/Apptainer compatibility ----
RUN touch /bin/nvidia-smi && \

🟡 Missing pip cache mount: Dockerfile.base uses RUN --mount=type=cache,target=/root/.cache/pip for the install step. Consider:

RUN --mount=type=cache,target=/root/.cache/pip ${ISAACLAB_PATH}/isaaclab.sh -i

Also, for consistency with Dockerfile.base, prefer --install (long form) over -i.

libgl1-mesa-glx \
libusb-1.0-0 \
ncurses-term \
wget && \

🟢 Minor: apt -y autoremove && apt clean autoclean triggers a stability warning from apt. Use apt-get -y autoremove && apt-get clean && rm -rf /var/lib/apt/lists/* to match the Dockerfile.base convention. (The rm -rf is already present; autoclean is unnecessary once apt-get clean has run.)

python3.12-venv \
build-essential \
cmake \
git \

🟢 Consider: For a headless training container, do you actually need libgl1-mesa-glx (OpenGL) and libusb-1.0-0 (USB)? These add image size for what's meant to be a lightweight container. If they're transitive deps (e.g., OpenCV via torchvision), keep them but add a comment explaining why.
