Skip to content

Feature/leapp export integration#5105

Open
frlai wants to merge 16 commits intoisaac-sim:developfrom
frlai:feature/leapp_export_integration
Open

Feature/leapp export integration#5105
frlai wants to merge 16 commits intoisaac-sim:developfrom
frlai:feature/leapp_export_integration

Conversation

@frlai
Copy link
Copy Markdown

@frlai frlai commented Mar 24, 2026

Description

FEATURE: adds new LEAPP export functionality that targets manged environments that using rsl_rl. Policies can be exported end to end. Also adds a new direct deployment environment to deploy exported policies, bypassing all the manager classes.

Type of change

  • New feature (non-breaking change which adds functionality)

Checklist

  • I have read and understood the contribution guidelines
  • I have run the pre-commit checks with ./isaaclab.sh --format
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the changelog and the corresponding version in the extension's config/extension.toml file
  • I have added my name to the CONTRIBUTORS.md or my name already exists there

Additional TODOs:

  • Add documentation
  • Add integration into direct environment documentation [DONE]
  • Edit tests to work with without pretrained checkpoints
  • Type-hinting and general function name update

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 24, 2026

Greptile Summary

This PR introduces end-to-end LEAPP export support for RSL-RL manager-based environments and a new DirectDeploymentEnv that runs exported policies in simulation without Isaac Lab's training managers. It also adds pervasive @leapp_tensor_semantics annotations across articulation data, sensor data, and command terms so that LEAPP can automatically wire I/O during export.

Key changes and concerns:

  • CircularBuffer regression (circular_buffer.py): The refactor to support export tracing removes an important guard — calling __getitem__ when _num_pushes == 0 (e.g. right after reset() before the first append()) produces index_in_buffer = max_length, an out-of-bounds tensor index. The docstring still documents RuntimeError: If the buffer is empty but the check is gone. Additionally, _append now calls torch.cat every step, creating a fresh (max_length, batch_size, …) tensor each time instead of an in-place pointer write — a potential throughput regression for tasks with long history windows.
  • deploy.py resource leak: env.close() is inside the try block after the main loop; it is never called when KeyboardInterrupt interrupts env.step(). A finally block is needed.
  • export.py typo: Variable vilidate (line 289) is passed directly as validate=vilidate to leapp.compile_graph().
  • DirectDeploymentEnv: Well-structured; the I/O resolver correctly handles state:, command:, and write: connection strings. The hard-coded cfg.scene.num_envs = 1 in __init__ is consistent with single-env deployment intent.
  • LEAPP semantic annotations: The @leapp_tensor_semantics decorator is cleanly applied across the entire sensor/asset/command API surface; MRO-based semantic resolution in the proxy layer is correct.
  • The PR author notes that documentation, tests without pretrained checkpoints, and type-hinting are still outstanding TODOs.

Confidence Score: 3/5

  • Not yet safe to merge — the CircularBuffer out-of-bounds regression affects existing training/play workflows, not just the new export path.
  • The LEAPP export infrastructure and DirectDeploymentEnv are well-engineered, but the CircularBuffer refactor introduces a concrete out-of-bounds index regression (index_in_buffer = max_length when _num_pushes == 0) that affects any task using observation history — a broad regression outside the new feature scope. The deploy.py resource leak and export.py typo are minor but should be fixed before merge. The PR also self-notes missing documentation, tests without pretrained checkpoints, and type-hinting as outstanding TODOs.
  • source/isaaclab/isaaclab/utils/buffers/circular_buffer.py needs the empty-buffer guard restored; scripts/reinforcement_learning/deploy.py needs a finally block for env.close().

Important Files Changed

Filename Overview
scripts/reinforcement_learning/deploy.py New deployment script for LEAPP-exported policies; env.close() is never called on KeyboardInterrupt due to missing finally block.
scripts/reinforcement_learning/rsl_rl/export.py New RSL-RL LEAPP export script; contains a typo (vilidate) passed directly to leapp.compile_graph(validate=…), and disables TorchScript globally before imports which may surprise downstream users.
source/isaaclab/isaaclab/envs/direct_deployment_env.py New environment that wires LEAPP InferenceManager to scene entities; well-structured I/O resolution, but hardcodes num_envs = 1 without validation against the loaded LEAPP model.
source/isaaclab/isaaclab/utils/buffers/circular_buffer.py Refactored to use torch.cat in _append (tensor allocation every step) and removed the empty-buffer guard in __getitem__, introducing a potential out-of-bounds index when _num_pushes == 0.
source/isaaclab/isaaclab/utils/leapp/export_annotator.py Complex proxy-based patcher for annotation; disables training managers cleanly, uses a unified dedup cache for obs/action tensors, and correctly patches history buffers as LEAPP state.
source/isaaclab/isaaclab/utils/leapp/proxy.py Proxy classes for transparent observation/action annotation; MRO-walking semantic resolution is correct and the dedup cache key (id(real_data), name) is appropriate.
source/isaaclab/isaaclab/utils/leapp/leapp_semantics.py Clean semantic metadata decorator and element-name resolver helpers; no issues found.
source/isaaclab/isaaclab/sensors/ray_caster/ray_caster_data.py Fields converted to private with public properties for LEAPP annotation; minor: original quat_w docstring said "(x, y, z, w)" but new docstring says "(w, x, y, z)" — verify actual data convention matches QUAT_WXYZ.
source/isaaclab_rl/test/test_rsl_rl_export_flow.py Integration tests launch export.py as a subprocess per task; gracefully skips missing checkpoints; no issues found.
source/isaaclab/isaaclab/assets/articulation/base_articulation_data.py LEAPP semantic decorators added to all abstract properties; type annotations updated from list[str] = None to `list[str]
source/isaaclab/isaaclab/managers/manager_term_cfg.py Added cmd_kind and element_names fields to CommandTermCfg for deployment metadata; inline comments explain intent but formal docstrings would be cleaner for a public config class.

Sequence Diagram

sequenceDiagram
    participant U as User / Script
    participant EX as export.py
    participant EA as ExportPatcher
    participant ENV as ManagerBasedRLEnv
    participant LP as leapp (annotate)
    participant DD as DirectDeploymentEnv
    participant IM as InferenceManager (LEAPP)

    Note over U,LP: Export Flow
    U->>EX: run export.py --task --use_pretrained_checkpoint
    EX->>EA: patch_env_for_export(env, task_name, method)
    EA->>ENV: patch obs_manager, action_manager, history buffers
    EX->>LP: leapp.start(task_name)
    EX->>ENV: env.reset()
    loop validation_steps
        EX->>ENV: policy(obs)  [traced through annotated proxies]
        ENV-->>LP: annotate.input_tensors(state reads)
        ENV-->>LP: annotate.output_tensors(action writes)
        EX->>ENV: env.step(actions)
    end
    EX->>LP: leapp.stop()
    EX->>LP: leapp.compile_graph() → .onnx + .yaml

    Note over U,IM: Deployment Flow
    U->>DD: DirectDeploymentEnv(cfg, leapp.yaml)
    DD->>IM: InferenceManager(leapp.yaml)
    DD->>DD: _resolve_io() [parse state:/command:/write: connections]
    loop simulation running
        DD->>DD: _read_inputs() [scene.data.property / command_manager]
        DD->>IM: run_policy(inputs)
        IM-->>DD: outputs
        DD->>DD: _write_outputs() [entity.set_joint_*_target_*]
        DD->>DD: decimation physics loop
    end
Loading

Reviews (1): Last reviewed commit: "precommit updates" | Re-trigger Greptile

@github-actions github-actions bot added enhancement New feature or request isaac-lab Related to Isaac Lab team labels Mar 25, 2026
Copy link
Copy Markdown

@isaaclab-review-bot isaaclab-review-bot bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Isaac Lab Review Bot

PR: #5105 — Feature/LEAPP Export Integration
Author: @frlai | Changes: +2684/−56 across 33 files


Summary

This PR adds end-to-end LEAPP export functionality for RSL-RL policies trained in manager-based Isaac Lab environments, plus a new DirectDeploymentEnv that can replay exported policies by bypassing all manager classes. The approach uses proxy objects to intercept tensor reads/writes during a tracing pass, then annotates them with semantic metadata for LEAPP graph compilation.

Core new modules:

  • isaaclab/utils/leapp/ — export annotator, proxy chain, semantic metadata decorators, utility helpers
  • isaaclab/envs/direct_deployment_env.py — lightweight deployment env wired via YAML connection strings
  • scripts/reinforcement_learning/rsl_rl/export.py — export script
  • scripts/reinforcement_learning/deploy.py — deployment script
  • isaaclab_rl/test/test_rsl_rl_export_flow.py — integration tests (60+ tasks)

Cross-cutting changes:

  • @leapp_tensor_semantics decorators on ~150+ properties across articulation, rigid object, rigid object collection, contact sensor, frame transformer, IMU, and ray caster data classes
  • CircularBuffer rewritten from pointer-based ring to shift-based append (for traceability)
  • RayCasterData fields converted from plain attributes to @property + private backing fields
  • CommandTermCfg extended with cmd_kind and element_names
  • Minor observation function fixes for tracing compatibility (humanoid, deploy, dexsuite)

Architecture Assessment

Strengths:

  1. The proxy-based approach (_EnvProxy_SceneProxy_EntityProxy_DataProxy) is genuinely elegant — it avoids modifying any concrete data class implementations and resolves semantics lazily via MRO walking.
  2. The shared dedup cache between observation and action paths ensures a single LEAPP input edge for properties read from both sides.
  3. The isaaclab_connection protocol in the YAML provides a clean deployment contract between export and inference.
  4. Separating export annotation (export_annotator.py) from deployment wiring (direct_deployment_env.py) is a good architectural choice.

Concerns:

  1. Hard dependency on leapp at import time in core modules. Every data class (base_articulation_data.py, base_rigid_object_data.py, all sensor data classes) now imports from leapp import InputKindEnum and from isaaclab.utils.leapp.leapp_semantics import ... at module level. This means leapp becomes a mandatory dependency for anyone who imports Isaac Lab, even if they never intend to export. This is a significant change to the dependency footprint. Consider making the import conditional or lazy (e.g., TYPE_CHECKING guard for the enum types, with the decorator doing a runtime check).

  2. Missing __init__.py for isaaclab/utils/leapp/ — the package directory has no __init__.py. While implicit namespace packages work in Python 3.3+, Isaac Lab consistently uses explicit __init__.py files. This should be added for consistency and to define the public API.

  3. CircularBuffer rewrite has subtle behavioral change. The shift from torch.roll to torch.cat([self._buffer[1:], data.unsqueeze(0)], dim=0) allocates a new tensor on every append (O(max_len) copy). The old implementation did O(1) pointer-based writes. For large buffers or high-frequency use, this is a performance regression. The tradeoff is valid for export traceability, but it should be documented and benchmarked. Consider keeping the original O(1) path for training and only switching to the traceable path during export.


Code Quality Issues

  1. Typo in export.py line 279: vilidate = args_cli.validation_steps > 0 — should be validate.

  2. export.py docstring says "Train an RL agent" (line 35) — should say "Export an RL agent checkpoint".

  3. RayCasterData.quat_w docstring inconsistency: The new property docstring says (w, x, y, z) but the original field and Isaac Lab convention use (x, y, z, w). The QUAT_WXYZ_ELEMENT_NAMES also uses ["qw", "qx", "qy", "qz"]. Clarify which convention is actually used.

  4. CommandTermCfg fields lack proper docstrings. The new cmd_kind and element_names fields have only inline comments. They should have proper RST-formatted docstrings like other fields in the class.

  5. _first_param_name in direct_deployment_env.py works correctly for bound methods (where inspect.signature strips self), but add a comment clarifying that bound methods are expected input.

  6. direct_deployment_env.py hardcodes num_envs = 1. This is reasonable for deployment but should be documented as a deliberate constraint.


Export Correctness

  1. ensure_torch_tensor silently swallows conversion errors. In utils.py, the except Exception catch-all means that if warp-to-torch conversion fails for any reason, the raw warp array is passed through silently. This could lead to hard-to-debug issues. At minimum, log a warning.

  2. patch_warp_to_torch_passthrough mutates a global function (wp.to_torch). This side-effect persists for the entire process lifetime with no way to unpatch. Consider using a context manager pattern.

  3. History buffer patching (_patch_history_buffer_append) replaces _append on the instance. The new CircularBuffer._append method seems designed specifically for this monkey-patching, but it is not documented as an extension point. If the buffer implementation changes, this silently breaks.

  4. torch.jit._state.disable() at module level in export.py disables TorchScript for the entire process. This is extremely aggressive — if any other code in the pipeline relies on JIT compilation, it breaks silently.


Test Coverage

The test file (test_rsl_rl_export_flow.py) is a solid integration test suite covering 60+ tasks. However:

  1. Tests require pretrained checkpoints — they skip if unavailable. The PR checklist explicitly notes "Edit tests to work without pretrained checkpoints" as a TODO. This means the test suite may be effectively a no-op in CI.

  2. No unit tests for the core modules: export_annotator.py, proxy.py, leapp_semantics.py, direct_deployment_env.py, or the CircularBuffer changes. The proxy chain and cache deduplication logic are complex enough to warrant targeted unit tests.

  3. No deployment testDirectDeploymentEnv is completely untested.


Documentation

The PR checklist acknowledges missing documentation:

  • "Add documentation" — TODO
  • "Add integration into direct environment documentation" — TODO

For a feature of this scope, documentation is critical before merge.


CI Status

labeler — pass (only job that ran)

No lint, build, or test CI jobs executed. This is concerning for a 33-file, 2600+ line PR.


Verdict

The architecture is well-designed and the proxy-based annotation approach is clever. Key items to address:

Priority Issue
P0 Hard leapp dependency in core modules — must be optional/lazy
P0 Missing __init__.py for isaaclab/utils/leapp/
P1 CircularBuffer performance regression — document or mitigate
P1 Typo: vilidatevalidate
P1 No unit tests for core proxy/annotation logic
P2 Documentation required per contribution guidelines
P2 ensure_torch_tensor silent error swallowing
P2 Global side effects (wp.to_torch mutation, torch.jit disable) need documentation

@frlai
Copy link
Copy Markdown
Author

frlai commented Mar 30, 2026

CircularBuffer implementation impact study.

A test was done to evaluate the impact of the changes related to circular buffer. The new implementation uses a torch.roll on append instead of the original pointer appending and torch.roll buffer selection method. The new change was implemented to allow tracing to happen (original pointer selection uses python logic and is invisible to pytorch).

Exparament:

To see actual impact, a test was done on Isaac-Velocity-Flat-Spot-v0 which uses the DelayedPDActuator each step. other configurations are as follows:
num_envs = 256
iterations = 100
physics preset = PhysX
GPU: RTX 4090

Results

Metric current develop (pointer) delta
Mean iter time 0.769 s 0.797 s −3.5% (current faster)
Median iter time 0.770 s 0.790 s −2.5%
p95 iter time 0.800 s 0.830 s −3.7%
Min iter time 0.750 s 0.770 s
Max iter time 0.810 s 0.840 s

Verdict

Test shows no slowdowns and in this test is actually slightly faster across the board (though the difference is statistically insignificant)

@github-actions github-actions bot added the documentation Improvements or additions to documentation label Apr 1, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation enhancement New feature or request isaac-lab Related to Isaac Lab team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant