Skip to content

Adds optimizations for Isaac Lab Mimic data generation performance#5141

Merged
peterd-NV merged 33 commits intoisaac-sim:developfrom
peterd-NV:peterd/mimic_perf_opt
Apr 10, 2026
Merged

Adds optimizations for Isaac Lab Mimic data generation performance#5141
peterd-NV merged 33 commits intoisaac-sim:developfrom
peterd-NV:peterd/mimic_perf_opt

Conversation

@peterd-NV
Copy link
Copy Markdown
Contributor

@peterd-NV peterd-NV commented Apr 1, 2026

Description

This PR adds performance improvements and additional test case coverage for Isaac Lab Mimic data generation.

Performance improvements:

  • Removal of usage of large state tensors in waypoint.py execute and return task success only
  • Change asyncio_lock usage to be optional for data generation flows which does not require it
  • Improve asynchronous execution in generation.py env_loop
  • Add toggle to clone data in add method of episode_data.py allowing for RecorderManager to clone contiguous tensors before calling add
  • Add option to toggle dataset compression (reduces env step time for state-only data generation)
  • Use tilted cameras for visuomotor envs (Franka & GR1 nut pouring)

Additional test coverage:

  • Data generation test for Franka visuomotor env
  • Data generation test for GR1T2 pick place env
  • Data generation test for GR1T2 nut pouring env

Other changes:

  • Increase data generation test time outs to prevent test failure due to insufficient time
  • Check for and return exception in generation.py env_loop to prevent event loop from spinning indefinitely when exception occurs in data generation.
  • Locks h5py dep version to 3.15.1 (latest version 3.16.0 has import errors on Windows)

Type of change

  • New feature (non-breaking change which adds functionality)

Checklist

  • I have read and understood the contribution guidelines
  • I have run the pre-commit checks with ./isaaclab.sh --format
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the changelog and the corresponding version in the extension's config/extension.toml file
  • I have added my name to the CONTRIBUTORS.md or my name already exists there

@github-actions github-actions bot added the isaac-mimic Related to Isaac Mimic team label Apr 1, 2026
Copy link
Copy Markdown

@isaaclab-review-bot isaaclab-review-bot bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Isaac Lab Review Bot

Summary

This PR optimizes Mimic data generation by: (1) removing redundant per-step accumulation of states/observations/actions in memory, (2) reducing tensor cloning overhead in recorder_manager/episode_data, (3) switching to batched asyncio.gather for action collection, (4) removing the shared asyncio lock for the standard generation flow, (5) switching visuomotor cameras from Camera to TiledCamera, (6) adding optional HDF5 dataset compression control, and (7) lowering gzip compression level. The core memory optimization (eliminating generated_states/obs/actions lists) is sound since this data is already captured by the recorder manager.

Design Assessment

Design is sound. The core insight is correct: the generate() method was accumulating states, observations, and actions in lists that were never consumed (only success was used from the results dict). These were redundant with what the recorder manager already stores. Removing them eliminates the largest memory sink during long generation runs.

The lock removal is justified — in single-threaded asyncio with no await points inside the critical section, the lock was a no-op from a correctness standpoint. The _optional_lock pattern correctly supports both the lock-free generate_dataset.py path and the lock-requiring consolidated_demo.py path.

The Camera → TiledCamera switch is a standard optimization for multi-env scenarios in Isaac Lab.

One concern: The disable_dataset_compression parameter is threaded through 3 layers (CLI → config → recorder manager → file handler) as a runtime flag rather than being part of the file handler configuration. A cleaner design would put compression settings on the file handler or its config, but the current approach is acceptable short-term given the small scope.

Architecture Impact

  • DatasetFileHandlerBase.write_episode base class signature is now further out of sync with HDF5DatasetFileHandler (pre-existing issue, worsened).
  • EpisodeData.add has a new clone parameter — backwards compatible (default True).
  • MultiWaypoint.execute return type changed from dictbool — the only caller (data_generator.py) is updated accordingly.
  • DataGenerator.generate return dict drops states, observations, actions keys — the only consumer (generation.py:run_data_generator) only uses results["success"].
  • gzip compression level changed from default (4) to 2 for ALL datasets, not just mimic — this is a behavioral change visible to any user of HDF5DatasetFileHandler.

Implementation Verdict

Minor fixes needed — Correct approach, 3 issues to address.

Test Coverage

No new tests for the disable_dataset_compression feature. The existing test_hdf5_dataset_file_handler.py tests call write_episode(test_episode) without the new parameter, so they only cover the default (compression enabled) path. A simple test that writes with disable_dataset_compression=True and verifies the dataset has no compression would be valuable. That said, the existing isaaclab_mimic CI check passes, which covers the end-to-end generation flow.

CI Status

Most checks still in progress (just pushed ~5 min ago). Passed so far: pre-commit ✅, isaaclab_mimic ✅, isaaclab_assets ✅, isaaclab_ov ✅, isaaclab_teleop ✅, labeler ✅, Build Base Docker Image ✅. Remaining checks pending.

Findings

🟡 Warning: hdf5_dataset_file_handler.py:257 — Compression level silently reduced for all users
The compression_opts=2 change reduces gzip compression from the h5py default (level 4) to level 2 for ALL HDF5 dataset writes, not just when mimic is running. This is an undocumented behavioral change that produces larger files. If the intent is to speed up mimic generation, this should either be (a) tied to the disable_dataset_compression flag as a "fast compression" mode, or (b) documented as an intentional project-wide change. As-is, users who rely on the default compression level will get larger files without knowing why.

🟡 Warning: dataset_file_handler_base.py — Base class write_episode signature diverges further
DatasetFileHandlerBase.write_episode(self, episode) doesn't match HDF5DatasetFileHandler.write_episode(self, episode, demo_id, disable_dataset_compression). The demo_id mismatch was pre-existing, but adding disable_dataset_compression widens the gap. Any future DatasetFileHandler subclass won't know about these parameters.

🔵 Improvement: recorder_manager.py:58 — Use positive naming for the compression config flag
disable_dataset_compression leads to confusing double-negation patterns (if not disable_dataset_compression). A positive name like dataset_compression: bool = True reads more naturally: if self.cfg.dataset_compression.

🔵 Improvement: Missing CHANGELOG entries
This PR modifies public APIs in both isaaclab (new clone param on EpisodeData.add, new disable_dataset_compression on RecorderManagerBaseCfg, changed write_episode signature) and isaaclab_mimic (changed return type of MultiWaypoint.execute, changed generate() return dict). Per project conventions, both source/isaaclab/docs/CHANGELOG.rst and source/isaaclab_mimic/docs/CHANGELOG.rst should be updated.

🔵 Improvement: PR description is empty
The PR body uses the default template without filling in the description, motivation, or type of change. For a performance optimization PR, including before/after measurements (memory usage, throughput, wall-clock time) would help reviewers evaluate the tradeoff.

export_in_close: bool = False
"""Whether to export episodes in the close call."""

disable_dataset_compression: bool = False
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Negative-sense booleans create confusing double-negation patterns downstream (if not disable_dataset_compression). The other boolean fields in this class use positive naming (export_in_record_pre_reset, export_in_close). This should follow the same convention.

Suggested change
disable_dataset_compression: bool = False
dataset_compression: bool = True
"""Whether to compress data when writing to the dataset file."""

This requires updating the references in recorder_manager.py:export_episodes, generation.py:setup_env_config, generate_dataset.py, and hdf5_dataset_file_handler.py accordingly (passing not self.cfg.dataset_compression or inverting the logic).

@peterd-NV peterd-NV changed the title Adds optimizations for Mimic memory usage/throughput Adds optimizations for Mimic memory usage and throughput Apr 1, 2026
Copy link
Copy Markdown

@isaaclab-review-bot isaaclab-review-bot bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Isaac Lab Review Bot — Follow-up

New commits (a225857→2f829a28): Added 3 test files for dataset generation (nutpour GR1T2, pickplace GR1T2, visuomotor).

1 new issue found in the added tests.

Copy link
Copy Markdown

@isaaclab-review-bot isaaclab-review-bot bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Isaac Lab Review Bot — Follow-up

New commits (3ae515f→a18c6ff4): Fixed TODO_FILL_IN_ENV_NAME placeholders in test files with actual Nucleus paths.

1 new issue found in the updated paths.

@isaac-sim isaac-sim deleted a comment from isaaclab-review-bot bot Apr 3, 2026
@isaac-sim isaac-sim deleted a comment from isaaclab-review-bot bot Apr 3, 2026
@peterd-NV peterd-NV changed the title Adds optimizations for Mimic memory usage and throughput Adds optimizations for Isaac Lab Mimic data generation performance Apr 6, 2026
@peterd-NV peterd-NV marked this pull request as ready for review April 6, 2026 17:42
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Apr 6, 2026

Greptile Summary

This PR adds performance optimizations (batch-level tensor cloning, optional dataset compression, improved async execution) and additional test coverage (Franka visuomotor, GR1T2 pick-place, GR1T2 nut-pour) for the Mimic data generation pipeline. One P1 bug was found: line 98 of generation.py uses and instead of , to compose two context managers, causing contextlib.suppress(KeyboardInterrupt) to be silently discarded — KeyboardInterrupt will propagate uncaught through env_loop instead of being suppressed.

Confidence Score: 4/5

Safe to merge after fixing the context manager composition bug on line 98 of generation.py

One P1 logic bug found: and instead of , silently drops contextlib.suppress(KeyboardInterrupt) in env_loop. All other changes are well-structured performance optimizations and new tests.

source/isaaclab_mimic/isaaclab_mimic/datagen/generation.py line 98

Important Files Changed

Filename Overview
source/isaaclab_mimic/isaaclab_mimic/datagen/generation.py Adds exception propagation and compression toggle; and instead of , in context manager composition silently discards contextlib.suppress(KeyboardInterrupt)
source/isaaclab_mimic/isaaclab_mimic/datagen/waypoint.py Optimized execute to return only task success instead of large state tensors
source/isaaclab/isaaclab/utils/datasets/episode_data.py Added optional clone parameter to add for batch-level cloning optimization
source/isaaclab/isaaclab/managers/recorder_manager.py Optimized add_to_episodes to clone tensor once at batch level instead of once per env
source/isaaclab/isaaclab/utils/datasets/hdf5_dataset_file_handler.py Added optional dataset_compression parameter to write_episode
source/isaaclab_mimic/setup.py Pins h5py to 3.15.1 to avoid Windows import errors in 3.16.0
source/isaaclab_mimic/test/utils.py New shared test utility with run_script helper extracted from previously duplicated test files
source/isaaclab_mimic/test/test_generate_dataset_franka_visuomotor.py New test for Franka visuomotor data generation (single and multi-env)
source/isaaclab_mimic/test/test_generate_dataset_gr1t2_nutpour.py New test for GR1T2 nut-pour data generation (single and multi-env)
source/isaaclab_mimic/test/test_generate_dataset_gr1t2_pickplace.py New test for GR1T2 pick-place data generation (single and multi-env)
source/isaaclab_tasks/isaaclab_tasks/manager_based/manipulation/stack/config/franka/stack_ik_rel_visuomotor_env_cfg.py Updated camera configuration to use TiledCamera with tilted offsets for visuomotor envs
tools/test_settings.py Added per-test timeout entries for new test files
scripts/imitation_learning/isaaclab_mimic/generate_dataset.py Added --disable_dataset_compression CLI flag and wires it through to setup_env_config

Sequence Diagram

sequenceDiagram
    participant M as main()
    participant S as setup_env_config()
    participant A as setup_async_generation()
    participant L as env_loop()
    participant DG as run_data_generator()
    participant E as env

    M->>S: env_name, compression, num_trials
    S-->>M: env_cfg, success_term
    M->>E: gym.make(env_name, cfg)
    E-->>M: env instance
    M->>A: env, input_file, success_term
    A-->>M: tasks, event_loop, queues
    M->>L: env, queues, data_gen_tasks
    loop Until num_trials reached
        DG->>A: put action in env_action_queue
        L->>E: env.step(actions)
        L->>DG: task_done (unblock join)
        DG->>DG: check success_term
    end
    L-->>M: return
    M->>M: cancel tasks
    M->>E: env.close()
Loading

Comments Outside Diff (1)

  1. source/isaaclab_mimic/isaaclab_mimic/datagen/generation.py, line 98 (link)

    P1 Incorrect context manager composition silently drops suppress

    and is a boolean short-circuit operator, not a way to compose context managers. Since contextlib.suppress(KeyboardInterrupt) is a truthy object, A and B evaluates to B (i.e. torch.inference_mode()). The suppress block is never entered and KeyboardInterrupt will propagate uncaught through env_loop, bypassing the finally cleanup in main().

Greploops — Automatically fix all review issues by running /greploops in Claude Code. It iterates: fix, push, re-review, repeat until 5/5 confidence.
Use the Greptile plugin for Claude Code to query reviews, search comments, and manage custom context directly from your terminal.

Reviews (2): Last reviewed commit: "fix typo in rl/mimic changelog" | Re-trigger Greptile

Comment thread source/isaaclab_mimic/isaaclab_mimic/datagen/generation.py Outdated
Comment thread source/isaaclab_mimic/test/test_generate_dataset_franka_visuomotor.py Outdated
peterd-NV and others added 2 commits April 6, 2026 10:51
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: peterd-NV <peterd@nvidia.com>
Copy link
Copy Markdown
Contributor

@kellyguo11 kellyguo11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Isaac Lab Review Bot — Follow-up

New commits (ebf5a4a→36c7013c): Added h5py==3.15.1 pin to isaaclab_mimic and isaaclab_rl, new CHANGELOG entries.

2 new issues found in the added changes.

Comment thread source/isaaclab_rl/config/extension.toml Outdated
Comment thread source/isaaclab_rl/docs/CHANGELOG.rst Outdated
@peterd-NV peterd-NV requested a review from njawale42 April 9, 2026 17:39
Copy link
Copy Markdown
Contributor

@njawale42 njawale42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@peterd-NV peterd-NV merged commit 5673af0 into isaac-sim:develop Apr 10, 2026
19 of 29 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

infrastructure isaac-mimic Related to Isaac Mimic team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants