Skip to content

Routh terrain#5143

Draft
ooctipus wants to merge 7 commits intoisaac-sim:developfrom
ooctipus:routh_terrain
Draft

Routh terrain#5143
ooctipus wants to merge 7 commits intoisaac-sim:developfrom
ooctipus:routh_terrain

Conversation

@ooctipus
Copy link
Copy Markdown
Collaborator

@ooctipus ooctipus commented Apr 1, 2026

Description

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context.
List any dependencies that are required for this change.

Fixes # (issue)

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (existing functionality will not work without user modification)
  • Documentation update

Screenshots

Please attach before and after screenshots of the change if applicable.

Checklist

  • I have read and understood the contribution guidelines
  • I have run the pre-commit checks with ./isaaclab.sh --format
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the changelog and the corresponding version in the extension's config/extension.toml file
  • I have added my name to the CONTRIBUTORS.md or my name already exists there

ooctipus added 5 commits April 1, 2026 14:45
commit 3a3da73
Author: Jichuan Hu <leo.hu.sh@gmail.com>
Date:   Mon Mar 23 19:07:48 2026 -0700

    Add Newton physics preset for Anymal-D rough terrain

    Add RoughPhysicsCfg preset enabling rough terrain locomotion on Newton.
    Key solver settings: pyramidal friction cone (elliptic diverges in
    float32 with many mesh contacts), Newton collision pipeline, increased
    buffer sizes for mesh terrain contacts.

    - Add RoughPhysicsCfg(PresetCfg) with Newton solver config
    - Disable RayCaster height scanner for Newton (requires Kit)
    - Fix zero contact margin from USD import causing CCD divergence
    - Increase CollisionPipeline max_triangle_pairs for mesh terrain
    - Bump mujoco/mujoco-warp to 3.6.0, pin Newton to 141baffff9

    Tested with: conda run -n env_isaaclab_develop python
      scripts/reinforcement_learning/rsl_rl/train.py
      --task Isaac-Velocity-Rough-Anymal-D-v0
      --num_envs 4096 --max_iterations 300 --headless presets=newton
    Result: zero NaN, reward 13.33, episode length 957, 14 min.
commit 01d83a3
Author: Octi Zhang <zhengyuz@nvidia.com>
Date:   Tue Mar 24 14:05:43 2026 -0700

    address comment

commit 314991b
Author: Octi Zhang <zhengyuz@nvidia.com>
Date:   Mon Mar 23 22:26:13 2026 -0700

    fix linting

commit 8b8459b
Author: Octi Zhang <zhengyuz@nvidia.com>
Date:   Mon Mar 23 22:20:22 2026 -0700

    unify more functions
@github-actions github-actions bot added the asset New asset feature or request label Apr 1, 2026
Copy link
Copy Markdown

@isaaclab-review-bot isaaclab-review-bot bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: PR #5143 — Rough Terrain (Newton + PhysX backend unification)

This is a large PR (~62 files) that adds rough terrain support for the Newton physics backend while refactoring several cross-cutting concerns (XformPrimView hierarchy, material randomization, collision offsets, spawner physics properties, visual material randomization). I've focused my review on correctness and design issues in the core implementation files.

Summary

Good:

  • Clean BaseXformPrimViewXformPrimView hierarchy with factory pattern
  • Newton material property API (set_friction_index/mask, set_restitution_index/mask) follows established patterns
  • Solid test coverage for new Newton APIs
  • CollisionPropertiesCfg on TerrainImporterCfg is a useful addition

Issues found: 3 bugs (1 critical, 2 moderate), 2 design concerns, 1 resource leak


CI Status: pre-commit failing (formatting — missing blank lines in rough_env_cfg.py).

cls._needs_collision_pipeline = not use_mujoco_contacts
cls._needs_collision_pipeline = not solver_cfg.use_mujoco_contacts
if cls._collision_cfg is not None:
raise ValueError("NewtonManager: collision_cfg is not supported with use_mujoco_contacts=True")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug (critical): Validation logic is broken — checks stale class variable before it's set from config

This validation has two problems:

  1. Timing: cls._collision_cfg is still the stale class-level default (None on first run, or leftover from a previous simulation session) because it isn't assigned until line 597. This means the check never fires on first run.

  2. Logic: The check runs unconditionally after _needs_collision_pipeline is set, but the error message says "collision_cfg is not supported with use_mujoco_contacts=True". When use_mujoco_contacts=False, _needs_collision_pipeline=True and we want collision_cfg to be set — so this check would wrongly reject valid configs if it ever fired.

  3. Missing cleanup: _collision_cfg is not reset in clear() (line 344-370), so a stale value from a previous Newton session persists.

Suggested change
raise ValueError("NewtonManager: collision_cfg is not supported with use_mujoco_contacts=True")
if isinstance(cls._solver, SolverMuJoCo):
cls._needs_collision_pipeline = not solver_cfg.use_mujoco_contacts
incoming_collision_cfg = getattr(cfg, "collision_cfg", None)
if solver_cfg.use_mujoco_contacts and incoming_collision_cfg is not None:
raise ValueError(
"NewtonManager: collision_cfg is not supported with use_mujoco_contacts=True. "
"Either set use_mujoco_contacts=False or remove collision_cfg."
)

Also add cls._collision_cfg = None to clear() around line 361.

# note: workaround since Articulation doesn't expose shapes per body directly
num_shapes_per_body = []
for link_path in asset.root_view.link_paths[0]: # type: ignore[union-attr]
link_physx_view = asset.root_view.create_rigid_body_view(link_path) # type: ignore
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug (moderate): asset.root_view.create_rigid_body_view() doesn't exist on physx.ArticulationView

The old code used self.asset._physics_sim_view.create_rigid_body_view(link_path) where _physics_sim_view is a physx.SimulationView. The new code changed this to asset.root_view.create_rigid_body_view(link_path) where root_view is a physx.ArticulationView.

physx.ArticulationView does not have a create_rigid_body_view() method — that's on physx.SimulationView. Every other usage in the codebase (imu.py, ray_caster.py, rigid_object.py, contact_sensor.py, etc.) calls it on _physics_sim_view.

This will raise AttributeError at runtime when use_body_ids=True on PhysX (i.e., material randomization on an articulation with specific body_names that aren't ".*").

Suggested change
link_physx_view = asset.root_view.create_rigid_body_view(link_path) # type: ignore
for link_path in asset.root_view.link_paths[0]: # type: ignore[union-attr]
link_physx_view = asset._physics_sim_view.create_rigid_body_view(link_path) # type: ignore

del stage

common_root = os.path.commonpath(paths) if len(paths) > 1 else os.path.dirname(paths[0])
temp_dir = tempfile.mkdtemp(prefix="isaaclab_usd_")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resource leak: temp directories are never cleaned up

tempfile.mkdtemp() creates a persistent directory that is never removed. Every call to _create_modified_usd_with_overrides (i.e., every USD spawn with physics props or randomizable materials) leaks a directory under /tmp/isaaclab_usd_*.

For training runs with num_envs=4096 this may only create one temp dir (since the spawner is called once before cloning), but it still accumulates across sessions.

Consider using atexit cleanup, or at minimum documenting that the temp dir persists. A simple fix:

import atexit

_temp_dirs_to_cleanup: list[str] = []

def _cleanup_temp_dirs():
    for d in _temp_dirs_to_cleanup:
        shutil.rmtree(d, ignore_errors=True)

atexit.register(_cleanup_temp_dirs)

Then append temp_dir to _temp_dirs_to_cleanup after creation.

@@ -209,17 +210,6 @@ class EventsCfg:
@configclass
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Breaking change: physics_material startup event removed from StartupEventsCfg

The physics_material event term (which called randomize_rigid_body_material with friction 0.8/0.6 and restitution 0.0) was completely removed from StartupEventsCfg. Any downstream task config that inherits from StartupEventsCfg and expected material randomization at startup will silently lose it.

This is a behavioral change that should be:

  1. Called out explicitly in the changelog (currently only mentions collider_offsets addition)
  2. Justified — was material randomization intentionally removed, or was this an oversight during refactoring?

If intentional (e.g., because the terrain physics_material config now handles it), the changelog should document the removal.


# Get applied API schemas
schemas = prim_spec.GetInfo("apiSchemas")
apis = set()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Design concern: _create_modified_usd_with_overrides skips deformable_props and joint_drive_props application

The old _spawn_from_usd_file applied ALL physics properties at runtime via schemas.modify_* functions:

  • rigid_props, collision_props, mass_props, articulation_props → now handled by layer copying ✅
  • fixed_tendons_props, spatial_tendons_props → still applied at runtime ✅
  • joint_drive_props → listed in prop_schema_map for layer copying ✅
  • deformable_props → listed in prop_schema_map for layer copying ✅

However, the layer-copying approach matches attributes by constructing f"{prefix}{to_camel_case(field, 'cC')}". This assumes a 1:1 mapping between config field names and USD attribute names. If any config field name doesn't match the USD attribute name after camelCase conversion (which was handled by the schemas.modify_* functions' explicit mapping), properties will silently not be applied.

For example, schemas.modify_rigid_body_properties has explicit logic for special attributes like enable_gyroscopic_forces, retain_accelerations, etc. The generic layer-copy approach might miss attributes that don't follow the simple prefix + camelCase(field) naming.

I'd recommend adding a test that verifies ALL physics properties from RigidBodyPropertiesCfg are actually written to the modified USD, comparing against the old schemas.modify_rigid_body_properties output.

def get_scales(self, indices: Sequence[int] | None = None) -> torch.Tensor:
model = NewtonManager.get_model()
shape_scale = wp.to_torch(model.shape_scale)
shape_body = wp.to_torch(model.shape_body)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Design note: get_scales uses a Python loop over bodies — O(N*M) complexity

The get_scales implementation iterates over each body in Python, creates a boolean mask, and indexes into shape_scale for each one. For views with many prims (e.g., 4096 envs), this is O(N * num_shapes) with Python overhead per iteration.

This works but may be a performance bottleneck for large-scale scenes. Consider a Warp kernel or vectorized approach if profiling shows this is hot.

Not blocking, but worth a TODO.


If ``True`` (default), MuJoCo handles collision detection and contact resolution internally.
If ``False``, MuJoCo's collision detection is disabled, and Newton's :class:`CollisionPipeline`
must be used instead (set :attr:`NewtonCfg.collision_cfg`).
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doc/config inconsistency: use_mujoco_contacts docs say to set collision_cfg, but no runtime validation prevents misuse

The docstring warns:

If use_mujoco_contacts=False, you must also set collision_cfg to a NewtonCollisionPipelineCfg instance. Mismatched settings will cause simulation errors (e.g., NaN values).

But as noted in my other comment, the validation in newton_manager.py:initialize_solver is broken. A user who sets use_mujoco_contacts=False without setting collision_cfg will get an AttributeError: 'NoneType' object has no attribute 'to_dict' at _initialize_contacts line 518, which is not a helpful error message.

After fixing the validation logic in newton_manager.py, also add:

if not solver_cfg.use_mujoco_contacts and incoming_collision_cfg is None:
    raise ValueError(
        "NewtonManager: use_mujoco_contacts=False requires collision_cfg to be set. "
        "Provide a NewtonCollisionPipelineCfg instance."
    )

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

asset New asset feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant