fix(newton): use correct kernel launch dim in actuator effort copy#5113
fix(newton): use correct kernel launch dim in actuator effort copy#5113Cdfghglz wants to merge 1 commit intoisaac-sim:dev/newtonfrom
Conversation
The `update_array2D_with_array2D_masked` kernel in both `ImplicitActuator.compute()` and `IdealPDActuator.compute()` was launched with `dim=(num_envs, self.num_joints)` where `self.num_joints` is the number of joints in the actuator group, not the total joints in the articulation. Since `joint_mask` is indexed over all joints, the kernel never reached joints with indices >= group size. This caused explicit actuator groups (DelayedPD, IdealPD, DCMotor) to produce zero torque for most joints when multiple groups are used, making the robot collapse under gravity. Fix: use `self._num_joints` (= joint_mask.shape[0] = total joints) which is consistent with the other kernel launches in the same methods.
Greptile SummaryThis PR fixes a one-line (per occurrence) kernel launch dimension bug in the Newton backend's Confidence Score: 5/5Safe to merge — minimal, targeted fix with no side effects. The change is exactly 2 characters per line (s/num_joints/_num_joints/), precisely mirrors the pattern already used by every other kernel launch in the same methods, and is backed by the invariant that joint_mask.shape[0] == n_dof which all the shared data arrays are also sized to. The root cause is clear and well-documented, verification was done on actual hardware, and no existing single-group robots are affected. No files require special attention. Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["actuator.compute()"] --> B["wp.launch compute_pd_actuator\ndim=(num_envs, _num_joints ✓)"]
B --> C["_clip_effort()\nwp.launch clip_efforts_with_limits\ndim=(num_envs, _num_joints ✓)"]
C --> D["wp.launch update_array2D_with_array2D_masked\ndim=(num_envs, _num_joints ✓) FIXED"]
subgraph Before_Fix
D2["wp.launch update_array2D_with_array2D_masked\ndim=(num_envs, num_joints ✗)\nGroup size only — e.g. 4 of 16 joints"]
end
subgraph Kernel_Logic
E["index_1, index_2 = wp.tid()"]
E --> F{"mask_1[index_1] AND mask_2[index_2]?"}
F -- Yes --> G["array_2d[i1,i2] = new_array_2d[i1,i2]"]
F -- No --> H["skip"]
end
D -.-> E
style D fill:#90ee90
style D2 fill:#ffcccb
Reviews (1): Last reviewed commit: "fix(newton): use correct kernel launch d..." | Re-trigger Greptile |
There was a problem hiding this comment.
🤖 Isaac Lab Review Bot
Summary
This PR fixes a critical kernel launch dimension bug where update_array2D_with_array2D_masked was launched with self.num_joints (joints in the actuator group) instead of self._num_joints (total joints in the articulation). This caused the kernel to never iterate over joint indices beyond the group size, producing zero torque for most joints when multiple explicit actuator groups are used. The fix is correct and minimal.
Architecture Impact
No cross-module impact — the change is self-contained within actuator_pd.py. The two affected call sites (ImplicitActuator.compute() and IdealPDActuator.compute()) now match the launch dimension used by every other kernel in the same file (compute_pd_actuator, clip_efforts_with_limits, clip_efforts_dc_motor), all of which already use self._num_joints. DCMotor inherits from IdealPDActuator and calls super().compute(), so it is also fixed.
Implementation Verdict
Minor fixes needed — The fix itself is correct and should ship. However, the same bug exists in actuator_net.py (see findings below).
Test Coverage
🟡 No regression test included. This is a bug fix that caused silent zero-torque output — one of the worst failure modes in a robotics simulator. A unit test that creates multiple actuator groups and verifies joint_effort is non-zero for joints in all groups would catch this if the bug is reintroduced. Given that the newton backend test infrastructure is still minimal (only benchmarks and mocks exist), I understand the practical difficulty, but this should be tracked as follow-up.
CI Status
Only the labeler check ran and passed ✅. No pre-commit/lint/unit test CI checks appear configured for the dev/newton branch yet.
Findings
🔴 Critical: actuator_net.py:98,195 — Same bug exists in ActuatorNetLSTM.compute() and ActuatorNetMLP.compute()
Both network-based actuator classes launch update_array2D_with_array2D_masked with dim=(self._num_envs, self.num_joints) — the exact same bug this PR fixes in actuator_pd.py. Any user with multiple actuator groups using ActuatorNetLSTM or ActuatorNetMLP will hit the same zero-torque problem. These should be fixed in this PR.
🟡 Warning: No regression test — This bug silently produces incorrect physics. Without a test, nothing prevents re-introduction. Even a minimal test asserting joint_effort is written for high-index joints in a multi-group setup would suffice.
| wp.launch( | ||
| update_array2D_with_array2D_masked, | ||
| dim=(self._num_envs, self.num_joints), | ||
| dim=(self._num_envs, self._num_joints), |
There was a problem hiding this comment.
Fix is correct. self._num_joints (= joint_mask.shape[0] = total joints in articulation) is the right dimension for update_array2D_with_array2D_masked because the kernel indexes into joint_mask using wp.tid(). With the old self.num_joints (= len(self._joint_names) = group size), joints at indices ≥ group size were never reached.
Consistent with compute_pd_actuator at line 138 and _clip_effort in the base class, which both use self._num_joints.
actuator_net.py — ActuatorNetLSTM.compute() (line 98) and ActuatorNetMLP.compute() (line 195) both have dim=(self._num_envs, self.num_joints) on their update_array2D_with_array2D_masked calls. Please fix those too:
actuator_net.py line 98 (ActuatorNetLSTM):
| dim=(self._num_envs, self._num_joints), | |
| dim=(self._num_envs, self._num_joints), |
actuator_net.py line 195 (ActuatorNetMLP):
| dim=(self._num_envs, self._num_joints), | |
| dim=(self._num_envs, self._num_joints), |
Summary
ImplicitActuator.compute()andIdealPDActuator.compute()(Newton backend)update_array2D_with_array2D_maskedkernel was launched withself.num_joints(group size) instead ofself._num_joints(total joints), preventing effort from being written to joints with indices >= group sizeImpact
Any robot using multiple explicit actuator groups on the Newton backend is affected. Robots with a single actuator group (all joints in one group) are not affected, nor are robots using only implicit actuators (Newton handles PD internally).
Example: A quadruped with 4 actuator groups (rockers, hips, front_knees, rear_knees) of 4 joints each across 16 total joints. The kernel iterates
joint_index = 0..3butjoint_maskhas True entries at positions like [6, 11, 12, 14] — these are never reached, producing zero torque.Fix
self.num_joints→self._num_jointson 2 lines insource/isaaclab_newton/isaaclab_newton/actuators/actuator_pd.py. This is consistent with the other kernel launches (compute_pd_actuator,_clip_effort) in the same methods, which already useself._num_joints.Test plan
🤖 Generated with Claude Code