Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Hierarchical Equivariant Policy via Frame Transfer
Authors: Haibo Zhao, Dian Wang, Yizhe Zhu, Xupeng Zhu, Owen Lewis Howell, Linfeng Zhao, Yaoyao Qian, Robin Walters, Robert Platt
ICML 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | HEP achieves state-of-the-art performance in complex robotic manipulation tasks, demonstrating significant improvements in both simulation and real-world settings. (Code and videos are available at project page.) |
| Researcher Affiliation | Academia | ¹Northeastern University, ²Robotics and AI Institute. Correspondence to: Dian Wang <EMAIL>. |
| Pseudocode | No | The paper describes the methodology using prose and mathematical equations but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | (Code and videos are available at project page.) |
| Open Datasets | Yes | To evaluate our policy, we first perform experiments in simulated environments in the RLBench (James et al., 2020) benchmark implemented using CoppeliaSim (Rohmer et al., 2013) and PyRep (James et al., 2019). |
| Dataset Splits | Yes | Each task is trained using 100 demonstrations; more detailed task descriptions and visualizations are provided in Appendix F. We experiment in three tasks as shown in Figure 6. These tasks are challenging due to their extremely long horizon (each can be divided into 6 to 9 sub-tasks) and the diverse types of manipulation involved. Evaluations are conducted in 20 trials: 10 with object placements similar to the training dataset and 10 with unseen placements. To evaluate the generalizability of our model, we perform a one-shot experiment where the model is trained to finish a pick-place task with only one demonstration. During testing, the object is placed in unseen poses, as shown in Figure 7. The results in Table 4 demonstrate the strong generalizability of our model, achieving an 80% success rate over 20 trials. |
| Hardware Specification | No | Our real-world experiments are conducted on a UR5e robotic arm equipped with a Robotiq 2F-85 gripper and three Intel RealSense D455 cameras as shown in Figure 10. Demonstrations are collected using a 6-DoF 3Dconnexion SpaceMouse at a 10 Hz rate, logging both the visual observations (from all three cameras) and the robot's end-effector actions (position, orientation, and gripper states). |
| Software Dependencies | No | We train our models with the AdamW (Loshchilov & Hutter, 2019) optimizer (with a learning rate of 10⁻⁴ and weight decay of 5×10⁻⁴). We use DDPM (Ho et al., 2020) with 100 denoising steps for both training and evaluation. We train each task for 100,000 iterations. In practice, we implement the T(3)-invariance in the PointNet by using the relative position to the center of each voxel, and implement the SO(2)-equivariance using escnn (Cesa et al., 2022). |
| Experiment Setup | Yes | In the simulation experiments, we use a batch size of 16 for training. Specifically, the observation contains one step of history observation and 3 steps of history action, and the output of the denoising process is a sequence of 18 action steps. In open-loop control we use all 18 steps for both training and execution, similar to prior work (Xian et al., 2023). In closed-loop control, 18 steps are used for training and 9 steps for execution, similar to the setting of Wang et al. (2024a). We train our models with the AdamW (Loshchilov & Hutter, 2019) optimizer (with a learning rate of 10⁻⁴ and weight decay of 5×10⁻⁴). We use DDPM (Ho et al., 2020) with 100 denoising steps for both training and evaluation. We train each task for 100,000 iterations. |
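The quoted setup names a concrete training configuration: AdamW with learning rate 10⁻⁴ and weight decay 5×10⁻⁴, plus DDPM with 100 denoising steps. The sketch below shows what that hyperparameter combination looks like in plain PyTorch; the linear model, linear beta schedule, and data shapes are illustrative placeholders, not the paper's actual policy network or noise schedule.

```python
import torch

# Toy noise-prediction network standing in for the paper's policy model.
model = torch.nn.Linear(8, 8)

# AdamW with the quoted learning rate (1e-4) and weight decay (5e-4).
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=5e-4)

# DDPM-style schedule with the quoted 100 denoising steps
# (linear betas are an assumption; the paper does not specify the schedule).
T = 100
betas = torch.linspace(1e-4, 0.02, T)
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

# One DDPM training step: noise clean samples at a random timestep,
# then regress the added noise with an MSE loss.
x0 = torch.randn(16, 8)            # "clean" action chunks, batch size 16
t = torch.randint(0, T, (16,))     # random timestep per sample
noise = torch.randn_like(x0)
a_bar = alphas_cumprod[t].unsqueeze(-1)
xt = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * noise
loss = torch.nn.functional.mse_loss(model(xt), noise)
loss.backward()
optimizer.step()
```

This only demonstrates the quoted optimizer and diffusion hyperparameters; the actual HEP architecture (equivariant PointNet encoder, frame transfer) is described in the paper itself.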