Meta-Evolve: Continuous Robot Evolution for One-to-many Policy Transfer

Authors: Xingyu Liu, Deepak Pathak, Ding Zhao

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments have shown that our method is able to improve the efficiency of one-to-three transfer of manipulation policy by up to 3.2× and one-to-six transfer of agile locomotion policy by 2.4× in terms of simulation cost over the baseline of launching multiple independent one-to-one policy transfers.
Researcher Affiliation | Academia | Xingyu Liu, Deepak Pathak, Ding Zhao; Carnegie Mellon University; {xingyul3,dpathak,dingzhao}@andrew.cmu.edu
Pseudocode | Yes | Algorithm 1: Meta-Evolve; Algorithm 2: Determination of Evolution Tree and Meta Robots
Open Source Code | No | Supplementary videos available at the project website: https://sites.google.com/view/meta-evolve. The paper explicitly mentions 'videos' at the project website but does not state that source code for the methodology is available there or elsewhere.
Open Datasets | Yes | We showcase our Meta-Evolve on three Hand Manipulation Suite manipulation tasks (Rajeswaran et al., 2018)... The source robot is the Ant-v2 robot used in MuJoCo Gym (Brockman et al., 2016)... The source expert policy is trained by learning from the human hand demonstrations in the DexYCB dataset (Chao et al., 2021).
Dataset Splits | No | The paper states a 'Success Rate Threshold for Moving to the Next Training Phase' of 66.7% and aims to 'reach 80% success rate on all three target robots'. However, it does not specify explicit dataset splits (e.g., 80/10/10) for training, validation, and testing data.
Hardware Specification | No | The paper mentions using PyTorch as the deep learning framework, NPG as the RL algorithm, and MuJoCo as the physics simulation engine. However, it does not provide specific details about the hardware (e.g., CPU and GPU models, memory) used to run the experiments.
Software Dependencies | Yes | We use PyTorch (Paszke et al., 2019) as our deep learning framework and NPG (Rajeswaran et al., 2017) as the RL algorithm in all manipulation policy transfer and agile locomotion transfer experiments. We used MuJoCo (Todorov et al., 2012) as the physics simulation engine.
Experiment Setup | Yes | Hyperparameter Selection. We present the hyperparameters of our robot evolution and policy optimization in Table 4. Table 4 lists specific values for the RL Discount Factor γ, GAE, NPG Step Size, Policy Network Hidden Layer Sizes, Value Network Hidden Layer Sizes, Simulation Epoch Length, RL Training Batch Size, Evolution Progression Step Size ξ, Number of Sampled Evolution Parameter Vectors for Jacobian Estimation in HERD Runs, Evolution Direction Weighting Factor λ, Sample Range Shrink Ratio, and Success Rate Threshold for Moving to the Next Training Phase.
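As a rough illustration of how the Table 4 hyperparameters could be organized, the sketch below groups them into a single Python configuration object, with a helper that applies the 66.7% phase-progression threshold quoted above. All numeric values other than that threshold are placeholders rather than the paper's reported settings, and the names `MetaEvolveConfig` and `should_advance_phase` are hypothetical, not taken from the paper or its code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MetaEvolveConfig:
    """Illustrative container for the hyperparameters listed in Table 4.
    All values are placeholders except the success-rate threshold quoted above."""
    # RL / NPG optimization (placeholder values)
    rl_discount_gamma: float = 0.99              # RL Discount Factor γ
    gae_lambda: float = 0.97                     # GAE parameter
    npg_step_size: float = 1e-4                  # NPG Step Size
    policy_hidden_sizes: List[int] = field(default_factory=lambda: [64, 64])
    value_hidden_sizes: List[int] = field(default_factory=lambda: [64, 64])
    sim_epoch_length: int = 200                  # Simulation Epoch Length
    rl_batch_size: int = 16                      # RL Training Batch Size
    # Robot evolution (placeholder values)
    evolution_step_size_xi: float = 0.05         # Evolution Progression Step Size ξ
    num_sampled_evolution_vectors: int = 8       # vectors sampled for Jacobian estimation
    evolution_direction_weight_lambda: float = 0.1  # Evolution Direction Weighting Factor λ
    sample_range_shrink_ratio: float = 0.9       # Sample Range Shrink Ratio
    # Training-phase progression
    success_rate_threshold: float = 0.667        # 66.7% threshold quoted in the report


def should_advance_phase(success_rate: float, cfg: MetaEvolveConfig) -> bool:
    """Move to the next training phase once the policy's evaluated
    success rate reaches the configured threshold."""
    return success_rate >= cfg.success_rate_threshold


if __name__ == "__main__":
    cfg = MetaEvolveConfig()
    print(should_advance_phase(0.70, cfg))  # True: 70% >= 66.7%
    print(should_advance_phase(0.50, cfg))  # False: 50% < 66.7%
```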