Reparameterized Policy Learning for Multimodal Trajectory Optimization

Authors: Zhiao Huang, Litian Liang, Zhan Ling, Xuanlin Li, Chuang Gan, Hao Su

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical results demonstrate that our method can help agents evade local optima in tasks with dense rewards and solve challenging sparse-reward environments by incorporating an object-centric intrinsic reward.
Researcher Affiliation | Collaboration | UC San Diego, MIT-IBM Watson AI Lab, UMass Amherst.
Pseudocode | Yes | We describe the whole algorithm in Alg. 1 and implementation details in Appendix A.
Open Source Code | Yes | Code and supplementary materials are available on the project page https://haosulab.github.io/RPG/
Open Datasets | Yes | We take 8 representative environments from standard RL benchmarks, including 2 table-top environments from Meta-World (Yu et al., 2020), 2 dexterous hand manipulation tasks from Rajeswaran et al. (2017), 1 navigation problem from Nachum et al. (2018b), and 2 articulated object manipulation tasks from ManiSkill (Mu et al., 2021).
Dataset Splits | No | The paper evaluates performance within dynamic reinforcement learning environments and describes interaction-based data collection, but it does not specify explicit training, validation, or test dataset splits in terms of percentages or counts from a static dataset.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU or CPU models, memory specifications, or cloud computing instances.
Software Dependencies | No | The paper mentions using 'pytorch' but does not specify its version number or the versions of any other key software dependencies required to reproduce the experiments.
Experiment Setup | Yes | The hyperparameters for training the network are listed in Table 1.
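
The Dataset Splits row above reflects that, as in most online RL work, training data is gathered by interacting with the environments rather than drawn from a fixed dataset with train/validation/test partitions. The sketch below illustrates that collection pattern only; it is not code from the paper. It assumes a Gymnasium-style interface, and "CartPole-v1" is a generic stand-in for the paper's actual Meta-World / ManiSkill / navigation tasks, which require their own installations.

import gymnasium as gym

# Toy illustration of online data collection: transitions are appended to a
# buffer as the agent acts, so no static dataset split exists to report.
env = gym.make("CartPole-v1")  # stand-in environment, not one used in the paper
replay_buffer = []

obs, info = env.reset(seed=0)
for _ in range(1000):
    action = env.action_space.sample()  # placeholder for a learned policy
    next_obs, reward, terminated, truncated, info = env.step(action)
    replay_buffer.append((obs, action, reward, next_obs, terminated))
    obs = next_obs
    if terminated or truncated:
        obs, info = env.reset()

env.close()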