Bi-linear Value Networks for Multi-goal Reinforcement Learning
Authors: Zhang-Wei Hong, Ge Yang, Pulkit Agrawal
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical evidence is provided on the simulated Fetch robot task suite and on dexterous manipulation with a Shadow hand. |
| Researcher Affiliation | Collaboration | Zhang-Wei Hong, Ge Yang* & Pulkit Agrawal: Improbable AI Lab; NSF AI Institute for AI and Fundamental Interactions (IAIFI); MIT-IBM Watson AI Lab; Massachusetts Institute of Technology |
| Pseudocode | No | No pseudocode or algorithm blocks are explicitly presented or labeled in the paper. |
| Open Source Code | No | No code repository is linked; the paper only states: "We provide detailed instructions for reproducing the results in this paper in the Appendix. Please refer to Section A.4." |
| Open Datasets | Yes | All experiments in Section 5.1 happen on the standard object and dexterous manipulation tasks from the gym robotics suite (Plappert et al., 2018). |
| Dataset Splits | No | The paper describes training and testing splits, but does not explicitly mention or detail a validation set or its proportions. |
| Hardware Specification | No | No specific hardware (e.g., CPU or GPU models) is reported; the acknowledgments only note: "We are grateful to MIT Supercloud and the Lincoln Laboratory Supercomputing Center for providing HPC resources." |
| Software Dependencies | No | The paper mentions various algorithms and networks (e.g., DDPG, HER, SAC, TD3, MLP) but does not provide specific version numbers for software dependencies or libraries. |
| Experiment Setup | Yes | Fetch: "We adapted the hyperparameters in (Plappert et al., 2018) for training on a single desktop. (...) Num workers: 2 for Reach, 8 for Push, 16 for Pick & Place, and 20 for Slide. Batch size: 1024. Warm-up rollouts: We collected 100 initial rollouts for pre-filling the replay buffer. Training frequency: We train the agent once per 2 environment steps." |
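
For concreteness, the setup quoted in the last row maps onto the training loop sketched below. This is a minimal sketch assuming a generic off-policy agent (DDPG+HER-style) and a list-backed replay buffer; the identifiers `agent.rollout`, `agent.update`, and `train` are hypothetical, since the paper does not release code.

```python
# Hypothetical sketch of the training cadence in the Experiment Setup row.
# All identifiers are illustrative; none come from the paper's (unreleased) code.
import random

NUM_WORKERS = {"Reach": 2, "Push": 8, "PickAndPlace": 16, "Slide": 20}
BATCH_SIZE = 1024        # transitions sampled per gradient step
WARMUP_ROLLOUTS = 100    # rollouts collected to pre-fill the replay buffer
TRAIN_EVERY = 2          # one gradient update per 2 environment steps

def train(env, agent, buffer, total_env_steps):
    """Pre-fill the replay buffer, then interleave collection and updates."""
    for _ in range(WARMUP_ROLLOUTS):
        buffer.extend(agent.rollout(env))          # warm-up phase
    steps = 0
    while steps < total_env_steps:
        episode = agent.rollout(env)               # list of transitions
        buffer.extend(episode)
        steps += len(episode)
        # Train once per TRAIN_EVERY environment steps on batches of 1024.
        for _ in range(len(episode) // TRAIN_EVERY):
            agent.update(random.sample(buffer, BATCH_SIZE))
```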