Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Learning Actionable Representations with Goal Conditioned Policies

Authors: Dibya Ghosh, Abhishek Gupta, Sergey Levine

ICLR 2019

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We evaluate our method on a number of simulated environments, and compare it to prior methods for representation learning, exploration, and hierarchical reinforcement learning. |
| Researcher Affiliation | Academia | Dibya Ghosh, Abhishek Gupta, & Sergey Levine, Department of Electrical Engineering and Computer Science, University of California, Berkeley, Berkeley, CA 94703, USA |
| Pseudocode | No | No pseudocode or algorithm blocks are present. |
| Open Source Code | No | No information about open-source code availability is provided. |
| Open Datasets | No | We study six simulated environments as illustrated in Figure 4: 2D navigation tasks in two settings, wheeled locomotion tasks in two settings, legged locomotion, and object pushing with a robotic gripper. |
| Dataset Splits | Yes | Holding out 20% of the trajectories as a validation set. |
| Hardware Specification | No | No specific hardware details (such as GPU/CPU models) are provided; "computational resources from Amazon" is too vague. |
| Software Dependencies | No | The paper mentions algorithms and optimizers (TRPO, Adam) but does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | The mean, µθ(·, ·), is a fully-connected neural network that takes the state and the desired goal state as a concatenated vector, with three hidden layers of 150, 100, and 50 units respectively. Σ is a learned diagonal covariance matrix, initially set to Σ = I. |
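The Experiment Setup row describes a goal-conditioned Gaussian policy: a fully-connected mean network over the concatenated state and goal with hidden layers of 150, 100, and 50 units, and a learned diagonal covariance initialized to the identity. A minimal NumPy sketch of such a policy is below; the layer sizes and Σ = I initialization come from the paper's description, while the tanh activations, weight initialization, and function names are illustrative assumptions.

```python
import numpy as np

def init_policy(state_dim, goal_dim, action_dim, rng):
    """Build parameters for a goal-conditioned Gaussian policy (sketch).

    Hidden sizes 150/100/50 follow the paper's description; the
    He-style initialization is an assumption for illustration.
    """
    sizes = [state_dim + goal_dim, 150, 100, 50, action_dim]
    weights = [rng.standard_normal((m, n)) * np.sqrt(2.0 / m)
               for m, n in zip(sizes[:-1], sizes[1:])]
    biases = [np.zeros(n) for n in sizes[1:]]
    # Diagonal covariance Sigma = I is equivalent to log-std = 0.
    log_std = np.zeros(action_dim)
    return weights, biases, log_std

def policy_mean(params, state, goal):
    """Compute mu_theta(state, goal) via the fully-connected network."""
    weights, biases, _ = params
    x = np.concatenate([state, goal])  # concatenated input, as in the paper
    for W, b in zip(weights[:-1], biases[:-1]):
        x = np.tanh(x @ W + b)  # tanh is an assumed activation choice
    return x @ weights[-1] + biases[-1]  # linear output layer

def sample_action(params, state, goal, rng):
    """Sample an action from N(mu_theta(s, g), Sigma)."""
    mean = policy_mean(params, state, goal)
    std = np.exp(params[2])  # diagonal standard deviations
    return mean + std * rng.standard_normal(mean.shape)
```

In practice such a policy would be trained with a policy-gradient method (the paper mentions TRPO), but the training loop is out of scope for this sketch.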