The Information Geometry of Unsupervised Reinforcement Learning

Authors: Benjamin Eysenbach, Ruslan Salakhutdinov, Sergey Levine

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We verify this result with a simple experiment. We randomly generate tabular MDPs and learn 100 (possibly redundant) skills using MISL. As shown in Fig. 4, the number of unique skills for an MDP is never greater than the number of states in that MDP, supporting Lemma 6.3.
Researcher Affiliation | Collaboration | Benjamin Eysenbach (1, 2), Ruslan Salakhutdinov (1), Sergey Levine (2, 3); 1: Carnegie Mellon University, 2: Google Brain, 3: UC Berkeley
Pseudocode | No | No pseudocode or clearly labeled algorithm blocks were found in the paper.
Open Source Code | Yes | Code to reproduce: https://github.com/ben-eysenbach/info_geometry/blob/main/experiments.ipynb
Open Datasets | No | The paper states: 'We randomly generate tabular MDPs and learn 100 (possibly redundant) skills using MISL.' This indicates the authors generated their own data rather than using a publicly available or open dataset, and no access information for the generated data is provided.
Dataset Splits | No | The paper describes generating random MDPs for its experiments but does not specify how this data was split into training, validation, and test sets (e.g., percentages, sample counts, or a predefined split).
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments are provided.
Software Dependencies | No | No specific software dependencies, libraries, or frameworks with version numbers are mentioned in the paper.
Experiment Setup | No | The paper states 'We randomly generate tabular MDPs and learn 100 (possibly redundant) skills using MISL.' but does not give concrete setup details such as hyperparameter values, optimizer settings, or other configuration parameters in the main text.
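
The linked notebook (experiments.ipynb) is the authors' actual implementation. As a rough illustration of the sanity check quoted in the Research Type row, the sketch below generates small random tabular MDPs, runs a simplified mutual-information skill-learning loop (alternating a closed-form tabular discriminator with value iteration on the intrinsic reward log p(z|s) - log p(z)), and counts how many of the 100 learned skills have distinct discounted state marginals. The function names, discount factor, and round/skill counts are illustrative assumptions, not the paper's reported configuration.

```python
import numpy as np

def random_mdp(n_states, n_actions, rng):
    # Random transition tensor P[s, a, s'] with each row drawn from a Dirichlet.
    return rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))

def discounted_state_marginal(P, policy, gamma=0.95):
    # rho(s) = (1 - gamma) * sum_t gamma^t Pr(s_t = s) under the given policy,
    # starting from a uniform initial state distribution.
    n_states = P.shape[0]
    P_pi = np.einsum('sap,sa->sp', P, policy)  # induced state-to-state kernel
    rho0 = np.ones(n_states) / n_states
    rho = (1 - gamma) * np.linalg.solve(np.eye(n_states) - gamma * P_pi.T, rho0)
    return rho / rho.sum()

def greedy_policy(P, reward, gamma=0.95, n_iters=200):
    # Value iteration against a state-based reward; returns a deterministic policy.
    n_states, n_actions = P.shape[:2]
    V = np.zeros(n_states)
    for _ in range(n_iters):
        Q = reward[:, None] + gamma * np.einsum('sap,p->sa', P, V)
        V = Q.max(axis=1)
    policy = np.zeros((n_states, n_actions))
    policy[np.arange(n_states), Q.argmax(axis=1)] = 1.0
    return policy

def misl_like(P, n_skills=100, n_rounds=50, gamma=0.95, rng=None):
    # Alternate between (i) computing the tabular "discriminator" in closed form from
    # the current skill marginals and (ii) re-optimizing each skill against the
    # intrinsic reward log p(z|s) - log p(z) = log rho_z(s) - log rho_avg(s).
    n_states, n_actions = P.shape[:2]
    rng = rng or np.random.default_rng(0)
    policies = rng.dirichlet(np.ones(n_actions), size=(n_skills, n_states))
    for _ in range(n_rounds):
        rhos = np.stack([discounted_state_marginal(P, pi, gamma) for pi in policies])
        rho_avg = rhos.mean(axis=0)
        for z in range(n_skills):
            reward = np.log(rhos[z] + 1e-12) - np.log(rho_avg + 1e-12)
            policies[z] = greedy_policy(P, reward, gamma)
    return np.stack([discounted_state_marginal(P, pi, gamma) for pi in policies])

def count_unique(rhos, tol=1e-3):
    # Count skills whose state marginals differ by more than tol in total variation.
    unique = []
    for rho in rhos:
        if all(0.5 * np.abs(rho - u).sum() > tol for u in unique):
            unique.append(rho)
    return len(unique)

rng = np.random.default_rng(0)
for trial in range(5):
    n_states = int(rng.integers(3, 8))
    P = random_mdp(n_states, n_actions=3, rng=rng)
    rhos = misl_like(P, n_skills=100, rng=rng)
    print(f"trial {trial}: |S| = {n_states}, unique skills = {count_unique(rhos)}")
```

Because count_unique relies on a total-variation tolerance, the reported count depends on tol and on how far the alternation has converged; the authors' notebook remains the authoritative reference for the experiment behind Fig. 4.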