Information is Power: Intrinsic Control via Information Capture
Authors: Nicholas Rhinehart, Jenny Wang, Glen Berseth, John Co-Reyes, Danijar Hafner, Chelsea Finn, Sergey Levine
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments are designed to answer the following questions: Q1: Intrinsic control capability: Does our latent visitation-based self-supervised reward signal cause the agent to stabilize partially observed visual environments with dynamic entities more effectively than prior self-supervised stabilization objectives? Q2: Properties of the IC2 reward and alternative intrinsic rewards: What types of emergent behaviors does each belief-based objective described in Section 3.1 evoke? In order to answer these questions, we identified environments with the following properties: (i) partial observability, (ii) dynamic entities that the agent can affect, and (iii) high-dimensional observations. Our primary results are presented in Fig. 8. |
| Researcher Affiliation | Collaboration | Nicholas Rhinehart (UC Berkeley); Jenny Wang (UC Berkeley); Glen Berseth (UC Berkeley); John D. Co-Reyes (UC Berkeley); Danijar Hafner (University of Toronto; Google Research, Brain Team); Chelsea Finn (Stanford University); Sergey Levine (UC Berkeley) |
| Pseudocode | Yes | Algorithm 1 Intrinsic Control via Information Capture (IC2) |
| Open Source Code | No | The paper states, 'We present videos on our project page: https://sites.google.com/view/ic2/home,' but it does not explicitly mention that the source code for the methodology is available at this link or elsewhere. |
| Open Datasets | No | The paper describes the 'Two Room Environment', 'VizDoom Defend The Center environment', and 'One Room Capture3D environment' and mentions the 'MiniWorld framework' but does not provide specific access information (links, DOIs, repositories, or formal citations) for any public or open datasets used. |
| Dataset Splits | No | The paper mentions training and evaluation on environments but does not specify any explicit training, validation, or test dataset splits (e.g., percentages or sample counts) for reproducibility. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running the experiments. |
| Software Dependencies | No | The paper states 'We implemented an LSSM in PyTorch [55]' but does not provide a specific version number for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | We represent each policy π(at|vposterior,t) as a two-layer fully-connected MLP with 128 units. ... We perform evaluation at 5e6 environment steps with 50 policy rollouts per random seed, with 3 random seeds for each method (150 rollouts total). |
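The experiment-setup row above describes the policy architecture only at a high level: a two-layer fully-connected MLP with 128 units mapping the posterior latent v_posterior,t to a distribution over discrete actions. The sketch below illustrates that shape as a minimal, self-contained forward pass; the latent dimension, number of actions, tanh activation, and weight initialization are all assumptions not stated in the paper, and the real implementation used PyTorch.

```python
import numpy as np

class PolicyMLP:
    """Hypothetical sketch of pi(a_t | v_posterior,t): a two-layer
    fully-connected MLP with 128 hidden units (per the paper's setup).
    Activation and init choices below are illustrative assumptions."""

    def __init__(self, latent_dim, num_actions, hidden=128, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(0.0, 0.1, (latent_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(0.0, 0.1, (hidden, num_actions))
        self.b2 = np.zeros(num_actions)

    def __call__(self, v_posterior):
        # Hidden layer with tanh nonlinearity (assumed).
        h = np.tanh(v_posterior @ self.w1 + self.b1)
        # Softmax over discrete action logits.
        logits = h @ self.w2 + self.b2
        e = np.exp(logits - logits.max())
        return e / e.sum()

# Example rollout step with assumed dimensions.
policy = PolicyMLP(latent_dim=32, num_actions=4)
action_probs = policy(np.zeros(32))
```

The evaluation protocol quoted above (50 rollouts per seed, 3 seeds, at 5e6 environment steps) would simply call a policy like this repeatedly inside each rollout loop.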