Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Information Prioritization through Empowerment in Visual Model-based RL

Authors: Homanga Bharadhwaj, Mohammad Babaeizadeh, Dumitru Erhan, Sergey Levine

ICLR 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate the approach on a suite of vision-based robot control tasks with natural video backgrounds, and show that the proposed prioritized information objective outperforms state-of-the-art model based RL approaches with higher sample efficiency and episodic returns.
Researcher Affiliation	Collaboration	Homanga Bharadhwaj (Carnegie Mellon University); Mohammad Babaeizadeh (Google Research, Brain Team); Dumitru Erhan (Google Research, Brain Team); Sergey Levine (Google Research, Brain Team; University of California Berkeley)
Pseudocode	Yes	Algorithm 1: Information Prioritization in Visual Model-based RL (InfoPower)
Open Source Code	No	Please refer to the website for a summary and qualitative visualization results https://sites.google.com/view/information-empowerment. The linked website states 'Code Coming Soon', indicating the code is not yet available.
Open Datasets	Yes	We perform experiments with modified DeepMind Control Suite environments (Tassa et al., 2018), with natural video distractors from ILSVRC dataset (Russakovsky et al., 2015) in the background.
Dataset Splits	No	The paper mentions 'We use 200 videos during training, and reserve 50 videos for testing' for the ILSVRC dataset, but it does not specify a validation dataset split.
Hardware Specification	Yes	We implement our approach with TensorFlow 2 and use a single Nvidia V100 GPU and 10 CPU cores for each training run.
Software Dependencies	No	The paper mentions 'TensorFlow 2' but does not provide specific version numbers for TensorFlow or any other software libraries used.
Experiment Setup	Yes	We use ADAM optimizer, with learning rate of 6e-4 for the latent-state space model, and 8e-5 for the value function and policy optimization. The hyper-parameter c0 for the prioritized information constraint is set to 1000... The encoder consists of 4 convolutional layers with kernel size 4 and channel numbers 32, 65, 128, 256.
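
The values quoted in the Experiment Setup and Dataset Splits rows can be gathered into a single configuration object for anyone attempting a re-implementation. The following is a minimal sketch, not the authors' code: the class name, field names, and the helper `encoder_layer_shapes` are hypothetical, and the 3-channel (RGB) input is an assumption that the table does not state. The channel list is copied exactly as quoted in the table.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ReportedSetup:
    """Hyperparameters as quoted in the table (field names are hypothetical)."""
    optimizer: str = "adam"
    model_lr: float = 6e-4            # latent-state space model
    actor_critic_lr: float = 8e-5     # value function and policy optimization
    c0: float = 1000.0                # prioritized information constraint
    encoder_kernel_size: int = 4
    encoder_channels: Tuple[int, ...] = (32, 65, 128, 256)  # copied as quoted
    train_videos: int = 200           # ILSVRC distractor videos for training
    test_videos: int = 50             # ILSVRC distractor videos held out


def encoder_layer_shapes(
    cfg: ReportedSetup, in_channels: int = 3  # assumption: RGB observations
) -> List[Tuple[int, int, int]]:
    """Return (in_channels, out_channels, kernel_size) for each conv layer."""
    shapes = []
    prev = in_channels
    for out in cfg.encoder_channels:
        shapes.append((prev, out, cfg.encoder_kernel_size))
        prev = out
    return shapes


if __name__ == "__main__":
    cfg = ReportedSetup()
    for layer in encoder_layer_shapes(cfg):
        print(layer)
```

Details the table does not report, such as stride, padding, activation functions, and input resolution, are deliberately left out and would have to come from the paper itself.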