Information Prioritization through Empowerment in Visual Model-based RL

Authors: Homanga Bharadhwaj, Mohammad Babaeizadeh, Dumitru Erhan, Sergey Levine

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the approach on a suite of vision-based robot control tasks with natural video backgrounds, and show that the proposed prioritized information objective outperforms state-of-the-art model-based RL approaches with higher sample efficiency and episodic returns.
Researcher Affiliation | Collaboration | Homanga Bharadhwaj (Carnegie Mellon University); Mohammad Babaeizadeh (Google Research, Brain Team); Dumitru Erhan (Google Research, Brain Team); Sergey Levine (Google Research, Brain Team; University of California, Berkeley)
Pseudocode | Yes | Algorithm 1: Information Prioritization in Visual Model-based RL (InfoPower)
Open Source Code | No | Please refer to the website for a summary and qualitative visualization results https://sites.google.com/view/information-empowerment. The linked website states 'Code Coming Soon', indicating the code is not yet available.
Open Datasets | Yes | We perform experiments with modified DeepMind Control Suite environments (Tassa et al., 2018), with natural video distractors from the ILSVRC dataset (Russakovsky et al., 2015) in the background.
Dataset Splits | No | The paper mentions 'We use 200 videos during training, and reserve 50 videos for testing' for the ILSVRC dataset, but it does not specify a validation split.
Hardware Specification | Yes | We implement our approach with TensorFlow 2 and use a single Nvidia V100 GPU and 10 CPU cores for each training run.
Software Dependencies | No | The paper mentions 'TensorFlow 2' but does not provide specific version numbers for TensorFlow or any other software libraries used.
Experiment Setup | Yes | We use ADAM optimizer, with learning rate of 6e-4 for the latent-state space model, and 8e-5 for the value function and policy optimization. The hyper-parameter c0 for the prioritized information constraint is set to 1000... The encoder consists of 4 convolutional layers with kernel size 4 and channel numbers 32, 64, 128, 256.
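
The Experiment Setup entry above gives learning rates, the constraint hyper-parameter c0, and the encoder's layer count, kernel size, and final channel width, but not the input resolution, stride, or padding. The following is a minimal sketch, not the authors' code, that collects the reported values in one place and traces the spatial size of the feature map through the four convolutions. The names `TrainConfig`, `conv_out`, and `encoder_spatial_sizes` are hypothetical, and the 64x64 input, stride of 2, and unpadded convolutions are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    # Values reported in the Experiment Setup entry.
    model_lr: float = 6e-4          # latent state-space model
    actor_critic_lr: float = 8e-5   # value function and policy
    c0: float = 1000.0              # prioritized information constraint
    num_conv_layers: int = 4
    kernel_size: int = 4
    final_channels: int = 256
    # ASSUMPTIONS: not stated in the source, chosen for illustration only.
    image_size: int = 64
    stride: int = 2


def conv_out(size: int, kernel: int, stride: int) -> int:
    """Output length along one axis of an unpadded ('valid') convolution."""
    return (size - kernel) // stride + 1


def encoder_spatial_sizes(cfg: TrainConfig) -> list:
    """Side length of the square feature map before and after each layer."""
    sizes = [cfg.image_size]
    for _ in range(cfg.num_conv_layers):
        sizes.append(conv_out(sizes[-1], cfg.kernel_size, cfg.stride))
    return sizes


cfg = TrainConfig()
sizes = encoder_spatial_sizes(cfg)
# Size of the flattened embedding fed to the latent model under these assumptions.
flat_features = sizes[-1] ** 2 * cfg.final_channels
print(sizes, flat_features)
```

Under these assumed settings the feature map shrinks from 64 to 2 pixels per side, so the flattened embedding has 2 x 2 x 256 = 1024 entries. A different resolution, stride, or padding choice would change that number, which is why such details matter when reproducing the setup.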