Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning

Authors: Shakir Mohamed, Danilo Jimenez Rezende

NeurIPS 2015

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We demonstrate that empowerment-based behaviours obtained using variational information maximisation match those using the exact computation. We then apply our algorithms to a broad range of high-dimensional problems for which it is not possible to compute the exact solution, but for which we are able to act according to empowerment learning directly from pixel information." |
| Researcher Affiliation | Industry | Google DeepMind, London ({shakir, danilor}@google.com) |
| Pseudocode | Yes | Algorithm 1: Stochastic Variational Information Maximisation for Empowerment |
| Open Source Code | No | The paper provides links to YouTube videos demonstrating results, but no statement of, or link to, source code for the method itself. |
| Open Datasets | No | The paper describes environments such as a "room environment" and a "maze environment", uses "pixel information (on 20x20 images)", and references a "3D physics simulation [29]", but does not provide access information (link, citation, or repository) for any publicly available dataset used for training. |
| Dataset Splits | No | The paper does not mention training, validation, or test splits, percentages, or sample counts. |
| Hardware Specification | No | The paper mentions using "GPUs", but gives no specific hardware details such as GPU models, CPU types, or memory. |
| Software Dependencies | No | The paper mentions a "convolutional neural network" and "Adagrad", but provides no version numbers for any software dependencies or libraries. |
| Experiment Setup | Yes | "For all these experiments we used a horizon of K = 5. ... The agent may have other actions, such as picking up a key or laying down a brick. There are no external rewards available and the agent must reason purely using visual (pixel) information. ... The state is the position, velocity and angular momentum of the agent and the predator, and the action is a 2D force vector." |
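The core quantity behind Algorithm 1 is a variational lower bound on the mutual information between an action sequence and the resulting future state (empowerment). The sketch below is a minimal illustration of the underlying Barber-Agakov bound on a toy discrete channel, not the authors' implementation: the world, transition matrix, and action distribution are all made up for the example. It shows that with the true posterior the bound matches the exact mutual information, while a misspecified variational posterior gives a strictly looser value.

```python
import numpy as np

# Toy discrete channel: 4 actions, 4 next-states (illustrative, not from the paper).
n_actions, n_states = 4, 4

# p(s' | a): mostly deterministic transitions with some noise.
P = np.full((n_actions, n_states), 0.05)
for a in range(n_actions):
    P[a, a] = 0.85
P /= P.sum(axis=1, keepdims=True)

# Exploration distribution over actions, omega(a).
omega = np.full(n_actions, 1.0 / n_actions)

# Exact mutual information I(a; s') under omega.
p_s = omega @ P                               # marginal p(s')
joint = omega[:, None] * P                    # joint p(a, s')
exact_mi = np.sum(joint * np.log(P / p_s[None, :]))

# Barber-Agakov variational lower bound:
#   I(a; s') >= E_{p(a,s')}[ log q(a | s') - log omega(a) ]
# The bound is tight when q is the true posterior p(a | s').
q_opt = joint / joint.sum(axis=0, keepdims=True)
tight_bound = np.sum(joint * (np.log(q_opt) - np.log(omega[:, None])))

# A misspecified q (here: uniform over actions) gives a strictly lower value.
q_bad = np.full((n_actions, n_states), 1.0 / n_actions)
loose_bound = np.sum(joint * (np.log(q_bad) - np.log(omega[:, None])))

print(exact_mi, tight_bound, loose_bound)
```

In the paper this bound is maximised stochastically: the variational posterior (a "planning" network) and the exploration distribution are both parameterised and trained by gradient ascent on the bound, which is what makes the pixel-based experiments tractable where exact computation is not.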