Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning
Authors: Shakir Mohamed, Danilo Jimenez Rezende
NeurIPS 2015 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that empowerment-based behaviours obtained using variational information maximisation match those using the exact computation. We then apply our algorithms to a broad range of high-dimensional problems for which it is not possible to compute the exact solution, but for which we are able to act according to empowerment learning directly from pixel information. |
| Researcher Affiliation | Industry | Google DeepMind, London {shakir, danilor}@google.com |
| Pseudocode | Yes | Algorithm 1: Stochastic Variational Information Maximisation for Empowerment (see the sketch after this table) |
| Open Source Code | No | The paper links to YouTube videos demonstrating results, but gives no explicit statement about, or link to, source code for the method itself. |
| Open Datasets | No | The paper describes environments such as a "room environment" and a "maze environment", uses "pixel information (on 20x20 images)", and references a "3D physics simulation [29]", but it does not provide access information (link, citation, or repository) for any publicly available or open dataset used for training. |
| Dataset Splits | No | The paper does not explicitly mention training, validation, or test dataset splits, percentages, or sample counts. |
| Hardware Specification | No | The paper mentions using "GPUs" for computation, but does not provide specific hardware details such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions using a "convolutional neural network" and "Adagrad", but does not name specific software libraries or version numbers for any dependencies. |
| Experiment Setup | Yes | For all these experiments we used a horizon of K = 5. ... The agent may have other actions, such as picking up a key or laying down a brick. There are no external rewards available and the agent must reason purely using visual (pixel) information. ... The state is the position, velocity and angular momentum of the agent and the predator, and the action is a 2D force vector. |
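The Algorithm 1 row above refers to the paper's stochastic variational information maximisation procedure, which performs gradient ascent on the Barber-Agakov lower bound on empowerment, I(a; s' | s) >= H_omega(a | s) + E[log q(a | s', s)], jointly over a source distribution omega(a | s) and a variational decoder q(a | s', s). The snippet below is a minimal sketch of that bound, not the authors' implementation: it assumes a hypothetical tabular environment with deterministic single-step transitions and parameterises omega and q as logit tables, in place of the K = 5 action sequences and convolutional networks used in the paper.

```python
# Minimal sketch (not the authors' code) of the variational empowerment bound:
# maximise H_omega(a|s) + E[ log q(a|s',s) ] over omega and q by gradient ascent.
import torch
import torch.nn as nn

n_states, n_actions = 16, 4

# Hypothetical deterministic transition table: next_state[s, a] = s'.
next_state = torch.randint(n_states, (n_states, n_actions))

# Source distribution omega(a|s) and variational decoder q(a|s',s),
# parameterised as simple logit tables (stand-ins for the paper's networks).
omega_logits = nn.Parameter(torch.zeros(n_states, n_actions))
q_logits = nn.Parameter(torch.zeros(n_states, n_states, n_actions))  # [s, s', a]

# The paper reports using Adagrad; the learning rate here is arbitrary.
opt = torch.optim.Adagrad([omega_logits, q_logits], lr=0.1)

for step in range(500):
    opt.zero_grad()
    log_omega = torch.log_softmax(omega_logits, dim=-1)   # log omega(a|s)
    log_q = torch.log_softmax(q_logits, dim=-1)           # log q(a|s',s)

    # Entropy term H_omega(a|s), computed per state.
    entropy = -(log_omega.exp() * log_omega).sum(dim=-1)

    # log q(a | s'(s, a), s) for every (s, a), via the deterministic transitions.
    s_idx = torch.arange(n_states).unsqueeze(-1).expand(-1, n_actions)
    log_q_reached = log_q[s_idx, next_state, torch.arange(n_actions)]

    # Barber-Agakov lower bound on I(a; s' | s), one value per state.
    bound = entropy + (log_omega.exp() * log_q_reached).sum(dim=-1)

    loss = -bound.mean()   # ascend the bound for all states at once
    loss.backward()
    opt.step()

# After optimisation, `bound` holds a per-state lower-bound estimate of
# single-step empowerment under this toy model.
```

Because the toy action space is small, the expectation over actions is enumerated exactly and the objective is fully differentiable; the paper instead uses stochastic (sampled) gradients with function approximators so that the same bound scales to K-step action sequences and pixel observations.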