VIME: Variational Information Maximizing Exploration
Authors: Rein Houthooft, Xi Chen, Yan Duan, John Schulman, Filip De Turck, Pieter Abbeel
NeurIPS 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that VIME achieves significantly better performance compared to heuristic exploration methods across a variety of continuous control tasks and algorithms, including tasks with very sparse rewards. |
| Researcher Affiliation | Collaboration | UC Berkeley, Department of Electrical Engineering and Computer Sciences; Ghent University - imec, Department of Information Technology; OpenAI |
| Pseudocode | Yes | Algorithm 1: Variational Information Maximizing Exploration (VIME) |
| Open Source Code | No | The paper does not provide any statement or link indicating that its source code is publicly available. |
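| Open Datasets | Yes | All experiments make use of the rllab [15] benchmark code base and the complementary continuous control tasks suite. The following tasks are part of the experimental setup: Cart Pole ($\mathcal{S} \subseteq \mathbb{R}^{4}$, $\mathcal{A} \subseteq \mathbb{R}^{1}$), Cart Pole Swingup ($\mathcal{S} \subseteq \mathbb{R}^{4}$, $\mathcal{A} \subseteq \mathbb{R}^{1}$), Double Pendulum ($\mathcal{S} \subseteq \mathbb{R}^{6}$, $\mathcal{A} \subseteq \mathbb{R}^{1}$), Mountain Car ($\mathcal{S} \subseteq \mathbb{R}^{3}$, $\mathcal{A} \subseteq \mathbb{R}^{1}$), locomotion tasks Half Cheetah ($\mathcal{S} \subseteq \mathbb{R}^{20}$, $\mathcal{A} \subseteq \mathbb{R}^{6}$), Walker2D ($\mathcal{S} \subseteq \mathbb{R}^{20}$, $\mathcal{A} \subseteq \mathbb{R}^{6}$), and the hierarchical task Swimmer Gather ($\mathcal{S} \subseteq \mathbb{R}^{33}$, $\mathcal{A} \subseteq \mathbb{R}^{2}$). |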
| Dataset Splits | No | The paper does not specify traditional training/validation/test dataset splits. It describes continuous interaction with environments, not static datasets split into these partitions. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using the "rllab benchmark code base" but does not specify any software versions for libraries, frameworks, or programming languages. |
| Experiment Setup | No | The paper states that "The exact setup is described in the Appendix," but the appendix is not part of the main text. It discusses the hyperparameter η but does not give the specific values used for the main experimental results. |
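
For anyone scripting a reproduction, the task list in the Open Datasets row can be written as a small lookup table. The sketch below only restates the state and action dimensions quoted there. The dictionary name, the helper function, and the key spellings are hypothetical and are not rllab identifiers.

```python
# Hypothetical lookup table: task name -> (state dimension, action dimension).
# The numbers are copied from the Open Datasets row; the names are illustrative
# and do not correspond to rllab class or environment identifiers.
TASK_DIMS = {
    "Cart Pole": (4, 1),
    "Cart Pole Swingup": (4, 1),
    "Double Pendulum": (6, 1),
    "Mountain Car": (3, 1),
    "Half Cheetah": (20, 6),
    "Walker2D": (20, 6),
    "Swimmer Gather": (33, 2),
}


def describe(task: str) -> str:
    """Return a one-line summary of a task's state and action dimensions."""
    state_dim, action_dim = TASK_DIMS[task]
    return f"{task}: state dim {state_dim}, action dim {action_dim}"


if __name__ == "__main__":
    for name in TASK_DIMS:
        print(describe(name))
```

A table like this is mainly useful as a sanity check that an environment built in a reproduction exposes the observation and action sizes the paper reports.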