Explicit Explore-Exploit Algorithms in Continuous State Spaces

Authors: Mikael Henaff

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We then present a practical version of the algorithm using neural networks, and demonstrate its performance and sample efficiency empirically on several problems with large or continuous state spaces. (Abstract & Introduction) We now give empirical results for the Neural-E3 algorithm described in Section 4. (Section 6)
Researcher Affiliation | Industry | Mikael Henaff, Microsoft Research, mihenaff@microsoft.com
Pseudocode | Yes | Algorithm 1 (M, Π, n, ϵ, φ) and Algorithm 2 Update Model Set(Mt, R, φ)
Open Source Code | Yes | See Appendix C for experimental details and https://github.com/mbhenaff/neural-e3 for source code.
Open Datasets | Yes | We begin with a set of experiments on the stochastic combination lock environment described in [14]. We next evaluated our approach on a maze environment, which is a modified version of the Collect domain [35]. Mountain Car [32], Acrobot [50]. For the Mountain Car and Acrobot experiments, we used the OpenAI Gym [6] implementations.
Dataset Splits | No | The paper does not provide specific training/validation/test dataset splits. It describes running experiments in simulation environments over a number of episodes, rather than using fixed datasets with explicit splits.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types) used for running its experiments.
Software Dependencies | No | The paper mentions 'Open AI Baselines implementation [13]' and 'Adam [25]' but does not provide specific version numbers for these or any other software dependencies.
Experiment Setup | Yes | We used Adam [25] for all neural network training with a learning rate of 1e-4 and a batch size of 32. (Appendix C)
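
To make the reported experiment setup concrete, the following is a minimal, dependency-free sketch of minibatch training with Adam at the stated learning rate (1e-4) and batch size (32). It is not the authors' code: the toy linear model, the helper names (adam_step, minibatch_grads, train), and the Adam moment coefficients and epsilon (the commonly used defaults) are assumptions made for illustration, since the summary above does not report them.

```python
import math
import random

LEARNING_RATE = 1e-4  # reported in the summary above (Appendix C)
BATCH_SIZE = 32       # reported in the summary above (Appendix C)
# Assumed, not reported in the source: commonly used Adam defaults.
BETA1, BETA2, EPS = 0.9, 0.999, 1e-8


def adam_step(params, grads, m, v, t):
    """Apply one Adam update in place and return the new step count."""
    t += 1
    for i, g in enumerate(grads):
        m[i] = BETA1 * m[i] + (1 - BETA1) * g        # first-moment estimate
        v[i] = BETA2 * v[i] + (1 - BETA2) * g * g    # second-moment estimate
        m_hat = m[i] / (1 - BETA1 ** t)              # bias correction
        v_hat = v[i] / (1 - BETA2 ** t)
        params[i] -= LEARNING_RATE * m_hat / (math.sqrt(v_hat) + EPS)
    return t


def minibatch_grads(params, batch):
    """Gradient of mean squared error for the toy model y = w * x + b."""
    w, b = params
    gw = gb = 0.0
    for x, y in batch:
        err = (w * x + b) - y
        gw += 2 * err * x
        gb += 2 * err
    n = len(batch)
    return [gw / n, gb / n]


def train(data, steps, seed=0):
    """Run `steps` Adam updates on minibatches sampled without replacement."""
    rng = random.Random(seed)
    params = [0.0, 0.0]  # hypothetical toy parameters [w, b]
    m, v, t = [0.0, 0.0], [0.0, 0.0], 0
    for _ in range(steps):
        batch = rng.sample(data, BATCH_SIZE)
        t = adam_step(params, minibatch_grads(params, batch), m, v, t)
    return params
```

For example, train([(x / 100, 3 * x / 100 + 1) for x in range(100)], steps=1) moves each parameter by roughly one learning rate, because the bias-corrected first Adam step has magnitude close to the learning rate regardless of gradient scale.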