Explicit Explore-Exploit Algorithms in Continuous State Spaces
Authors: Mikael Henaff
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We then present a practical version of the algorithm using neural networks, and demonstrate its performance and sample efficiency empirically on several problems with large or continuous state spaces." (Abstract & Introduction); "We now give empirical results for the Neural-E3 algorithm described in Section 4." (Section 6) |
| Researcher Affiliation | Industry | Mikael Henaff Microsoft Research mihenaff@microsoft.com |
| Pseudocode | Yes | Algorithm 1 (M, Π, n, ϵ, φ) and Algorithm 2 Update Model Set(Mt, R, φ) |
| Open Source Code | Yes | See Appendix C for experimental details and https://github.com/mbhenaff/neural-e3 for source code. |
| Open Datasets | Yes | "We begin with a set of experiments on the stochastic combination lock environment described in [14]"; "We next evaluated our approach on a maze environment, which is a modified version of the Collect domain [35]"; Mountain Car [32] and Acrobot [50]: "For the Mountain Car and Acrobot experiments, we used the OpenAI Gym [6] implementations." |
| Dataset Splits | No | The paper does not provide specific training/validation/test dataset splits. It describes running experiments in simulation environments over a number of episodes, rather than using fixed datasets with explicit splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types) used for running its experiments. |
| Software Dependencies | No | The paper mentions the "Open AI Baselines implementation [13]" and "Adam [25]" but does not provide version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | We used Adam [25] for all neural network training with a learning rate of 1e-4 and a batch size of 32. (Appendix C) |
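The reported optimizer settings (Adam, learning rate 1e-4, batch size 32) are concrete enough to illustrate. Below is a minimal sketch of a single Adam update using those hyperparameters, written in plain NumPy; the linear model, the random minibatch, and the Adam default constants (beta1, beta2, eps) are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

# Hyperparameters reported in Appendix C of the paper.
lr = 1e-4
batch_size = 32
# Adam default constants from Kingma & Ba [25] (assumed, not stated in the paper).
beta1, beta2, eps = 0.9, 0.999, 1e-8

rng = np.random.default_rng(0)

# Hypothetical linear model standing in for the paper's neural networks.
w = np.zeros(4)
m = np.zeros_like(w)  # first-moment (mean) estimate
v = np.zeros_like(w)  # second-moment (uncentered variance) estimate

# One illustrative Adam step on a random minibatch of the reported size.
X = rng.normal(size=(batch_size, 4))
y = rng.normal(size=batch_size)
grad = 2 * X.T @ (X @ w - y) / batch_size  # gradient of mean squared error

t = 1  # step counter, used for bias correction
m = beta1 * m + (1 - beta1) * grad
v = beta2 * v + (1 - beta2) * grad**2
m_hat = m / (1 - beta1**t)
v_hat = v / (1 - beta2**t)
w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
```

At the first step the bias-corrected moments reduce to the raw gradient and its square, so each parameter moves by almost exactly the learning rate in magnitude, which makes the role of lr = 1e-4 easy to see.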