Active Exploration for Learning Symbolic Representations
Authors: Garrett Andersen, George Konidaris
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that our algorithm outperforms random and greedy exploration policies on two different computer game domains. The first domain is an Asteroids-inspired game with complex dynamics but basic logical structure. The second is the Treasure Game, with simpler dynamics but more complex logical structure. [...] Simulation results for the Asteroids domain. Each bar represents the average of 100 runs. The error bars represent a 99% confidence interval for the mean. (a), (b), (c): The fraction of time that the agent spends on asteroids 1, 3, and 4, respectively. The greedy and random exploration policies spend significantly more time than our algorithm exploring asteroid 1 and significantly less time exploring asteroids 3 and 4. (d): The number of symbolic transitions that the agent has not observed (out of 115 possible). The greedy and random policies require 2-3 times as many option executions to match the performance of our algorithm. (Figure 3) |
| Researcher Affiliation | Collaboration | Garrett Andersen, PROWLER.io, Cambridge, United Kingdom (garrett@prowler.io); George Konidaris, Department of Computer Science, Brown University (gdk@cs.brown.edu) |
| Pseudocode | Yes | Algorithm 1 Fast Construction of a Distribution over Symbolic Option Models [...] Algorithm 2 Optimal Exploration |
| Open Source Code | No | The paper does not provide any information or links regarding open-source code for the described methodology. |
| Open Datasets | No | The paper introduces two custom game domains (Asteroids and Treasure Game) for experiments but does not provide access information (link, citation, repository) for the data collected within these domains, nor does it refer to them as publicly available datasets. |
| Dataset Splits | No | The paper describes its experimental setup including running simulations for 100 runs, but it does not specify any training, validation, or test dataset splits in the traditional sense, as the data is collected dynamically during exploration in the game environments. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments (e.g., CPU, GPU models, memory, or cloud instances). |
| Software Dependencies | No | The paper mentions 'pybox2d' as a physics simulator but does not provide version numbers for this or any other software dependency. |
| Experiment Setup | Yes | The hyperparameter settings that we use for our algorithm are given in Appendix A. |
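
The Research Type row above quotes the paper's evaluation: exploration policies are compared by how many option executions they need before the number of unobserved symbolic transitions drops to zero. The paper does not release code, so the snippet below is only a minimal, hypothetical sketch of that kind of comparison on a made-up toy domain. The domain, the `random_policy` and `count_based_policy` baselines, and all numbers are illustrative assumptions; they are not the paper's Asteroids or Treasure Game domains, nor its Algorithm 2 ("Optimal Exploration").

```python
"""Hypothetical sketch (not the authors' code): compare exploration policies by
how many option executions they need to observe every symbolic transition of a
toy tabular domain. All details below are illustrative assumptions only."""
import random
from collections import defaultdict

# Toy domain: states 0..4, options 0..2; executing option o in state s moves
# deterministically to (s + o + 1) % 5. Each (state, option) pair counts as one
# "symbolic transition" the agent can observe (15 in total).
N_STATES, N_OPTIONS = 5, 3
ALL_TRANSITIONS = {(s, o) for s in range(N_STATES) for o in range(N_OPTIONS)}

def step(state, option):
    return (state + option + 1) % N_STATES

def run(policy, max_steps=500, seed=0):
    """Return the number of option executions needed to observe all transitions."""
    rng = random.Random(seed)
    seen, counts = set(), defaultdict(int)
    state = 0
    for t in range(1, max_steps + 1):
        option = policy(state, counts, rng)
        seen.add((state, option))
        counts[(state, option)] += 1
        state = step(state, option)
        if seen == ALL_TRANSITIONS:
            return t
    return max_steps

def random_policy(state, counts, rng):
    # Uniform-random baseline.
    return rng.randrange(N_OPTIONS)

def count_based_policy(state, counts, rng):
    # Prefer the locally least-executed option; a simple stand-in baseline,
    # defined differently from the paper's greedy policy.
    return min(range(N_OPTIONS), key=lambda o: counts[(state, o)])

if __name__ == "__main__":
    for name, policy in [("random", random_policy), ("count-based", count_based_policy)]:
        steps = [run(policy, seed=s) for s in range(100)]
        print(f"{name:12s} mean executions to cover all transitions: {sum(steps)/len(steps):.1f}")
```

Averaging over 100 runs mirrors the evaluation protocol quoted above (100 runs per bar in Figure 3); the metric printed here, executions until full coverage of the transition set, corresponds to the "unobserved symbolic transitions" curve, not to any result reported in the paper.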