reproducibilityindex.ai

Active Exploration for Learning Symbolic Representations

Authors: Garrett Andersen, George Konidaris

NeurIPS 2017

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We show that our algorithm outperforms random and greedy exploration policies on two different computer game domains. The first domain is an Asteroids-inspired game with complex dynamics but basic logical structure. The second is the Treasure Game, with simpler dynamics but more complex logical structure. [...] Simulation results for the Asteroids domain. Each bar represents the average of 100 runs. The error bars represent a 99% confidence interval for the mean. (a), (b), (c): The fraction of time that the agent spends on asteroids 1, 3, and 4, respectively. The greedy and random exploration policies spend significantly more time than our algorithm exploring asteroid 1 and significantly less time exploring asteroids 3 and 4. (d): The number of symbolic transitions that the agent has not observed (out of 115 possible). The greedy and random policies require 2-3 times as many option executions to match the performance of our algorithm. (Figure 3)
Researcher Affiliation	Collaboration	Garrett Andersen, PROWLER.io, Cambridge, United Kingdom, garrett@prowler.io; George Konidaris, Department of Computer Science, Brown University, gdk@cs.brown.edu
Pseudocode	Yes	Algorithm 1 Fast Construction of a Distribution over Symbolic Option Models [...] Algorithm 2 Optimal Exploration
Open Source Code	No	The paper does not provide any information or links regarding open-source code for the described methodology.
Open Datasets	No	The paper introduces two custom game domains (Asteroids and Treasure Game) for experiments but does not provide access information (link, citation, repository) for the data collected within these domains, nor does it refer to them as publicly available datasets.
Dataset Splits	No	The paper describes its experimental setup including running simulations for 100 runs, but it does not specify any training, validation, or test dataset splits in the traditional sense, as the data is collected dynamically during exploration in the game environments.
Hardware Specification	No	The paper does not provide any specific details about the hardware used to run the experiments (e.g., CPU, GPU models, memory, or cloud instances).
Software Dependencies	No	The paper mentions 'pybox2d' as a physics simulator but does not provide version numbers for this or any other software dependency.
Experiment Setup	Yes	The hyperparameter settings that we use for our algorithm are given in Appendix A.
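
The Research Type entry quotes a figure caption that reports each bar as the average of 100 runs with a 99% confidence interval for the mean. The excerpt does not say how those intervals were computed, so the following is only a minimal sketch of one common approach, assuming a normal approximation to the sampling distribution of the mean. The function name `mean_confidence_interval` and the synthetic `runs` data are hypothetical and are not taken from the paper.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(samples, confidence=0.99):
    """Return (mean, half_width) of a confidence interval for the mean.

    Assumption: a normal approximation is used, which is reasonable for
    about 100 runs. The paper may have used a different method.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    # Two-sided critical value, roughly 2.576 for a 99% interval.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    # Standard error of the mean from the sample standard deviation.
    half_width = z * stdev(samples) / math.sqrt(n)
    return mean(samples), half_width


# Synthetic placeholder data standing in for one metric over 100 runs.
# These numbers are NOT results from the paper.
rng = random.Random(0)
runs = [rng.gauss(0.25, 0.05) for _ in range(100)]

center, half_width = mean_confidence_interval(runs)
print(f"mean = {center:.3f} +/- {half_width:.3f} (99% CI)")
```

With small sample counts, a Student's t critical value would give a slightly wider and more appropriate interval; at 100 runs the difference from the normal approximation is small.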