Classical Planning with Simulators: Results on the Atari Video Games

Authors: Nir Lipovetzky, Miquel Ramirez, Hector Geffner

IJCAI 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The empirical results over 54 Atari games show that the simplest such algorithm performs at the level of UCT, the state-of-the-art planning method in this domain, and suggest the potential of width-based methods for planning with simulators when factored, compact action models are not available.
Researcher Affiliation | Academia | Nir Lipovetzky, The University of Melbourne, Melbourne, Australia, nir.lipovetzky@unimelb.edu.au; Miquel Ramirez, Australian National University, Canberra, Australia, miquel.ramirez@anu.edu.au; Hector Geffner, ICREA & U. Pompeu Fabra, Barcelona, Spain, hector.geffner@upf.edu
Pseudocode | No | The paper describes the Iterated Width (IW) algorithm and its variations in detail using text, but it does not include a formally labeled "Pseudocode" or "Algorithm" block.
Open Source Code | No | The paper does not contain any explicit statement about releasing its source code or provide a link to a code repository for the methodology described.
Open Datasets | Yes | We tested IW(1) and 2BFS over 54 of the 55 different games considered in [Bellemare et al., 2013], from now on abbreviated as BNVB. The Arcade Learning Environment (ALE) provides a challenging platform for evaluating general, domain-independent AI planners and learners through a convenient interface to hundreds of Atari 2600 games [Bellemare et al., 2013].
Dataset Splits | No | The paper describes the experimental setup in terms of budget and frames for lookahead search within the Atari games environment. However, it does not specify any training, validation, or test dataset splits in the traditional machine learning sense, as the evaluation is done directly on the game environments.
Hardware Specification | Yes | Experiments were run on a cluster, where each computing node consists of a 6-core Intel Xeon E5-2440, with 2.4 GHz clock speed, with 64 GBytes of RAM installed.
Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python, libraries, frameworks). It mentions using the Arcade Learning Environment, but without version details for the software stack.
Experiment Setup | Yes | For the experiments below, we added two simple variations to IW(1) and 2BFS. First, in the breadth-first search underlying IW(1), we generate the children in random order. Second, a discount factor γ = 0.995 is used in both algorithms for discounting future rewards like in UCT. Our experimental setup follows theirs except that a maximum budget of 150,000 simulated frames is applied to IW(1), 2BFS, and UCT. IW(1) and 2BFS are limited to search up to a depth of 1,500 frames and up to 150,000 frames per root branch.