Planning from Pixels in Atari with Learned Symbolic Representations

Authors: Andrea Dittadi, Frederik K. Drachmann, Thomas Bolander | Pages: 4941-4949

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We run a large-scale evaluation on Atari 2600 where we compare the performance of Rollout IW with our learned representations to Rollout IW with B-PROST features.
Researcher Affiliation | Academia | Andrea Dittadi,* Frederik K. Drachmann,* Thomas Bolander; Technical University of Denmark, Copenhagen, Denmark; adit@dtu.dk, fdrachmann@hotmail.dk, tobo@dtu.dk
Pseudocode | No | The paper describes algorithms but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code available at https://github.com/fred5577/VAE-IW
Open Datasets | Yes | The Arcade Learning Environment (ALE) (Bellemare et al. 2013) provides an interface to Atari 2600 video games, and has been widely used in recent years as a benchmark for reinforcement learning and planning algorithms.
Dataset Splits | Yes | To train the VAEs, we collected 15,000 frames by running Rollout IW(1) with B-PROST features, and split them into a training and validation set of 14,250 and 750 images.
Hardware Specification | No | No specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running experiments are provided in the paper.
Software Dependencies | No | The paper does not provide specific version numbers for software dependencies such as libraries or programming languages used in the implementation.
Experiment Setup | Yes | Following previous work (Lipovetzky and Geffner 2012; Bandres, Bonet, and Geffner 2018), we used a frame skip of 15 and a planning budget of 0.5s per time step. We set an additional limit of 15,000 executed actions for each run, to prevent runs from lasting too long (note that this constraint is only applied to our method). Based on the performance on a few selected domains, we chose two representative settings, with latent space size 15 × 15 × 20 and 4 × 4 × 200 (see the Appendix for further details). The ablation studies additionally vary β ∈ {10^-4, 10^-3}.
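
The settings and the dataset split quoted above can be collected into a small configuration sketch. The following Python snippet is illustrative only and is not taken from the authors' repository: the names `ExperimentSetup` and `split_frames` are hypothetical, and the random shuffle with a fixed seed is an assumption, since the quoted text gives only the split sizes and not how frames were assigned to each set.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentSetup:
    """Values quoted in the table above; field names are hypothetical."""
    frame_skip: int = 15
    planning_budget_s: float = 0.5          # seconds per time step
    max_actions_per_run: int = 15_000       # limit applied to the paper's method only
    latent_shapes: tuple = ((15, 15, 20), (4, 4, 200))
    betas: tuple = (1e-4, 1e-3)             # values varied in the ablation studies
    n_frames: int = 15_000                  # frames collected for VAE training
    n_validation: int = 750


def split_frames(frames, n_validation, seed=0):
    """Split frames into (train, validation).

    Assumption: a seeded random shuffle; the source does not say
    how the split was actually made.
    """
    indices = list(range(len(frames)))
    random.Random(seed).shuffle(indices)
    val_idx = indices[:n_validation]
    train_idx = indices[n_validation:]
    return [frames[i] for i in train_idx], [frames[i] for i in val_idx]


setup = ExperimentSetup()
frames = list(range(setup.n_frames))        # integer stand-ins for image frames
train, val = split_frames(frames, setup.n_validation)
print(len(train), len(val))
```

With the quoted sizes, the split leaves 14,250 training frames and 750 validation frames, which matches the Dataset Splits row.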