Planning from Pixels in Atari with Learned Symbolic Representations
Authors: Andrea Dittadi, Frederik K. Drachmann, Thomas Bolander
AAAI 2021, pp. 4941-4949
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We run a large-scale evaluation on Atari 2600 where we compare the performance of Rollout IW with our learned representations to Rollout IW with B-PROST features. |
| Researcher Affiliation | Academia | Andrea Dittadi,* Frederik K. Drachmann,* Thomas Bolander Technical University of Denmark, Copenhagen, Denmark adit@dtu.dk, fdrachmann@hotmail.dk, tobo@dtu.dk |
| Pseudocode | No | The paper describes algorithms but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code available at https://github.com/fred5577/VAE-IW |
| Open Datasets | Yes | The Arcade Learning Environment (ALE) (Bellemare et al. 2013) provides an interface to Atari 2600 video games, and has been widely used in recent years as a benchmark for reinforcement learning and planning algorithms. |
| Dataset Splits | Yes | To train the VAEs, we collected 15,000 frames by running Rollout IW(1) with B-PROST features, and split them into a training and validation set of 14,250 and 750 images. |
| Hardware Specification | No | No specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running experiments are provided in the paper. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies such as libraries or programming languages used in the implementation. |
| Experiment Setup | Yes | Following previous work (Lipovetzky and Geffner 2012; Bandres, Bonet, and Geffner 2018), we used a frame skip of 15 and a planning budget of 0.5s per time step. We set an additional limit of 15,000 executed actions for each run, to prevent runs from lasting too long (note that this constraint is only applied to our method). Based on the performance on a few selected domains, we chose two representative settings, with latent space size 15 × 15 × 20 and 4 × 4 × 200 (see the Appendix for further details). The ablation studies additionally vary β ∈ {10⁻⁴, 10⁻³}. |
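The setup values quoted in the table can be gathered into a small configuration sketch for reference. The class and field names below are hypothetical (the paper's repository defines its own structures); only the numeric values come from the quoted text.

```python
from dataclasses import dataclass


# Hypothetical config mirroring the experiment setup quoted above.
@dataclass(frozen=True)
class VAEIWConfig:
    frame_skip: int = 15                # frames skipped per executed action
    planning_budget_s: float = 0.5      # planning time budget per step (seconds)
    max_actions: int = 15_000           # cap on executed actions per run
    total_frames: int = 15_000          # frames collected with Rollout IW(1) + B-PROST
    train_frames: int = 14_250          # VAE training split
    val_frames: int = 750               # VAE validation split
    # Two representative latent space settings (height, width, channels):
    latent_shapes: tuple = ((15, 15, 20), (4, 4, 200))
    betas: tuple = (1e-4, 1e-3)         # β values varied in the ablation studies


cfg = VAEIWConfig()
# Sanity check: the train/validation split accounts for all collected frames.
assert cfg.train_frames + cfg.val_frames == cfg.total_frames
```

This is only a bookkeeping aid for the reported hyperparameters, not an implementation of the planner itself.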