Mastering Atari Games with Limited Data
Authors: Weirui Ye, Shaohuai Liu, Thanard Kurutach, Pieter Abbeel, Yang Gao
NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our method achieves 190.4% mean human performance and 116.0% median performance on the Atari 100k benchmark with only two hours of real-time game experience and outperforms the state SAC in some tasks on the DMControl 100k benchmark. (See the normalized-score sketch below the table.) |
| Researcher Affiliation | Academia | Weirui Ye, Shaohuai Liu, Thanard Kurutach, Pieter Abbeel, Yang Gao; Tsinghua University, UC Berkeley, Shanghai Qi Zhi Institute; {ywr20, liush20}@mails.tsinghua.edu.cn, gaoyangiiis@tsinghua.edu.cn, {thanard.kurutach, pabbeel}@berkeley.edu |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | We implement our algorithm in an easy-to-understand manner and it is available at https://github.com/YeWR/EfficientZero. |
| Open Datasets | Yes | More specifically, we use the Atari 100k benchmark. Intuitively, this benchmark asks the agent to learn to play Atari games within two hours of real-world game time. |
| Dataset Splits | Yes | We split this dataset into a training set and a validation set. |
| Hardware Specification | No | The paper does not provide specific details regarding the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for ancillary software dependencies used in the experiments. |
| Experiment Setup | Yes | The benchmark allows the agent to interact with each environment for 100 thousand environment steps, i.e. 400 thousand frames due to a frameskip of 4. (See the step-budget sketch below the table.) |
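
The headline numbers quoted in the Research Type row are human-normalized aggregates. As a point of reference, the sketch below shows how mean and median human-normalized scores are conventionally computed on Atari benchmarks; the per-game raw scores used here are illustrative placeholders, not values reported in the paper.

```python
import numpy as np

def human_normalized(agent_score, random_score, human_score):
    """Standard Atari human-normalized score: 0.0 = random play, 1.0 = human play."""
    return (agent_score - random_score) / (human_score - random_score)

# Illustrative (agent, random, human) raw scores for a few games -- placeholders,
# not the results reported in the paper.
scores = {
    "Breakout": (400.0, 1.7, 30.5),
    "Pong": (20.0, -20.7, 14.6),
    "Qbert": (14000.0, 163.9, 13455.0),
}

normalized = [human_normalized(*s) for s in scores.values()]
print(f"mean human-normalized score:   {np.mean(normalized):.1%}")
print(f"median human-normalized score: {np.median(normalized):.1%}")
```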
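
The step budget quoted in the Open Datasets and Experiment Setup rows can be checked with a few lines of arithmetic: 100 thousand agent steps at a frameskip of 4 correspond to 400 thousand emulator frames, which at the Atari emulator's 60 frames per second amounts to roughly two hours of real-time play. The 60 FPS rate is the standard ALE figure and is an assumption here, not a value quoted in the table.

```python
ENV_STEPS = 100_000  # agent-environment interactions allowed by Atari 100k
FRAMESKIP = 4        # each agent step repeats its action for 4 emulator frames
ATARI_FPS = 60       # standard ALE frame rate (assumption, not stated in the table)

frames = ENV_STEPS * FRAMESKIP     # 400,000 emulator frames
hours = frames / ATARI_FPS / 3600  # roughly 1.85 hours of real-time game play

print(f"{frames:,} frames  ->  {hours:.2f} hours of real-time experience")
```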