Diffusion for World Modeling: Visual Details Matter in Atari

Authors: Eloi Alonso, Adam Jelley, Vincent Micheli, Anssi Kanervisto, Amos J. Storkey, Tim Pearce, François Fleuret

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We analyze the key design choices that are required to make diffusion suitable for world modeling, and demonstrate how improved visual details can lead to improved agent performance. DIAMOND achieves a mean human normalized score of 1.46 on the competitive Atari 100k benchmark; a new best for agents trained entirely within a world model. We further demonstrate that DIAMOND's diffusion world model can stand alone as an interactive neural game engine by training on static Counter-Strike: Global Offensive gameplay.
Researcher Affiliation | Collaboration | Eloi Alonso (University of Geneva), Adam Jelley (University of Edinburgh), Vincent Micheli (University of Geneva), Anssi Kanervisto (Microsoft Research), Amos Storkey (University of Edinburgh), Tim Pearce (Microsoft Research), François Fleuret (University of Geneva)
Pseudocode | Yes | We summarize the overall training procedure of DIAMOND in Algorithm 1 below.
Open Source Code | Yes | To foster future research on diffusion for world modeling, we release our code, agents, videos and playable world models at https://diamond-wm.github.io.
Open Datasets | Yes | For comprehensive evaluation of DIAMOND we use the established Atari 100k benchmark (Kaiser et al., 2019), consisting of 26 games that test a wide range of agent capabilities. ... To further demonstrate the effectiveness of our world model in isolation, we train DIAMOND's diffusion world model on 87 hours of static Counter-Strike: Global Offensive (CSGO) gameplay (Pearce and Zhu, 2022) to produce an interactive neural game engine for the popular in-game map, Dust II.
Dataset Splits | No | The paper mentions training and testing splits for CS:GO and Motorway driving datasets (e.g.,
Hardware Specification | Yes | Each run utilized around 12GB of VRAM and took approximately 2.9 days on a single Nvidia RTX 4090 (1.03 GPU years in total).
Software Dependencies | No | The paper does not provide specific software versions for its ancillary software dependencies (e.g.,
Experiment Setup | Yes | We provide architecture details, hyperparameters, and RL objectives in Appendices D, E, F, respectively. ... Table 3: Hyperparameters for DIAMOND.
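
Two figures quoted above can be sanity-checked with simple arithmetic. The sketch below is illustrative only. It assumes the conventional Atari definition of human normalized score, (agent - random) / (human - random), and a 365-day year. The raw scores passed to `human_normalized_score` are hypothetical placeholders, not values from the paper, and both function names are made up for this example.

```python
def human_normalized_score(agent: float, random: float, human: float) -> float:
    """Conventional Atari normalization: 0 matches a random policy, 1 matches the human reference."""
    return (agent - random) / (human - random)


def implied_runs(total_gpu_years: float, days_per_run: float, days_per_year: float = 365.0) -> float:
    """Number of runs implied by a total GPU budget and a per-run duration."""
    return total_gpu_years * days_per_year / days_per_run


# Hypothetical raw scores for one game (illustration only, not from the paper).
hns = human_normalized_score(agent=600.0, random=100.0, human=500.0)

# Figures quoted in the Hardware Specification entry: 1.03 GPU years total, 2.9 days per run.
runs = implied_runs(total_gpu_years=1.03, days_per_run=2.9)

print(f"example human normalized score: {hns:.2f}")
print(f"implied number of runs: {runs:.1f}")
```

The implied run count comes out close to 130, which would be consistent with, for example, 5 runs for each of the 26 games. The number of runs per game is an inference here and is not stated in the text above.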