Deep Surrogate Assisted Generation of Environments

Authors: Varun Bhatt, Bryon Tjanaka, Matthew Fontaine, Stefanos Nikolaidis

NeurIPS 2022

Reproducibility assessment. Each entry below lists the variable, the extracted result, and the supporting LLM response.
Research Type: Experimental
LLM Response: "Results in two benchmark domains show that DSAGE significantly outperforms existing QD environment generation algorithms in discovering collections of environments that elicit diverse behaviors of a state-of-the-art RL agent and a planning agent. Our source code and videos are available at https://dsagepaper.github.io/"

Researcher Affiliation: Academia
LLM Response: "Varun Bhatt, University of Southern California, Los Angeles, CA (vsbhatt@usc.edu); Bryon Tjanaka, University of Southern California, Los Angeles, CA (tjanaka@usc.edu); Matthew C. Fontaine, University of Southern California, Los Angeles, CA (mfontain@usc.edu); Stefanos Nikolaidis, University of Southern California, Los Angeles, CA (nikolaid@usc.edu)"

Pseudocode: Yes
LLM Response: "Algorithm 1: Deep Surrogate Assisted Generation of Environments (DSAGE)" (see the sketch below)

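The paper's Algorithm 1 alternates between an inner loop that runs QD search against a cheap learned surrogate and an outer loop that evaluates the surrogate archive's elites with real agent simulations and retrains the surrogate on the growing dataset. Below is a minimal, self-contained Python sketch of that loop structure only; the Archive class, the linear surrogate, and the ground-truth stand-in are toy placeholders for illustration, not the authors' implementation.

```python
import numpy as np

class Archive:
    """Toy MAP-Elites grid archive over a single measure in [0, 1]."""
    def __init__(self, cells=20):
        self.cells = cells
        self.elites = {}  # cell index -> (solution, objective)

    def add(self, solution, objective, measure):
        idx = min(int(measure * self.cells), self.cells - 1)
        if idx not in self.elites or objective > self.elites[idx][1]:
            self.elites[idx] = (solution, objective)

def ground_truth(solution):
    # Stand-in for an expensive agent simulation.
    return -np.sum(solution ** 2), float(np.clip(np.mean(solution) + 0.5, 0, 1))

def dsage_sketch(n_outer=5, n_exploit=1000, dim=5):
    rng = np.random.default_rng(0)
    dataset, gt_archive = [], Archive()
    surrogate = np.zeros((dim + 1, 2))  # linear model: features -> (objective, measure)

    for _ in range(n_outer):
        # Inner loop: QD search driven entirely by the cheap surrogate.
        surrogate_archive = Archive()
        for _ in range(n_exploit):
            sol = rng.normal(size=dim)
            obj, meas = np.append(sol, 1.0) @ surrogate
            surrogate_archive.add(sol, obj, float(np.clip(meas, 0, 1)))

        # Outer loop: ground-truth evaluation of the surrogate elites.
        for sol, _ in surrogate_archive.elites.values():
            obj, meas = ground_truth(sol)
            gt_archive.add(sol, obj, meas)
            dataset.append((sol, obj, meas))

        # Retrain the surrogate on all data collected so far (least squares).
        X = np.array([np.append(s, 1.0) for s, _, _ in dataset])
        Y = np.array([[o, m] for _, o, m in dataset])
        surrogate, *_ = np.linalg.lstsq(X, Y, rcond=None)

    return gt_archive

archive = dsage_sketch()
print(f"ground-truth archive holds {len(archive.elites)} elites")
```
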
Open Source Code: Yes
LLM Response: "Our source code and videos are available at https://dsagepaper.github.io/"

Open Datasets: Yes
LLM Response: "We test our algorithms in two benchmark domains from prior work: a Maze domain [82, 3, 4] with a trained ACCEL agent [4] and a Mario domain [83, 16] with an A* agent [22]. [...] The Maze domain is based on the MiniGrid environment [82]. [...] The Mario domain is based on the Mario AI Framework [83, 94]."

Dataset Splits: No
LLM Response: "The paper does not specify explicit training/validation/test splits for the dynamically generated dataset 'D' used by the DSAGE algorithm, or for training the surrogate model during the main experimental runs. It mentions creating a 'combined dataset' for post-hoc evaluation of surrogate models, but not for the primary experimental setup."

Hardware Specification: Yes
LLM Response: "One of the GPUs used in the experiments was awarded by the NVIDIA Academic Hardware Grant. [...] All experiments were run on computers with Intel Core i9-9900K and NVIDIA GeForce RTX 2080 Ti GPUs."

Software Dependencies: Yes
LLM Response: "Our implementation is in Python 3.8 with PyTorch 1.10. We use pyribs [99] for QD optimization." (see the pyribs sketch below)

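For reference, a minimal MAP-Elites loop in pyribs looks like the sketch below. Note that this targets the current pyribs API (0.5 and later), in which the Optimizer class was renamed Scheduler and behavior values became measures; the pyribs version used in the paper predates that rename. The toy objective and measures are placeholders; DSAGE would instead query the surrogate model or run agent simulations at this step.

```python
import numpy as np
from ribs.archives import GridArchive
from ribs.emitters import GaussianEmitter
from ribs.schedulers import Scheduler

# 10-D solutions mapped into a 20x20 grid over a 2-D measure space.
archive = GridArchive(solution_dim=10, dims=[20, 20],
                      ranges=[(-1.0, 1.0), (-1.0, 1.0)])
# batch_size=20 mirrors the MAP-Elites batch size quoted in the
# Experiment Setup entry below.
emitters = [GaussianEmitter(archive, sigma=0.1, x0=np.zeros(10), batch_size=20)]
scheduler = Scheduler(archive, emitters)

for _ in range(1000):
    solutions = scheduler.ask()
    # Placeholder evaluation: negative sphere objective, with the first
    # two solution coordinates serving as measures.
    objective = -np.sum(solutions ** 2, axis=1)
    measures = solutions[:, :2]
    scheduler.tell(objective, measures)

print(archive.stats)
```

The QD batch sizes quoted below (20 for MAP-Elites, 10 for CMA-ME) would correspond to the emitter's batch_size parameter in this setup.
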
Experiment Setup: Yes
LLM Response: "We train the CNN surrogate models for 100 epochs using the Adam optimizer [100] with a learning rate of 0.001. The Adam optimizer's hyperparameters are set to β1 = 0.9, β2 = 0.999, ε = 10^-8, and weight decay is set to 0.0. We use a batch size of 32 for the surrogate model. [...] For MAP-Elites, we use a batch size of 20, and for CMA-ME, we use a batch size of 10. [...] the inner loop for DSAGE is run for N_exploit = 1000 iterations." (see the training sketch below)
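The quoted optimizer settings map one-to-one onto a standard PyTorch training loop. A minimal sketch, assuming a placeholder CNN and random stand-in data; the paper's actual surrogate architecture and dynamically collected dataset are not reproduced here.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Placeholder network; the paper's CNN surrogate architecture is not
# reproduced here, only the reported training configuration.
model = nn.Sequential(
    nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.Flatten(),
    nn.LazyLinear(1),
)

# Reported settings: Adam, lr=0.001, betas=(0.9, 0.999), eps=1e-8,
# weight decay 0.0, batch size 32, 100 epochs.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3,
                             betas=(0.9, 0.999), eps=1e-8, weight_decay=0.0)
loss_fn = nn.MSELoss()

# Random stand-in data in place of the dynamically generated dataset 'D'.
xs = torch.randn(256, 1, 16, 16)
ys = torch.randn(256, 1)
loader = DataLoader(TensorDataset(xs, ys), batch_size=32, shuffle=True)

for epoch in range(100):
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()
```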