Deep Surrogate Assisted Generation of Environments
Authors: Varun Bhatt, Bryon Tjanaka, Matthew C. Fontaine, Stefanos Nikolaidis
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Results in two benchmark domains show that DSAGE significantly outperforms existing QD environment generation algorithms in discovering collections of environments that elicit diverse behaviors of a state-of-the-art RL agent and a planning agent. Our source code and videos are available at https://dsagepaper.github.io/ |
| Researcher Affiliation | Academia | Varun Bhatt, University of Southern California, Los Angeles, CA, vsbhatt@usc.edu; Bryon Tjanaka, University of Southern California, Los Angeles, CA, tjanaka@usc.edu; Matthew C. Fontaine, University of Southern California, Los Angeles, CA, mfontain@usc.edu; Stefanos Nikolaidis, University of Southern California, Los Angeles, CA, nikolaid@usc.edu |
| Pseudocode | Yes | Algorithm 1: Deep Surrogate Assisted Generation of Environments (DSAGE) (a hedged Python sketch of this loop appears after the table) |
| Open Source Code | Yes | Our source code and videos are available at https://dsagepaper.github.io/ |
| Open Datasets | Yes | We test our algorithms in two benchmark domains from prior work: a Maze domain [82, 3, 4] with a trained ACCEL agent [4] and a Mario domain [83, 16] with an A* agent [22]. [...] The Maze domain is based on the MiniGrid environment [82]. [...] The Mario domain is based on the Mario AI Framework [83, 94]. |
| Dataset Splits | No | The paper does not specify explicit training/validation/test dataset splits for the dynamically generated dataset 'D' used by the DSAGE algorithm or for the training of the surrogate model during the main experimental runs. It mentions creating a 'combined dataset' for post-hoc evaluation of surrogate models, but not for the primary experimental setup. |
| Hardware Specification | Yes | One of the GPUs used in the experiments was awarded by the NVIDIA Academic Hardware Grant. [...] All experiments were run on computers with Intel Core i9-9900K and NVIDIA GeForce RTX 2080 Ti GPUs. |
| Software Dependencies | Yes | Our implementation is in Python 3.8 with PyTorch 1.10. We use pyribs [99] for QD optimization. (An illustrative pyribs loop appears after the table.) |
| Experiment Setup | Yes | We train the CNN surrogate models for 100 epochs using the Adam optimizer [100] with a learning rate of 0.001. The Adam optimizer's hyperparameters are set to β1 = 0.9, β2 = 0.999, ε = 10−8, and weight decay is set to 0.0. We use a batch size of 32 for the surrogate model. [...] For MAP-Elites, we use a batch size of 20, and for CMA-ME, we use a batch size of 10. [...] the inner loop for DSAGE is run for Nexploit = 1000 iterations. (A minimal PyTorch sketch of this optimizer setup appears after the table.) |
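
The Pseudocode row above points to Algorithm 1 of the paper. As a reading aid, the following is a minimal Python sketch of the outer/inner loop that algorithm describes: exploit a cheap surrogate model with a QD optimizer, evaluate the resulting candidate environments with the real agent, and retrain the surrogate on the collected data. All callables (`init_surrogate`, `train_surrogate`, `qd_exploit`, `evaluate_with_agent`, `discretize`) are hypothetical placeholders, not the authors' implementation; their released code is at https://dsagepaper.github.io/.

```python
def dsage(init_surrogate, train_surrogate, qd_exploit, evaluate_with_agent,
          discretize, n_outer_iters, n_exploit_iters=1000):
    """Sketch of the DSAGE loop (Algorithm 1); all callables are placeholders.

    init_surrogate()              -> initial surrogate model
    train_surrogate(model, data)  -> surrogate retrained on ground-truth data
    qd_exploit(model, iters)      -> elite environments from a surrogate archive
                                     built by a QD optimizer (MAP-Elites / CMA-ME)
    evaluate_with_agent(env)      -> (objective, measures) from a real agent rollout
    discretize(measures)          -> archive cell index for the given measures
    """
    dataset = []    # ground-truth (environment, objective, measures) tuples
    archive = {}    # ground-truth archive: cell index -> (objective, environment)
    surrogate = init_surrogate()

    for _ in range(n_outer_iters):
        # Inner loop: run the QD optimizer against the surrogate model only
        # (Nexploit = 1000 iterations in the paper's setup).
        candidates = qd_exploit(surrogate, n_exploit_iters)

        # Ground-truth evaluation of the surrogate archive's elites.
        for env in candidates:
            objective, measures = evaluate_with_agent(env)
            dataset.append((env, objective, measures))
            cell = discretize(measures)
            if cell not in archive or objective > archive[cell][0]:
                archive[cell] = (objective, env)

        # Retrain the surrogate on all ground-truth data collected so far.
        surrogate = train_surrogate(surrogate, dataset)

    return archive
```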
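
The Software Dependencies row notes that QD optimization is handled by pyribs. The snippet below is a hedged illustration of how a basic MAP-Elites-style loop is typically assembled with pyribs; it uses class names and signatures from recent pyribs releases, which may differ from the pyribs version used in the paper, and the objective and measure functions are toy placeholders rather than the paper's Maze or Mario domains.

```python
import numpy as np
from ribs.archives import GridArchive
from ribs.emitters import GaussianEmitter
from ribs.schedulers import Scheduler

# Toy continuous search space: 10-D solutions, 2-D measure space on a 20x20 grid.
archive = GridArchive(solution_dim=10, dims=(20, 20),
                      ranges=[(-1.0, 1.0), (-1.0, 1.0)])
emitters = [GaussianEmitter(archive, sigma=0.1, x0=np.zeros(10), batch_size=20)]
scheduler = Scheduler(archive, emitters)

for _ in range(1000):
    solutions = scheduler.ask()                       # candidates to evaluate
    objectives = -np.sum(solutions ** 2, axis=1)      # placeholder objective
    measures = np.clip(solutions[:, :2], -1.0, 1.0)   # placeholder measures
    scheduler.tell(objectives, measures)

print(f"Archive coverage: {archive.stats.coverage:.3f}")
```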
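
The surrogate-model hyperparameters quoted in the Experiment Setup row map directly onto a standard PyTorch training configuration. Below is a minimal sketch assuming a placeholder CNN and random stand-in data; the actual architecture, inputs, and prediction targets are those of the authors' released code, not this toy.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Placeholder surrogate: a small CNN mapping a 1-channel 16x16 "environment"
# grid to a single predicted value; the paper's architecture differs.
model = nn.Sequential(
    nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.Flatten(),
    nn.Linear(32 * 16 * 16, 1),
)

# Optimizer settings quoted above: Adam with lr = 0.001, betas = (0.9, 0.999),
# eps = 1e-8, and weight decay 0.0.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3,
                             betas=(0.9, 0.999), eps=1e-8, weight_decay=0.0)
loss_fn = nn.MSELoss()

# Random stand-in for the dynamically collected dataset D; batch size 32 as quoted.
envs = torch.randn(256, 1, 16, 16)
targets = torch.randn(256, 1)
loader = DataLoader(TensorDataset(envs, targets), batch_size=32, shuffle=True)

for epoch in range(100):  # 100 epochs, as in the quoted setup
    for batch_envs, batch_targets in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(batch_envs), batch_targets)
        loss.backward()
        optimizer.step()
```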