Improving Environment Novelty Quantification for Effective Unsupervised Environment Design
Authors: Jayden Teoh, Wenjun Li, Pradeep Varakantham
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical evaluations demonstrate that augmenting existing regret-based UED algorithms with CENIE achieves state-of-the-art performance across multiple benchmarks, underscoring the effectiveness of novelty-driven autocurricula for robust generalization. |
| Researcher Affiliation | Academia | Jayden Teoh, Wenjun Li, Pradeep Varakantham; Singapore Management University; {jxteoh.2023, wjli.2020, pradeepv}@smu.edu.sg |
| Pseudocode | Yes | Algorithm 1: ACCEL-CENIE. Input: level buffer size N, component range [Kmin, Kmax], FIFO window size W, level generator G. Initialize: student policy πη, level buffer B, state-action buffer Γ, GMM parameters λΓ. |
| Open Source Code | No | At the time of submitting the camera-ready version of this paper, our codebase is not yet prepared for open-sourcing. However, we have provided comprehensive implementation details to ensure replicability. |
| Open Datasets | Yes | "We empirically demonstrated the effectiveness of CENIE on three distinct domains: Minigrid, Bipedal Walker, and Car Racing. Minigrid is a partially observable navigation task under discrete control with sparse rewards, while Bipedal Walker and Car Racing are partially observable continuous control tasks with dense rewards." and "The Bipedal Walker domain is modified on top of the Bipedal Walker Hardcore environment from OpenAI Gym, introduced by [56] and improved by [41, 37]." and "The Car Racing domain was introduced and customized by [27]." |
| Dataset Splits | No | The paper mentions training on various domains and evaluating on 'held-out tasks' or 'testing environments', but does not explicitly define or specify a 'validation' dataset split or its proportion. |
| Hardware Specification | Yes | All of our experiments are run with a single V100 GPU or GeForce 3090 GPU, using 10 Intel Xeon E5-2698 v4 CPUs. |
| Software Dependencies | No | We use the PyCave [13] Python library to fit the GMM using GPU acceleration, which also provides an efficient abstraction for the Expectation-Maximization (EM) algorithm. We use the PyTorch Adapt [35] Python library to calculate the silhouette scores. |
| Experiment Setup | Yes | In this section, we provide the hyperparameters we used for both CENIE-augmented and baseline algorithms in our experiments. We employ the same set of CENIE parameters for both ACCEL-CENIE and PLR-CENIE. We provide all the parameters for our implementations in Table 7. |
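
The pseudocode and software rows above mention a FIFO window of size W over a state-action buffer Γ and a GMM with parameters λΓ fitted to it. Since the codebase is not released, the following is only a minimal sketch of how those pieces could fit together, under stated assumptions: the class `FifoBuffer` and the functions `gmm_log_density` and `novelty` are hypothetical names, the GMM is restricted to diagonal covariances, and novelty is taken to be the negative mean log-likelihood of a level's state-action pairs, which is an assumption not stated in the excerpts above. The paper fits the GMM with PyCave; this sketch takes already-fitted parameters as input and does not call PyCave.

```python
import math
from collections import deque


class FifoBuffer:
    """Hypothetical state-action buffer keeping only the W most recent levels."""

    def __init__(self, window_size):
        # deque(maxlen=W) silently evicts the oldest level once W is exceeded.
        self.levels = deque(maxlen=window_size)

    def add_level(self, pairs):
        self.levels.append(list(pairs))

    def all_pairs(self):
        # Flattened view that a GMM fit would consume.
        return [pair for level in self.levels for pair in level]


def gmm_log_density(x, weights, means, variances):
    """log p(x) under a diagonal-covariance GMM, computed with log-sum-exp."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            log_p += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
        component_logs.append(log_p)
    top = max(component_logs)
    return top + math.log(sum(math.exp(lp - top) for lp in component_logs))


def novelty(pairs, weights, means, variances):
    """Assumed novelty score: negative mean log-likelihood of the pairs."""
    total = sum(gmm_log_density(x, weights, means, variances) for x in pairs)
    return -total / len(pairs)
```

Under this assumed score, a level whose state-action pairs fall far from every mixture component receives a higher novelty value than one whose pairs sit near the component means.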