reproducibilityindex.ai

Generalization through Diversity: Improving Unsupervised Environment Design

Authors: Wenjun Li, Pradeep Varakantham, Dexun Li

IJCAI 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We empirically demonstrate the versatility and effectiveness of our method in comparison to multiple leading approaches for unsupervised environment design on three distinct benchmark problems used in literature.
Researcher Affiliation	Academia	Wenjun Li, Pradeep Varakantham, Dexun Li; Singapore Management University; {wjli.2020, pradeepv, dexunli.2019}@smu.edu.sg
Pseudocode	Yes	The complete procedure of DIPLR is presented in Algorithm 1.
Open Source Code	No	The paper does not contain an explicit statement about releasing the source code for the methodology or a link to a code repository.
Open Datasets	Yes	We conduct experiments and empirically demonstrate the effectiveness and generality of DIPLR on three popular yet highly distinct UPOMDP domains, Minigrid, Bipedal-Walker and Car-Racing.
Dataset Splits	No	The paper does not explicitly provide specific training/validation/test dataset splits (e.g., percentages, sample counts, or explicit standard split citations with authors/year) for reproducibility.
Hardware Specification	No	The paper does not specify any particular hardware components (e.g., CPU, GPU models, or memory) used for running the experiments.
Software Dependencies	No	The paper mentions using Proximal Policy Optimization (PPO) and a Wasserstein distance solver from [Flamary et al., 2021], but it does not specify version numbers for these or other software dependencies.
Experiment Setup	Yes	We train all the student agents for 30k PPO updates (~250M steps)... We could assign different weights to diversity and regret by letting the replay probability P_replay = ρ · P_D + (1 − ρ) · P_R, where P_D and P_R are the prioritization of diversity and regret respectively, and ρ is the tuning parameter.
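
The replay-probability mix quoted in the Experiment Setup row can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the example scores, and the sum-to-one normalization are assumptions made for illustration; the paper may derive P_D and P_R differently (for example, through rank-based prioritization).

```python
import numpy as np


def normalize(scores):
    """Turn non-negative scores into a probability distribution.

    Assumption: simple sum-to-one scaling; falls back to uniform
    when every score is zero.
    """
    scores = np.asarray(scores, dtype=float)
    total = scores.sum()
    if total <= 0:
        return np.full(scores.shape, 1.0 / scores.size)
    return scores / total


def replay_probability(diversity_scores, regret_scores, rho):
    """Compute P_replay = rho * P_D + (1 - rho) * P_R over stored levels."""
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    p_d = normalize(diversity_scores)  # prioritization by diversity
    p_r = normalize(regret_scores)     # prioritization by regret
    return rho * p_d + (1.0 - rho) * p_r


# Hypothetical scores for three levels in a replay buffer.
diversity = [0.2, 0.5, 0.3]
regret = [1.0, 0.0, 1.0]
p_replay = replay_probability(diversity, regret, rho=0.5)

# Sample the index of the next level to replay from the mixed distribution.
rng = np.random.default_rng(0)
next_level = rng.choice(len(p_replay), p=p_replay)
```

Because P_D and P_R each sum to one, any ρ in [0, 1] yields a valid distribution: ρ = 1 prioritizes purely by diversity and ρ = 0 purely by regret.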