OMNI: Open-endedness via Models of human Notions of Interestingness

Authors: Jenny Zhang, Joel Lehman, Kenneth Stanley, Jeff Clune

ICLR 2024

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We evaluate OMNI on three challenging domains, Crafter (Hafner, 2021) (a 2D version of Minecraft), BabyAI (Chevalier-Boisvert et al., 2018) (a 2D grid world for grounded language learning), and AI2-THOR (Kolve et al., 2017) (a 3D photo-realistic embodied robotics environment). OMNI outperforms baselines based on uniform task sampling or learning progress alone." |
| Researcher Affiliation | Collaboration | Jenny Zhang (1, 2), Joel Lehman (3), Kenneth Stanley (4), Jeff Clune (1, 2, 5). Affiliations: 1 Department of Computer Science, University of British Columbia; 2 Vector Institute; 3 Stochastic Labs; 4 Maven; 5 Canada CIFAR AI Chair |
| Pseudocode | Yes | Algorithm 1: OMNI Algorithm |
| Open Source Code | No | The paper lists a project website (https://www.jennyzhangzt.com/omni/), but it does not state that source code for the methodology is available there, and the link is not to a code repository. |
| Open Datasets | Yes | "We evaluate OMNI on three challenging domains, Crafter (Hafner, 2021) (a 2D version of Minecraft), BabyAI (Chevalier-Boisvert et al., 2018) (a 2D grid world for grounded language learning), and AI2-THOR (Kolve et al., 2017) (a 3D photo-realistic embodied robotics environment)." |
| Dataset Splits | No | The paper provides no percentages or counts for training, validation, or test splits. It uses "validation" to describe the agent's learning process, not a static dataset split. |
| Hardware Specification | Yes | "Each experiment takes about 33 hrs for Crafter and 60 hrs for BabyAI on a 24GB NVIDIA A10 GPU with 30 virtual CPUs." |
| Software Dependencies | No | The paper mentions PPO, GRU, and LSTM components and the GPT-3 and GPT-4 APIs, but gives no version numbers for these or for any other libraries used. |
| Experiment Setup | Yes | Appendices L, M, and N tabulate the training hyperparameters, including discount factor, learning rate, PPO clip threshold, GAE lambda, entropy coefficient, batch size, epochs, and max episode length, for the Crafter, BabyAI, and AI2-THOR environments. |
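The Experiment Setup entry names the standard PPO hyperparameter fields reported in the paper's appendices. A minimal sketch of such a config is below; the field names mirror the appendix tables, but the values shown are generic PPO defaults, not the paper's actual settings.

```python
# Illustrative PPO hyperparameter config with the fields listed in
# Appendices L-N. Values are common PPO defaults, NOT the paper's settings.
ppo_config = {
    "discount_factor": 0.99,       # gamma
    "learning_rate": 3e-4,
    "ppo_clip_threshold": 0.2,
    "gae_lambda": 0.95,
    "entropy_coefficient": 0.01,
    "batch_size": 64,
    "epochs": 4,
    "max_episode_length": 1000,
}

def validate(cfg):
    """Check that every expected hyperparameter is present and numeric."""
    required = {
        "discount_factor", "learning_rate", "ppo_clip_threshold",
        "gae_lambda", "entropy_coefficient", "batch_size",
        "epochs", "max_episode_length",
    }
    missing = required - cfg.keys()
    if missing:
        raise ValueError(f"missing hyperparameters: {sorted(missing)}")
    return all(isinstance(v, (int, float)) for v in cfg.values())

assert validate(ppo_config)
```

Recording the full set of such fields per environment, as the appendices do, is what makes the training setup reproducible even without released code.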