OMNI: Open-endedness via Models of human Notions of Interestingness
Authors: Jenny Zhang, Joel Lehman, Kenneth Stanley, Jeff Clune
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate OMNI on three challenging domains: Crafter (Hafner, 2021) (a 2D version of Minecraft), BabyAI (Chevalier-Boisvert et al., 2018) (a 2D grid world for grounded language learning), and AI2-THOR (Kolve et al., 2017) (a 3D photo-realistic embodied robotics environment). OMNI outperforms baselines based on uniform task sampling or learning progress alone. |
| Researcher Affiliation | Collaboration | Jenny Zhang1,2 Joel Lehman3 Kenneth Stanley4 Jeff Clune1,2,5 1Department of Computer Science, University of British Columbia 2Vector Institute 3Stochastic Labs 4Maven 5Canada CIFAR AI Chair |
| Pseudocode | Yes | Algorithm 1 OMNI Algorithm |
| Open Source Code | No | The paper mentions 'Project website: https://www.jennyzhangzt.com/omni/', but it does not explicitly state that source code for the methodology is provided there, and the link is not a direct link to a code repository. |
| Open Datasets | Yes | We evaluate OMNI on three challenging domains: Crafter (Hafner, 2021) (a 2D version of Minecraft), BabyAI (Chevalier-Boisvert et al., 2018) (a 2D grid world for grounded language learning), and AI2-THOR (Kolve et al., 2017) (a 3D photo-realistic embodied robotics environment). |
| Dataset Splits | No | The paper does not provide specific percentages or counts for training, validation, or test dataset splits. It mentions 'validation' in the context of the agent's learning process rather than as a static dataset split. |
| Hardware Specification | Yes | Each experiment takes about 33 hrs for Crafter and 60 hrs for BabyAI on a 24GB NVIDIA A10 GPU with 30 virtual CPUs. |
| Software Dependencies | No | The paper mentions software like PPO, GRU, LSTM, and refers to GPT-3 and GPT-4 APIs, but does not provide specific version numbers for these software components or any other libraries used. |
| Experiment Setup | Yes | Appendices L, M, and N provide detailed tables listing specific hyperparameters for the training process, including 'Discount factor', 'Learning rate', 'PPO clip threshold', 'GAE lambda', 'Entropy coefficient', 'Batch size', 'Epochs', and 'Max episode length' for Crafter, Baby AI, and AI2-THOR environments. |
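The pseudocode row above refers to Algorithm 1, OMNI's task-sampling loop: weight candidate tasks by learning progress, then use a language model as a model of human notions of interestingness to filter which tasks are worth training on. The sketch below is a hedged, illustrative reconstruction of that loop; `learning_progress`, `llm_is_interesting`, and `sample_task` are hypothetical names, and the LLM query is stubbed out rather than calling the GPT-3/GPT-4 APIs the paper uses.

```python
import random

def learning_progress(success_history, window=10):
    """Absolute change between recent and older success rates for one task.
    A common way to operationalize learning progress; the paper's exact
    estimator may differ."""
    if len(success_history) < 2 * window:
        return 0.0
    recent = sum(success_history[-window:]) / window
    older = sum(success_history[-2 * window:-window]) / window
    return abs(recent - older)

def llm_is_interesting(task):
    """Stub for the model-of-interestingness query (the paper prompts an
    LLM to judge whether a task is interesting to practice next). Here it
    simply accepts every task."""
    return True

def sample_task(tasks, histories):
    """One step of an OMNI-style curriculum: filter by interestingness,
    then sample proportionally to learning progress."""
    candidates = [t for t in tasks if llm_is_interesting(t)]
    # Small floor so stalled tasks are still occasionally revisited.
    weights = [learning_progress(histories[t]) + 1e-3 for t in candidates]
    return random.choices(candidates, weights=weights, k=1)[0]
```

A task whose recent success rate jumped (high learning progress) is sampled far more often than one whose performance is flat, which matches the learning-progress baselines the paper compares against; the LLM filter is what distinguishes OMNI from those baselines.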
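The hyperparameters named in the Experiment Setup row could be collected into a config like the one below. This is only an illustrative sketch: the keys mirror the quantities listed in Appendices L, M, and N, but the values are generic PPO defaults, not the numbers reported in the paper.

```python
# Illustrative PPO config; values are common defaults, NOT the paper's.
ppo_config = {
    "discount_factor": 0.99,      # gamma for return computation
    "learning_rate": 3e-4,
    "ppo_clip_threshold": 0.2,    # clip range for the surrogate objective
    "gae_lambda": 0.95,           # generalized advantage estimation
    "entropy_coefficient": 0.01,  # exploration bonus weight
    "batch_size": 256,
    "epochs": 4,                  # PPO update epochs per rollout
    "max_episode_length": 1000,
}
```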