NovelD: A Simple yet Effective Exploration Criterion

Authors: Tianjun Zhang, Huazhe Xu, Xiaolong Wang, Yi Wu, Kurt Keutzer, Joseph E. Gonzalez, Yuandong Tian

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We evaluate NovelD on three very challenging exploration environments: MiniGrid [13], NetHack [34] and Atari games [9]. NovelD manages to solve all the static environments in MiniGrid within 120M environment steps, without any curriculum learning. In comparison, the previous SOTA only solves 50% of them. NovelD also achieves SOTA on multiple tasks in NetHack, a rogue-like game that contains more challenging procedurally-generated environments. In multiple Atari games (e.g., Montezuma's Revenge, Venture, Gravitar), NovelD outperforms RND.
Researcher Affiliation Collaboration 1University of California, Berkeley 2University of California, San Diego 3Tsinghua University 4Facebook AI Research 5Shanghai Qi Zhi Institute
Pseudocode No The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code Yes Our code is available at https://github.com/tianjunz/NovelD.
Open Datasets Yes We evaluate NovelD on three very challenging exploration environments: MiniGrid [13], NetHack [34] and Atari games [9].
Dataset Splits No The paper does not provide specific train/validation/test dataset splits (e.g., percentages or counts). It mentions running experiments across four seeds and in 32 randomly initialized environments for MiniGrid, but not explicit data partitioning for validation.
Hardware Specification No The paper mentions 'computation limit' but does not provide specific hardware details such as GPU/CPU models, processors, or memory amounts used for running experiments.
Software Dependencies No The paper mentions using PPO as the base RL algorithm, along with CNN and RNN based networks, but it does not specify version numbers for any software libraries, frameworks, or dependencies.
Experiment Setup Yes NovelD manages to solve all the static environments in MiniGrid within 120M environment steps, without any curriculum learning. Results show that α = 0.5 and β = 0 works the best.
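For context on the α and β hyperparameters quoted above, the following is a minimal sketch of the NovelD-style intrinsic bonus as the paper describes it: a clipped difference between the novelty of the next state and a scaled novelty of the current state. The function name `noveld_bonus` and the exact clipping form are assumptions for illustration; the novelty measure itself (e.g., an RND prediction error) is supplied by the caller.

```python
def noveld_bonus(nov_next: float, nov_curr: float,
                 alpha: float = 0.5, beta: float = 0.0) -> float:
    """Sketch of a NovelD-style intrinsic reward.

    Rewards transitions that move from a familiar state to a novel one.
    With the reported best setting (alpha = 0.5, beta = 0) this reduces
    to max(nov(s') - 0.5 * nov(s), 0), so moving toward less novel
    states yields no bonus.
    """
    return max(nov_next - alpha * nov_curr, beta)


# Illustrative values: a jump into a novel state earns a bonus,
# while returning to well-explored territory earns nothing.
print(noveld_bonus(1.0, 1.0))  # frontier transition: 0.5
print(noveld_bonus(0.2, 1.0))  # backtracking: clipped to 0.0
```

This is only a schematic of the criterion under the stated assumptions; the paper's full method pairs it with an episodic restriction (granting the bonus only on first visits within an episode), which is omitted here.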