Dynamic Bottleneck for Robust Self-Supervised Exploration

Authors: Chenjia Bai, Lingxiao Wang, Lei Han, Animesh Garg, Jianye Hao, Peng Liu, Zhaoran Wang

NeurIPS 2021

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We evaluate the proposed method on the Atari suite with dynamics-irrelevant noise. Our experiments show that exploration with the DB bonus outperforms several state-of-the-art exploration methods in noisy environments." The paper evaluates SSE-DB on Atari games and conducts comparative experiments against baseline methods. |
| Researcher Affiliation | Collaboration | Chenjia Bai (Harbin Institute of Technology, bai_chenjia@stu.hit.edu.cn); Lingxiao Wang (Northwestern University, lingxiaowang2022@u.northwestern.edu); Lei Han (Tencent Robotics X, lxhan@tencent.com); Animesh Garg (University of Toronto, Vector Institute, NVIDIA, garg@cs.toronto.edu); Jianye Hao (Tianjin University, jianye.hao@tju.edu.cn); Peng Liu (Harbin Institute of Technology, pengliu@hit.edu.cn); Zhaoran Wang (Northwestern University, zhaoranwang@gmail.com) |
| Pseudocode | Yes | "We refer to Appendix B for the pseudocode of training the DB model." (Algorithm 1: SSE-DB) |
| Open Source Code | Yes | "The codes are available at https://github.com/Baichenjia/DB." |
| Open Datasets | Yes | "We evaluate all methods on Atari games with high-dimensional observations. The selected 18 games are frequently used in previous approaches for efficient exploration." |
| Dataset Splits | No | The paper evaluates on Atari games but does not specify explicit training/validation/test splits; it reports only results from training without extrinsic rewards, evaluated in the same environments. |
| Hardware Specification | No | The paper does not describe the hardware used for its experiments. It mentions "computation resources" in the acknowledgements, but gives no specific models or specifications. |
| Software Dependencies | No | The paper does not list software dependencies with version numbers. |
| Experiment Setup | No | The main text describes the overall approach and model architecture but omits specific hyperparameters (e.g., learning rate, batch size, number of epochs) and detailed system-level training settings. It refers to Appendix D for implementation details, which were not available in the main paper for analysis. |