Redeeming Intrinsic Rewards via Constrained Optimization

Authors: Eric Chen, Zhang-Wei Hong, Joni Pajarinen, Pulkit Agrawal

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Consistent performance gains across sixty-one ATARI games validate our claim.
Researcher Affiliation | Collaboration | Eric Chen, Zhang-Wei Hong*, Joni Pajarinen & Pulkit Agrawal (Improbable AI Lab, Massachusetts Institute of Technology; MIT-IBM Watson AI Lab; Aalto University; NSF AI Institute for AI and Fundamental Interactions (IAIFI))
Pseudocode | Yes | Algorithm 1: Extrinsic-Intrinsic Policy Optimization (EIPO)
Open Source Code | Yes | The code is available at https://github.com/Improbable-AI/eipo.
Open Datasets | Yes | We conducted experiments on ATARI games [20], the de-facto benchmark for exploration methods [4, 11].
Dataset Splits | No | The paper uses standard ATARI benchmarks but does not explicitly detail the train/validation/test splits (e.g., percentages or specific counts) needed for reproducibility.
Hardware Specification | Yes | When working with image inputs (e.g., ATARI), sharing the convolutional neural network (CNN) backbone between E and E+I helps save memory, which is important when using GPUs (in our case, an NVIDIA RTX 3090Ti).
Software Dependencies | No | The paper mentions using PPO [13] and Pycolab [19] but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | Pseudo-code can be found in Algorithm 2, and full implementation details including hyperparameters can be found in Appendix A.2.
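The hardware note above mentions sharing a CNN backbone between the extrinsic-only (E) policy and the extrinsic-plus-intrinsic (E+I) policy to save GPU memory. The sketch below illustrates one way such a shared encoder with separate policy and value heads could be laid out; it is a minimal sketch, not the authors' implementation, and the class and attribute names (SharedBackboneActorCritic, pi_E, pi_EI, etc.) are assumptions made for this example. It assumes standard ATARI preprocessing (4 stacked 84x84 grayscale frames).

```python
# Minimal sketch (not from the EIPO repository): a single convolutional
# encoder shared by the extrinsic-only (E) and extrinsic+intrinsic (E+I)
# policies, each with its own policy and value heads.
import torch
import torch.nn as nn


class SharedBackboneActorCritic(nn.Module):
    def __init__(self, n_actions: int):
        super().__init__()
        # Nature-DQN style encoder; storing and training it once for both
        # policies is where the memory saving comes from.
        self.backbone = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 512), nn.ReLU(),
        )
        # Separate heads for the E policy and the E+I policy.
        self.pi_E, self.v_E = nn.Linear(512, n_actions), nn.Linear(512, 1)
        self.pi_EI, self.v_EI = nn.Linear(512, n_actions), nn.Linear(512, 1)

    def forward(self, obs: torch.Tensor, use_mixed_policy: bool):
        # obs: (batch, 4, 84, 84) float tensor of stacked frames in [0, 1].
        h = self.backbone(obs)
        if use_mixed_policy:
            return self.pi_EI(h), self.v_EI(h)
        return self.pi_E(h), self.v_E(h)
```

In this layout only the small linear heads are duplicated; the convolutional encoder's parameters exist once, which is the saving the quoted hardware passage refers to when both policies are trained on a single GPU.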