Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.

Redeeming intrinsic rewards via constrained optimization

Authors: Eric Chen, Zhang-Wei Hong, Joni Pajarinen, Pulkit Agrawal

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Consistent performance gains across sixty-one ATARI games validate our claim.
Researcher Affiliation | Collaboration | Eric Chen, Zhang-Wei Hong*, Joni Pajarinen & Pulkit Agrawal; Improbable AI Lab, Massachusetts Institute of Technology; MIT-IBM Watson AI Lab; Aalto University; NSF AI Institute for AI and Fundamental Interactions (IAIFI)
Pseudocode | Yes | Algorithm 1 Extrinsic-Intrinsic Policy Optimization (EIPO)
Open Source Code | Yes | The code is available at https://github.com/Improbable-AI/eipo.
Open Datasets | Yes | We conducted experiments on ATARI games [20], the de-facto benchmark for exploration methods [4, 11].
Dataset Splits | No | The paper uses standard ATARI benchmarks but does not explicitly detail the train/validation/test splits (e.g., percentages or specific counts) for reproducibility.
Hardware Specification | Yes | When working with image inputs (e.g., ATARI), sharing the convolutional neural network (CNN) backbone between E and E+I helps save memory, which is important when using GPUs (in our case, an NVIDIA RTX 3090Ti).
Software Dependencies | No | The paper mentions using PPO [13] and Pycolab [19], but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | Pseudo-code can be found in Algorithm 2, and full implementation details including hyperparameters can be found in Appendix A.2.