Automatic Successive Reinforcement Learning with Multiple Auxiliary Rewards
Authors: Zhao-Yang Fu, De-Chuan Zhan, Xin-Chun Li, Yi-Xing Lu
IJCAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments and simulations have shown the superiority of our proposed ASR on a range of environments, including OpenAI classical control domains and the video games Freeway and Catcher. |
| Researcher Affiliation | Academia | National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China {fuzy,lixc}@lamda.nju.edu.cn, zhandc@nju.edu.cn, YixingLu97@gmail.com |
| Pseudocode | Yes | A summary of our ASR framework is shown in Algorithm 1. |
| Open Source Code | No | The paper references 'Open AI Baselines' with a GitHub link, but does not provide a specific link or statement for the source code of their proposed ASR framework. |
| Open Datasets | Yes | OpenAI classical control domains Mountain Car, Cart Pole and Acrobot [Brockman et al., 2016], the PLE game Catcher, and the Atari game Freeway. |
| Dataset Splits | No | The paper does not explicitly provide details on training/validation/test dataset splits, or how data is partitioned for a validation set. |
| Hardware Specification | No | No specific hardware (GPU/CPU models, memory, etc.) used for running experiments is explicitly mentioned in the paper. |
| Software Dependencies | No | The paper mentions 'Open AI Baselines' but does not provide specific version numbers for software dependencies or libraries used. |
| Experiment Setup | Yes | For classical control environments, we perform one million time steps of training for each method. For video games, we perform ten million time steps of training for each method. Each method is run with five random seeds. |
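
The Open Datasets and Experiment Setup rows name standard OpenAI Gym control tasks, a fixed training budget, and five random seeds per method. Below is a minimal sketch of that run configuration, assuming the Gym environment IDs `MountainCar-v0`, `CartPole-v1`, and `Acrobot-v1`, the pre-0.26 Gym API in use around the time of publication, and arbitrary placeholder seed values (the paper states five seeds but not which ones). This is not the authors' ASR training code, only an illustration of the reported setup.

```python
# Illustrative sketch (not the authors' code): instantiating the classical
# control environments named in the paper via OpenAI Gym, with five seeds
# and the reported one-million-step budget per classical-control run.
import gym

ENV_IDS = ["MountainCar-v0", "CartPole-v1", "Acrobot-v1"]  # assumed Gym IDs
SEEDS = [0, 1, 2, 3, 4]          # paper reports five seeds; exact values unknown
TOTAL_TIMESTEPS = 1_000_000      # one million steps for classical control

def run_random_baseline(env_id, seed, total_timesteps):
    """Roll out a random policy as a stand-in for the learned agent."""
    env = gym.make(env_id)
    env.seed(seed)               # pre-0.26 Gym seeding API
    obs = env.reset()
    for _ in range(total_timesteps):
        action = env.action_space.sample()  # placeholder for the ASR policy
        obs, reward, done, info = env.step(action)
        if done:
            obs = env.reset()
    env.close()

if __name__ == "__main__":
    for env_id in ENV_IDS:
        for seed in SEEDS:
            run_random_baseline(env_id, seed, total_timesteps=1_000)  # short demo run
```

The random policy only stands in for the learner; reproducing the paper's experiments would require substituting the ASR agent and, for the video games, the ten-million-step budget reported in the Experiment Setup row.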