Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Enabling Optimal Decisions in Rehearsal Learning under CARE Condition
Authors: Wen-Bo Du, Hao-Yi Lei, Lue Tao, Tian-Zuo Wang, Zhi-Hua Zhou
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments validate the effectiveness and efficiency of our method. We evaluate our proposed approach on two datasets including a synthetic dataset and a real-world dataset. The comparison results are summarized in Tab. 2 and Tab. 3, where the number of observational samples is set to 100. |
| Researcher Affiliation | Academia | 1National Key Laboratory for Novel Software Technology, Nanjing University, China 2School of Artificial Intelligence, Nanjing University, China. Correspondence to: Zhi-Hua Zhou <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Projection Newton for selecting optimal action... Algorithm 2 Closed-form solution for cases where |Y| = 1 |
| Open Source Code | No | The paper mentions using the stable-baselines3 library (Raffin et al., 2021) for RL results, which is a third-party library, not code provided by the authors for their methodology. There is no explicit statement about the release of the authors' own source code or a link to a repository. |
| Open Datasets | Yes | The Bermuda dataset, which records environmental variables in the Bermuda area, is described in ecology research (Courtney et al., 2017), with the generation order of variables available (Andersson & Bates, 2018). The parameters of structural equations are derived from fitting linear models on normalized data (Qin et al., 2023). |
| Dataset Splits | No | The paper states 'We repeat the experiment under 100 random seeds for each dataset' and 'the number of observational samples is set to 100', but does not provide specific details on how these samples are split into training, validation, or test sets. |
| Hardware Specification | Yes | The experiments are run on a Nvidia Tesla A100 GPU and two Intel Xeon Platinum 8358 CPUs. |
| Software Dependencies | No | The paper mentions 'the RL results (Fig. 4) are obtained by using the stable-baselines3 library (Raffin et al., 2021)'. However, it does not specify the version number of the stable-baselines3 library or other key software components used in their own implementation. |
| Experiment Setup | Yes | We repeat the experiment under 100 random seeds for each dataset, including 3 measures as follows: ... The number of observational samples is set to 100. ... feasible alteration values are set to [−1, 1] for each of them. ... Finally, the hyperparameter τ for previous rehearsal-learning methods is selected as the value that achieves the highest average AUF probability among various candidates. |