Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Momentum-Based Policy Gradient with Second-Order Information
Authors: Saber Salehkaleybar, Mohammadsadegh Khorasani, Negar Kiyavash, Niao He, Patrick Thiran
TMLR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive experimental evaluations show the effectiveness of the proposed algorithm on various control tasks and its advantage over the state of the art in practice. ... In this section, we evaluate the performance of the proposed algorithm and compare it with previous work for control tasks in MuJoCo simulator (Todorov et al., 2012) ... Figure 1: Comparison of SHARP with other variance reduction methods on four control tasks. Table 2: Comparison of SHARP with other variance-reduced methods in terms of PR. |
| Researcher Affiliation | Academia | Saber Salehkaleybar (Leiden Institute of Advanced Computer Science, Leiden University); Sadegh Khorasani (School of Computer and Communication Sciences, EPFL); Negar Kiyavash (College of Management of Technology, EPFL); Niao He (Department of Computer Science, ETH Zurich); Patrick Thiran (School of Computer and Communication Sciences, EPFL) |
| Pseudocode | Yes | Algorithm 1 Common framework in variance reduction methods... Algorithm 2 The SHARP algorithm |
| Open Source Code | Yes | We implemented SHARP in the Garage library (garage contributors, 2019) as it allows for maintaining and integrating it in future versions of Garage library for easier dissemination. We utilized a Linux server with Intel Xeon CPU E5-2680 v3 (24 cores) operating at 2.50GHz with 377 GB DDR4 of memory, and Nvidia Titan X Pascal GPU. The implementation of SHARP is available as supplementary material. |
| Open Datasets | No | The paper uses control tasks (Reacher, Walker, Humanoid, and Swimmer) in the MuJoCo simulator. These are simulated environments rather than datasets, and the paper provides no concrete access information (links, DOIs, or data-availability citations) for any publicly available dataset. |
| Dataset Splits | No | The paper describes generating trajectories according to the current policy during the experimental process: "at each iteration t, we generated trajectories according to the current policy until we collected 10k system probes." This refers to online data collection rather than predefined dataset splits for training, validation, or testing. |
| Hardware Specification | Yes | We utilized a Linux server with Intel Xeon CPU E5-2680 v3 (24 cores) operating at 2.50GHz with 377 GB DDR4 of memory, and Nvidia Titan X Pascal GPU. |
| Software Dependencies | No | We implemented SHARP in the Garage library (garage contributors, 2019)... for control tasks in MuJoCo simulator (Todorov et al., 2012). While the paper mentions the Garage library and the MuJoCo simulator, it does not provide specific version numbers for these or any other software components. |
| Experiment Setup | Yes | For each algorithm, we used the same set of Gaussian policies parameterized with neural networks having two layers of 64 neurons each. Baselines and environment settings (such as maximum trajectory horizon, and reward intervals) were considered the same for all algorithms. We chose a maximum horizon of 500 for Walker, Swimmer, and Humanoid and 50 for Reacher. ... The discount factor is also set to 0.99 for all the runs. ... In the following table, we provide the fine-tuned parameters for each algorithm. Table 4: Selected hyper-parameters for different methods. |
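The Experiment Setup row describes Gaussian policies parameterized by neural networks with two hidden layers of 64 neurons each. A minimal NumPy sketch of such a diagonal-Gaussian policy is shown below; the tanh activation, weight initialization, and function names are illustrative assumptions, since the paper excerpt does not specify them.

```python
import numpy as np

def init_policy(obs_dim, act_dim, hidden=64, seed=0):
    """Initialize a two-hidden-layer (64x64) Gaussian policy.

    Layer widths follow the setup quoted in the table; the Gaussian
    init scale (0.1) is an illustrative assumption.
    """
    rng = np.random.default_rng(seed)
    dims = [obs_dim, hidden, hidden, act_dim]
    return {
        "weights": [rng.normal(0, 0.1, (i, o)) for i, o in zip(dims, dims[1:])],
        "biases": [np.zeros(o) for o in dims[1:]],
        "log_std": np.zeros(act_dim),  # state-independent log-std of the Gaussian
    }

def policy_mean(params, obs):
    """Forward pass: two tanh hidden layers, linear output head."""
    h = obs
    for W, b in zip(params["weights"][:-1], params["biases"][:-1]):
        h = np.tanh(h @ W + b)
    return h @ params["weights"][-1] + params["biases"][-1]

def sample_action(params, obs, rng):
    """Sample an action from the diagonal Gaussian N(mu(obs), diag(std^2))."""
    mu = policy_mean(params, obs)
    return rng.normal(mu, np.exp(params["log_std"]))
```

For example, with an 11-dimensional observation and 3-dimensional action space, `sample_action(init_policy(11, 3), obs, rng)` returns a 3-vector; returns along sampled trajectories would then be discounted with the reported factor of 0.99.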