Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Constrained Offline Policy Optimization

Authors: Nicholas Polosky, Bruno C. Da Silva, Madalina Fiterau, Jithin Jagannath

ICML 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	'We formally analyze our method and empirically demonstrate that it achieves state-of-the-art performance on discrete and continuous control problems, while offering the aforementioned improved, stronger, and more robust theoretical guarantees.' and '4. Experiments: We now empirically demonstrate that COPO achieves state-of-the-art performance on discrete and continuous control problems, while offering stronger and more robust theoretical guarantees.'
Researcher Affiliation	Collaboration	(1) ANDRO Computational Solutions, Rome, NY 13440; (2) University of Massachusetts at Amherst, MA 01003.
Pseudocode	Yes	Algorithm 1 COPO Algorithm Sketch
Open Source Code	No	The paper does not explicitly state that the source code for the described methodology is publicly available, nor does it provide a link to a repository.
Open Datasets	No	The paper mentions data sets are 'collected from a uniform random policy' for Walk-Around-Grid and 'constructed 10 statistically-independent data sets by sampling from such a reward-optimal policy' for Bipedal Walker, but does not provide concrete access information (link, DOI, citation to public dataset) for these datasets.
Dataset Splits	No	The paper mentions that algorithms are 'trained' on a dataset but does not provide specific details on training, validation, or test dataset splits (e.g., percentages, sample counts, or predefined splits).
Hardware Specification	No	The paper states: 'All experiments were executed on a server with a single GPU and 24 CPUs using different seeds for each independent run.' This is too general and lacks specific model numbers for the GPU or CPUs, or other detailed hardware specifications.
Software Dependencies	No	The paper does not provide specific software dependencies with version numbers (e.g., library names with versions or solver versions) needed to replicate the experiment.
Experiment Setup	No	While the paper describes some experimental conditions (e.g., number of trajectories, episodes, trials, and the use of AlgaeDICE), it does not provide specific hyperparameter values (such as learning rate, batch size, or epochs) or detailed training configurations (e.g., optimizer settings, model initialization) within the main text.