Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Understanding Fixed Predictions via Confined Regions

Authors: Connor Lawless, Tsui-Wei Weng, Berk Ustun, Madeleine Udell

ICML 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We conduct a comprehensive empirical study of confined regions across diverse applications. Our results highlight that existing pointwise verification methods fail to anticipate future individuals with fixed predictions, while our method both identifies them and provides an interpretable description.
Researcher Affiliation	Academia	¹Stanford University; ²University of California, San Diego. Correspondence to: Connor Lawless <EMAIL>.
Pseudocode	No	The paper describes methods using mathematical formulations and textual descriptions but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code	Yes	We include code to reproduce our results at https://github.com/conlaw/confined_regions/ and provide additional details and results in Appendix E.
Open Datasets	Yes	We evaluate our approach on three real-world datasets in consumer finance (heloc (FICO, 2018), givemecredit (Kaggle, 2011)) and content moderation (twitterbot (Gilani et al., 2016)).
Dataset Splits	Yes	We split the processed dataset into a training sample (50% used to train the model), and an audit sample (used to evaluate responsiveness in deployment).
Hardware Specification	Yes	We run all experiments on a personal computer with an Apple M1 Pro chip and 32 GB of RAM.
Software Dependencies	Yes	All MILP and MIQCP problems were solved using Gurobi 9.0 (Achterberg, 2019) with default settings.
Experiment Setup	Yes	We use the training dataset to fit a ℓ1-regularized logistic regression model and tune its parameters via cross-validation.
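
The Dataset Splits row describes a 50% training sample and a held-out audit sample. A minimal sketch of such a split follows, using only the Python standard library. The function name `split_train_audit`, the shuffling step, the fixed seed, and the toy rows are assumptions made for illustration; the quoted text does not say how the authors drew the split, and this is not taken from their repository.

```python
import random


def split_train_audit(rows, train_fraction=0.5, seed=0):
    """Shuffle rows and split them into (training, audit) samples.

    Hypothetical helper: train_fraction=0.5 mirrors the 50% training
    sample quoted above; the seed and the shuffle are assumptions.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    rng = random.Random(seed)           # local RNG so the split is repeatable
    indices = list(range(len(rows)))
    rng.shuffle(indices)
    cut = int(len(rows) * train_fraction)
    train = [rows[i] for i in indices[:cut]]   # used to fit the model
    audit = [rows[i] for i in indices[cut:]]   # held out for auditing
    return train, audit


# Toy records standing in for a processed dataset (hypothetical).
rows = [{"id": i} for i in range(10)]
train, audit = split_train_audit(rows)
```

Fixing the seed makes the split repeatable across runs, so a model fitted on `train` is always audited against the same held-out records.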