Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Inferring Change Points in High-Dimensional Regression via Approximate Message Passing

Authors: Gabriel Arpino, Xiaoqi Liu, Julia Gontarek, Ramji Venkataramanan

JMLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We validate our theory via numerical experiments, and demonstrate the favorable performance of our estimators on both synthetic and real data in the settings of linear, logistic, and rectified linear regression."
Researcher Affiliation | Academia | "Gabriel Arpino EMAIL; Xiaoqi Liu EMAIL; Julia Gontarek EMAIL; Ramji Venkataramanan EMAIL. Department of Engineering, University of Cambridge, Cambridge, CB2 1PZ, United Kingdom"
Pseudocode | No | The paper describes the Approximate Message Passing (AMP) algorithm in Section 3 using mathematical equations and descriptive text, but it does not include a clearly labeled 'Pseudocode' or 'Algorithm' block.
Open Source Code | Yes | "A Python implementation of our algorithm and code to run the experiments is available at (Arpino et al., 2024a)."
Open Datasets | Yes | "We validate our theory via numerical experiments, and demonstrate the favorable performance of our estimators on both synthetic and real data in the settings of linear, logistic, and rectified linear regression. ... We consider synthetic data as well as a real Myocardial Infarctions (MI) data set from Golovenkin et al. (2020), which contains the medical information of n = 1700 patients (samples) with MI complications. ... Myocardial Infarction complications. UCI Machine Learning Repository, 2020. DOI: https://doi.org/10.24432/C53P5M."
Dataset Splits | No | The paper mentions using 5-fold cross-validation to select the regularisation constant for a specific analysis on the Myocardial Infarction data set, but it does not give explicit training/validation/test split details (percentages, sample counts, or predefined splits) for the synthetic experiments or for the overall evaluation of the AMP algorithm.
Hardware Specification | Yes | "The experiment took one hour to complete on an Apple M1 Max chip, whereas competing algorithms did not return an output within 2.5 hours, due to the larger signal dimension compared to Figure 3a (p = 7225 vs p = 200)."
Software Dependencies | No | "The Jacobians of these denoisers are computed using Automatic Differentiation in Python JAX (Bradbury et al., 2018)." This names Python and JAX but gives no version numbers for Python, JAX, or other software components, which are necessary for full reproducibility.
Experiment Setup | Yes | "For synthetic data, for i ∈ [n], we use i.i.d. Gaussian covariates X_i ~ N(0, I_p/n), and we let ε_i ~ P_ε = N(0, σ²) in the linear and rectified linear models. ... all experiments are the result of at least 10 independent trials with t = 15. ... Figure 1 (left) plots the Hausdorff distance normalized by n for varying δ, for two different change point configurations Ψ. We choose p = 600, P_B = N(0, I), σ = 0.1, = n/5 and fix two true change points, whose locations are indicated in the legend. The algorithm uses L = L = 3. ... AMP assumes no knowledge of the true sparsity level 0.5 and estimates the sparsity level using CV over a set of values not including the ground truth (details in Appendix E)."
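The synthetic-data setup quoted in the Experiment Setup row can be sketched in a few lines of Python. This is not the authors' code (theirs is linked above via Arpino et al., 2024a); it is a minimal illustration of the quoted design: rows X_i drawn i.i.d. from N(0, I_p/n), signal entries from P_B, Gaussian noise with σ = 0.1, and a single change point at a hypothetical location tau where the regression signal switches.

```python
import numpy as np

rng = np.random.default_rng(0)

# Dimensions per the quoted setup: p = 600 covariates, sigma = 0.1.
# n and tau are illustrative assumptions, not values from the paper.
n, p, sigma = 3000, 600, 0.1
tau = n // 2  # hypothetical change-point location

X = rng.normal(0.0, np.sqrt(1.0 / n), size=(n, p))  # rows X_i ~ N(0, I_p/n)
beta_1 = rng.normal(size=p)  # pre-change signal, entries ~ P_B = N(0, 1)
beta_2 = rng.normal(size=p)  # post-change signal
eps = rng.normal(0.0, sigma, size=n)  # noise eps_i ~ N(0, sigma^2)

# Linear regression model whose signal changes at index tau.
y = np.empty(n)
y[:tau] = X[:tau] @ beta_1 + eps[:tau]
y[tau:] = X[tau:] @ beta_2 + eps[tau:]
```

A change-point estimator is then judged by the Hausdorff distance (normalized by n) between its estimated change-point locations and the true ones, as in the paper's Figure 1.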