Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Inferring Change Points in High-Dimensional Regression via Approximate Message Passing

Authors: Gabriel Arpino, Xiaoqi Liu, Julia Gontarek, Ramji Venkataramanan

JMLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We validate our theory via numerical experiments, and demonstrate the favorable performance of our estimators on both synthetic and real data in the settings of linear, logistic, and rectified linear regression."
Researcher Affiliation | Academia | "Gabriel Arpino EMAIL; Xiaoqi Liu EMAIL; Julia Gontarek EMAIL; Ramji Venkataramanan EMAIL. Department of Engineering, University of Cambridge, Cambridge, CB2 1PZ, United Kingdom"
Pseudocode | No | The paper describes the Approximate Message Passing (AMP) algorithm in Section 3 using mathematical equations and descriptive text, but it does not include a clearly labeled 'Pseudocode' or 'Algorithm' block.
Open Source Code | Yes | "A Python implementation of our algorithm and code to run the experiments is available at (Arpino et al., 2024a)."
Open Datasets | Yes | "We validate our theory via numerical experiments, and demonstrate the favorable performance of our estimators on both synthetic and real data in the settings of linear, logistic, and rectified linear regression. ... We consider synthetic data as well as a real Myocardial Infarctions (MI) data set from Golovenkin et al. (2020), which contains the medical information of n = 1700 patients (samples) with MI complications. ... Myocardial Infarction complications. UCI Machine Learning Repository, 2020. DOI: https://doi.org/10.24432/C53P5M."
Dataset Splits | No | The paper mentions using 5-fold cross-validation to select the regularisation constant for a specific analysis on the Myocardial Infarction data set, but it does not give explicit training/validation/test split details (percentages, sample counts, or predefined splits) for the synthetic experiments or for the overall evaluation of the AMP algorithm.
Hardware Specification | Yes | "The experiment took one hour to complete on an Apple M1 Max chip, whereas competing algorithms did not return an output within 2.5 hours, due to the larger signal dimension compared to Figure 3a (p = 7225 vs p = 200)."
Software Dependencies | No | "The Jacobians of these denoisers are computed using Automatic Differentiation in Python JAX (Bradbury et al., 2018)." This names Python and JAX but gives no version numbers for Python, JAX, or other software components, which are necessary for full reproducibility.
Experiment Setup | Yes | "For synthetic data, for i ∈ [n], we use i.i.d. Gaussian covariates X_i ~ N(0, I_p/n), and we let ε_i ~ P_ε = N(0, σ²) in the linear and rectified linear models. ... all experiments are the result of at least 10 independent trials with t = 15. ... Figure 1 (left) plots the Hausdorff distance normalized by n for varying δ, for two different change point configurations Ψ. We choose p = 600, P_B = N(0, I), σ = 0.1, = n/5 and fix two true change points, whose locations are indicated in the legend. The algorithm uses L = L = 3. ... AMP assumes no knowledge of the true sparsity level 0.5 and estimates the sparsity level using CV over a set of values not including the ground truth (details in Appendix E)."
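The synthetic-data setup quoted in the Experiment Setup row can be sketched in a few lines of Python. This is not the authors' code (theirs is linked above via Arpino et al., 2024a); it is a minimal illustration of the quoted design: rows X_i drawn i.i.d. from N(0, I_p/n), signal entries from P_B, Gaussian noise with σ = 0.1, and a single change point at a hypothetical location tau where the regression signal switches.

```python
import numpy as np

rng = np.random.default_rng(0)

# Dimensions per the quoted setup: p = 600 covariates, sigma = 0.1.
# n and tau are illustrative assumptions, not values from the paper.
n, p, sigma = 3000, 600, 0.1
tau = n // 2  # hypothetical change-point location

X = rng.normal(0.0, np.sqrt(1.0 / n), size=(n, p))  # rows X_i ~ N(0, I_p/n)
beta_1 = rng.normal(size=p)  # pre-change signal, entries ~ P_B = N(0, 1)
beta_2 = rng.normal(size=p)  # post-change signal
eps = rng.normal(0.0, sigma, size=n)  # noise eps_i ~ N(0, sigma^2)

# Linear regression model whose signal changes at index tau.
y = np.empty(n)
y[:tau] = X[:tau] @ beta_1 + eps[:tau]
y[tau:] = X[tau:] @ beta_2 + eps[tau:]
```

A change-point estimator is then judged by the Hausdorff distance (normalized by n) between its estimated change-point locations and the true ones, as in the paper's Figure 1.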