Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Network Regression with Graph Laplacians

Authors: Yidong Zhou, Hans-Georg Müller

JMLR 2022 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We demonstrate the usefulness and good practical performance of the proposed framework with simulations and with network data arising from resting-state fMRI in neuroimaging, as well as New York taxi records.
Researcher Affiliation	Academia	Yidong Zhou EMAIL Department of Statistics University of California Davis, CA 95616, USA; Hans-Georg Müller EMAIL Department of Statistics University of California Davis, CA 95616, USA
Pseudocode	No	The paper describes methods and mathematical derivations but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code	Yes	R codes for the proposed regression models and numerical simulations are available at https://github.com/yidongzhou/Network-Regression-with-Graph-Laplacians.
Open Datasets	Yes	The yellow and green taxi trip records on pick-up and drop-off dates/times, pick-up and drop-off locations, trip distances, itemized fares, rate types, payment types and driver-reported passenger counts, collected by New York City Taxi and Limousine Commission (NYC TLC), are publicly available at https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page. Additionally, NYC Coronavirus Disease 2019 (COVID-19) data are available at https://github.com/nychealth/coronavirus-data. Data used in our study were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu)
Dataset Splits	Yes	The estimated mean squared prediction error (MSPE) was calculated for each metric using ten-fold cross validation, averaged over 100 runs.
Hardware Specification	Yes	The run time of the proposed regression models for different number of nodes m using R version 4.2.0 (2022-04-22) running under Darwin on MacBook Pro M1 are summarized in Figure 4.
Software Dependencies	Yes	for the practical solution of which we use the osqp (Stellato et al., 2020) package in R (R Core Team, 2022). R codes for the proposed regression models and numerical simulations are available at https://github.com/yidongzhou/Network-Regression-with-Graph-Laplacians. The run time of the proposed regression models for different number of nodes m using R version 4.2.0 (2022-04-22) running under Darwin on MacBook Pro M1 are summarized in Figure 4.
Experiment Setup	Yes	We investigated sample sizes n = 50, 100, 200, 500, 1000, with Q = 1000 Monte Carlo runs for each simulation scenario. In each iteration, random samples of pairs (Xk, Lk), k = 1, ..., n were generated by sampling Xk ∼ U(0, 1), setting m = 10, and following the above procedure. The bandwidths for the local network regression in simulation scenarios III and IV were chosen by leave-one-out cross-validation.
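
The Dataset Splits row quotes an MSPE estimated by ten-fold cross-validation and averaged over 100 runs. The Python sketch below illustrates that evaluation scheme in general terms only. It is not the authors' R code: the scalar responses, the squared-error loss, the least-squares line used as a stand-in model, and all function names (fit_line, kfold_mspe, repeated_kfold_mspe) are assumptions made for illustration. The paper itself works with graph Laplacians and network metrics.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit y = a + b*x (hypothetical stand-in for the real model)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx > 0 else 0.0
    a = y_bar - b * x_bar
    return lambda x: a + b * x


def kfold_mspe(xs, ys, k=10, rng=None):
    """Mean squared prediction error from a single random k-fold split."""
    rng = rng or random.Random()
    idx = list(range(len(xs)))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint folds covering all indices
    sq_errors = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        predict = fit_line([xs[i] for i in train], [ys[i] for i in train])
        sq_errors.extend((ys[i] - predict(xs[i])) ** 2 for i in held_out)
    return mean(sq_errors)


def repeated_kfold_mspe(xs, ys, k=10, runs=100, seed=0):
    """Average the k-fold MSPE over several independent random splits."""
    rng = random.Random(seed)
    return mean(kfold_mspe(xs, ys, k, rng) for _ in range(runs))


# Toy data (hypothetical): predictor drawn from U(0, 1), noisy linear response.
data_rng = random.Random(1)
xs = [data_rng.random() for _ in range(50)]
ys = [2.0 + 3.0 * x + data_rng.gauss(0.0, 0.1) for x in xs]
mspe = repeated_kfold_mspe(xs, ys, k=10, runs=100)
print(f"MSPE (10-fold CV, 100 runs): {mspe:.4f}")
```

Averaging over repeated random splits reduces the dependence of the estimate on any single partition of the data into folds.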