Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Network Regression with Graph Laplacians

Authors: Yidong Zhou, Hans-Georg Müller

JMLR 2022 | Venue PDF | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the usefulness and good practical performance of the proposed framework with simulations and with network data arising from resting-state fMRI in neuroimaging, as well as New York taxi records. |
| Researcher Affiliation | Academia | Yidong Zhou, EMAIL, Department of Statistics, University of California, Davis, CA 95616, USA; Hans-Georg Müller, EMAIL, Department of Statistics, University of California, Davis, CA 95616, USA |
| Pseudocode | No | The paper describes methods and mathematical derivations but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | R codes for the proposed regression models and numerical simulations are available at https://github.com/yidongzhou/Network-Regression-with-Graph-Laplacians. |
| Open Datasets | Yes | The yellow and green taxi trip records on pick-up and drop-off dates/times, pick-up and drop-off locations, trip distances, itemized fares, rate types, payment types and driver-reported passenger counts, collected by New York City Taxi and Limousine Commission (NYC TLC), are publicly available at https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page. Additionally, NYC Coronavirus Disease 2019 (COVID-19) data are available at https://github.com/nychealth/coronavirus-data. Data used in our study were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). |
| Dataset Splits | Yes | The estimated mean squared prediction error (MSPE) was calculated for each metric using ten-fold cross validation, averaged over 100 runs. |
| Hardware Specification | Yes | The run time of the proposed regression models for different number of nodes m using R version 4.2.0 (2022-04-22) running under Darwin on MacBook Pro M1 are summarized in Figure 4. |
| Software Dependencies | Yes | for the practical solution of which we use the osqp (Stellato et al., 2020) package in R (R Core Team, 2022). R codes for the proposed regression models and numerical simulations are available at https://github.com/yidongzhou/Network-Regression-with-Graph-Laplacians. The run time of the proposed regression models for different number of nodes m using R version 4.2.0 (2022-04-22) running under Darwin on MacBook Pro M1 are summarized in Figure 4. |
| Experiment Setup | Yes | We investigated sample sizes n = 50, 100, 200, 500, 1000, with Q = 1000 Monte Carlo runs for each simulation scenario. In each iteration, random samples of pairs (Xk, Lk), k = 1, ..., n were generated by sampling Xk ~ U(0, 1), setting m = 10, and following the above procedure. The bandwidths for the local network regression in simulation scenarios III and IV were chosen by leave-one-out cross-validation. |
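
The Dataset Splits entry describes an MSPE estimated by ten-fold cross-validation and averaged over repeated runs. The following is a minimal Python sketch of that kind of evaluation loop, not the authors' implementation (their code is in R, at the repository linked above). Several things here are assumptions made for illustration: the data are synthetic random graph Laplacians, the predictor is a stand-in (the entrywise mean of the training-fold Laplacians rather than the proposed regression), the loss is the squared Frobenius distance, and the function names `random_laplacian` and `repeated_kfold_mspe` are hypothetical.

```python
import numpy as np


def random_laplacian(m, rng):
    """Laplacian L = D - W of a random weighted undirected graph (synthetic data)."""
    w = np.triu(rng.uniform(0.0, 1.0, size=(m, m)), k=1)  # upper-triangular edge weights
    w = w + w.T                                           # symmetrize, zero diagonal
    return np.diag(w.sum(axis=1)) - w


def repeated_kfold_mspe(laplacians, n_folds=10, n_runs=100, seed=0):
    """Average the K-fold cross-validated MSPE over several random partitions.

    The predictor is a placeholder (training-fold mean), an assumption of this
    sketch; a real study would fit its regression model on each training fold.
    """
    rng = np.random.default_rng(seed)
    n = len(laplacians)
    run_errors = []
    for _ in range(n_runs):
        folds = np.array_split(rng.permutation(n), n_folds)
        sq_errors = np.empty(n)
        for test_idx in folds:
            train_mask = np.ones(n, dtype=bool)
            train_mask[test_idx] = False
            pred = laplacians[train_mask].mean(axis=0)       # stand-in predictor
            diff = laplacians[test_idx] - pred
            sq_errors[test_idx] = (diff ** 2).sum(axis=(1, 2))  # squared Frobenius distance
        run_errors.append(sq_errors.mean())
    return float(np.mean(run_errors))


rng = np.random.default_rng(1)
data = np.stack([random_laplacian(10, rng) for _ in range(50)])
mspe = repeated_kfold_mspe(data, n_folds=10, n_runs=5)
print(f"estimated MSPE: {mspe:.3f}")
```

Re-drawing the fold partition in every run and averaging the per-run estimates reduces the dependence of the reported MSPE on any single random split, which is the apparent purpose of averaging over 100 runs.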