Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Network Regression with Graph Laplacians
Authors: Yidong Zhou, Hans-Georg Müller
JMLR 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the usefulness and good practical performance of the proposed framework with simulations and with network data arising from resting-state fMRI in neuroimaging, as well as New York taxi records. |
| Researcher Affiliation | Academia | Yidong Zhou EMAIL Department of Statistics University of California Davis, CA 95616, USA; Hans-Georg Müller EMAIL Department of Statistics University of California Davis, CA 95616, USA |
| Pseudocode | No | The paper describes methods and mathematical derivations but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | R codes for the proposed regression models and numerical simulations are available at https://github.com/yidongzhou/Network-Regression-with-Graph-Laplacians. |
| Open Datasets | Yes | The yellow and green taxi trip records on pick-up and drop-off dates/times, pick-up and drop-off locations, trip distances, itemized fares, rate types, payment types and driver-reported passenger counts, collected by New York City Taxi and Limousine Commission (NYC TLC), are publicly available at https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page. Additionally, NYC Coronavirus Disease 2019 (COVID-19) data are available at https://github.com/nychealth/coronavirus-data. Data used in our study were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu) |
| Dataset Splits | Yes | The estimated mean squared prediction error (MSPE) was calculated for each metric using ten-fold cross validation, averaged over 100 runs. |
| Hardware Specification | Yes | The run time of the proposed regression models for different number of nodes m using R version 4.2.0 (2022-04-22) running under Darwin on MacBook Pro M1 are summarized in Figure 4. |
| Software Dependencies | Yes | for the practical solution of which we use the osqp (Stellato et al., 2020) package in R (R Core Team, 2022). R codes for the proposed regression models and numerical simulations are available at https://github.com/yidongzhou/Network-Regression-with-Graph-Laplacians. The run time of the proposed regression models for different number of nodes m using R version 4.2.0 (2022-04-22) running under Darwin on MacBook Pro M1 are summarized in Figure 4. |
| Experiment Setup | Yes | We investigated sample sizes n = 50, 100, 200, 500, 1000, with Q = 1000 Monte Carlo runs for each simulation scenario. In each iteration, random samples of pairs (Xk, Lk), k = 1, ..., n were generated by sampling Xk ∼ U(0, 1), setting m = 10, and following the above procedure. The bandwidths for the local network regression in simulation scenarios III and IV were chosen by leave-one-out cross-validation. |
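
The Dataset Splits row describes an MSPE estimated by ten-fold cross-validation and averaged over 100 runs. The sketch below illustrates that evaluation loop only, under stated assumptions. The function names (`repeated_kfold_mspe`, `baseline_fit_predict`), the squared Frobenius distance, and the training-mean predictor are hypothetical placeholders chosen for illustration; they are not the paper's regression models, its metrics, or its R code.

```python
import random
from statistics import fmean


def frobenius_sq(a, b):
    """Squared Frobenius distance between two equally sized matrices."""
    return sum((x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def mean_matrix(mats):
    """Entrywise mean of a list of equally sized matrices."""
    n = len(mats)
    rows, cols = len(mats[0]), len(mats[0][0])
    return [[sum(m[i][j] for m in mats) / n for j in range(cols)] for i in range(rows)]


def baseline_fit_predict(train_x, train_y, test_x):
    """Placeholder predictor (assumption): ignore x, return the training mean."""
    center = mean_matrix(train_y)
    return [center for _ in test_x]


def repeated_kfold_mspe(xs, ys, fit_predict, k=10, runs=100, seed=0):
    """Average the k-fold cross-validated MSPE over several random partitions."""
    rng = random.Random(seed)
    n = len(xs)
    run_scores = []
    for _ in range(runs):
        idx = list(range(n))
        rng.shuffle(idx)
        # Split the shuffled indices into k nearly equal folds.
        folds = [idx[i::k] for i in range(k)]
        sq_errors = []
        for fold in folds:
            held_out = set(fold)
            train = [i for i in idx if i not in held_out]
            preds = fit_predict(
                [xs[i] for i in train],
                [ys[i] for i in train],
                [xs[i] for i in fold],
            )
            # Squared prediction error for each held-out observation.
            sq_errors.extend(frobenius_sq(p, ys[i]) for p, i in zip(preds, fold))
        run_scores.append(fmean(sq_errors))
    return fmean(run_scores)
```

Any model can be plugged in through `fit_predict`, which takes training predictors, training responses, and test predictors and returns one predicted matrix per test point. Swapping `frobenius_sq` for another distance would give the MSPE under a different metric.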