One-shot Distributed Ridge Regression in High Dimensions
Authors: Yue Sheng, Edgar Dobriban
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our results are supported by simulations and real data analysis. ... Section 4 contains experiments on real data. ... We confirm these results in detailed simulation studies and on an empirical data example, using the Million Song Dataset. |
| Researcher Affiliation | Academia | Wharton Statistics Department, University of Pennsylvania, Philadelphia, PA, USA; Graduate Group in Applied Mathematics and Computational Science, University of Pennsylvania, Philadelphia, PA, USA. |
| Pseudocode | Yes | Algorithm 1: Optimally weighted distributed ridge regression (see the sketch after this table). |
| Open Source Code | No | No statement or link providing concrete access to source code for the methodology was found. |
| Open Datasets | Yes | Million Song Year Prediction Dataset (MSD) (Bertin-Mahieux et al., 2011). ... We download the dataset from the UC Irvine Machine Learning Repository. |
| Dataset Splits | Yes | For each experiment, we randomly choose ntrain = 10,000 samples from the training set and ntest = 1,000 samples from the test set. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for experiments are mentioned. |
| Software Dependencies | No | No specific software versions (e.g., Python 3.8, PyTorch 1.9) are mentioned. |
| Experiment Setup | Yes | We choose the number of machines to be k = 1, 10, 20, 50, 100, 500, 1,000, and we distribute the data evenly across the k machines. ... We repeat the experiment T = 100 times. ... The estimator using only a fraction 1/k of the data is just one of the local estimators. For this estimator, we choose the tuning parameter λ = kp/(n_train · α̂²). (A hedged sketch of this setup also follows the table.) |
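
The Pseudocode row above only names Algorithm 1 (optimally weighted distributed ridge regression). As a rough illustration of the one-shot pattern it follows, the sketch below fits a local ridge estimator on each machine's share of the data and returns a weighted average of the local estimates. The function names, the dense solve, and the uniform placeholder weights are assumptions for illustration; the paper's Algorithm 1 instead uses optimal weights computed from asymptotic (random-matrix-theory) quantities that are not reproduced here.

```python
import numpy as np

def local_ridge(X, y, lam):
    # Local ridge estimator on one machine: (X'X + lam*I)^{-1} X'y.
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def one_shot_distributed_ridge(X, y, k, lam, weights=None):
    # One-shot scheme: split the rows over k machines, fit a local ridge
    # estimate on each split, then combine the estimates by a weighted average.
    X_parts = np.array_split(X, k)
    y_parts = np.array_split(y, k)
    local_estimates = [local_ridge(Xi, yi, lam) for Xi, yi in zip(X_parts, y_parts)]
    if weights is None:
        # Placeholder: uniform weights. The paper's optimally weighted version
        # chooses these weights via random-matrix-theory formulas.
        weights = np.full(k, 1.0 / k)
    return sum(w * b for w, b in zip(weights, local_estimates))
```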
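Likewise, the Dataset Splits and Experiment Setup rows can be read as a short experimental driver. The sketch below is a hypothetical re-implementation, not the authors' code: the variable names, the use of the first even split as the 1/k-fraction local estimator, and the `alpha_hat` argument (the estimated signal strength appearing in the quoted tuning rule λ = kp/(n_train · α̂²)) are assumptions for illustration. The subsample sizes n_train = 10,000, n_test = 1,000 and the T = 100 repetitions come from the table above.

```python
import numpy as np

def run_msd_style_experiment(X_train_pool, y_train_pool, X_test_pool, y_test_pool,
                             alpha_hat, k, n_train=10_000, n_test=1_000, T=100, seed=0):
    # Hypothetical re-run of the quoted setup: T random train/test subsamples,
    # data spread evenly over k machines, and a single-machine baseline that
    # uses a 1/k fraction of the data with lambda = k*p / (n_train * alpha_hat**2).
    rng = np.random.default_rng(seed)
    p = X_train_pool.shape[1]
    lam_local = k * p / (n_train * alpha_hat ** 2)
    test_errors = []
    for _ in range(T):
        tr = rng.choice(X_train_pool.shape[0], n_train, replace=False)
        te = rng.choice(X_test_pool.shape[0], n_test, replace=False)
        X_tr, y_tr = X_train_pool[tr], y_train_pool[tr]
        X_te, y_te = X_test_pool[te], y_test_pool[te]
        # Local estimator fitted on one of the k even splits of the training subsample.
        m = n_train // k
        X_loc, y_loc = X_tr[:m], y_tr[:m]
        beta = np.linalg.solve(X_loc.T @ X_loc + lam_local * np.eye(p), X_loc.T @ y_loc)
        test_errors.append(np.mean((X_te @ beta - y_te) ** 2))
    return float(np.mean(test_errors))
```

Averaging the test error over the T repetitions mirrors the repeated-subsampling protocol quoted in the Experiment Setup row; the distributed estimators from the previous sketch could be evaluated inside the same loop for comparison.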