Ridge Regression: Structure, Cross-Validation, and Sketching

Authors: Sifan Liu, Edgar Dobriban

ICLR 2020

Reproducibility Variable Result LLM Response
Research Type Experimental Our results are illustrated by simulations and by analyzing empirical data. We provide proofs and additional simulations in the Appendix. Code reproducing the experiments in the paper is available at https://github.com/liusf15/RidgeRegression.
Researcher Affiliation Academia Sifan Liu, Department of Statistics, Stanford University, Stanford, CA 94305, USA, sfliu@stanford.edu; Edgar Dobriban, Department of Statistics, University of Pennsylvania, Philadelphia, PA 19104, USA, dobriban@wharton.upenn.edu
Pseudocode No The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code Yes Code reproducing the experiments in the paper is available at https://github.com/liusf15/RidgeRegression.
Open Datasets Yes Left: Cross-validation on the Million Song Dataset (MSD, Bertin-Mahieux et al., 2011). For the test error, we train on 1000 training datapoints and fit on 9000 test datapoints. Right: Cross-validation on the flights dataset (Wickham, 2018). For the test error, we train on 300 datapoints and fit on 27000 test datapoints. Suppose we split the n datapoints (samples) into K equal-sized subsets, each containing n0 = n/K samples. We use the k-th subset (Xk, Yk) as the validation set and the other K − 1 subsets (X−k, Y−k), with total sample size n1 = (K − 1)n/K, as the training set. Left: we generate a training set (n = 1000, p = 700, γ = 0.7, α = σ = 1) and a test set (ntest = 500) from the same distribution. We split the training set into K = 5 equally sized folds and do cross-validation. We take n = 500, p = 550, α = 20, σ = 1, K = 5. As for train-test validation, we take 80% of the samples as the training set and the remaining 20% as the test set.
Dataset Splits Yes For the error bar, we take n = 1000, p = 90, K = 5, and average over 90 different sub-datasets. For the error bar, we take n = 300, p = 21, K = 5, and average over 180 different sub-datasets. Suppose we split the n datapoints (samples) into K equal-sized subsets, each containing n0 = n/K samples. We use the k-th subset (Xk, Yk) as the validation set and the other K − 1 subsets (X−k, Y−k), with total sample size n1 = (K − 1)n/K, as the training set. We split the training set into K = 5 equally sized folds and do cross-validation. We take n = 500, p = 550, α = 20, σ = 1, K = 5. As for train-test validation, we take 80% of the samples as the training set and the remaining 20% as the test set.
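The K-fold split quoted above (K equal folds of n0 = n/K samples; fold k held out, the remaining K − 1 folds of n1 = (K − 1)n/K samples used for training) can be sketched as follows. This is a minimal illustration, not the authors' code; the function name and random-seed handling are our own assumptions.

```python
import numpy as np

def kfold_indices(n, K, seed=None):
    """Yield (train_idx, val_idx) pairs for K-fold cross-validation.

    Assumes n is divisible by K, matching the paper's setting where
    each fold has exactly n0 = n/K samples and the training portion
    has n1 = (K - 1)n/K samples.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    folds = np.array_split(perm, K)  # K equal-sized folds
    for k in range(K):
        val_idx = folds[k]
        train_idx = np.concatenate([folds[j] for j in range(K) if j != k])
        yield train_idx, val_idx

# With n = 1000 and K = 5 (as in the MSD experiment), each validation
# fold has n0 = 200 samples and each training set has n1 = 800.
for train_idx, val_idx in kfold_indices(1000, 5, seed=0):
    assert len(val_idx) == 200 and len(train_idx) == 800
```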
Hardware Specification No The paper discusses computational complexity in terms of flop counts but does not specify any particular hardware (e.g., CPU, GPU models, memory) used for the experiments.
Software Dependencies No The paper does not provide specific software dependencies with version numbers.
Experiment Setup Yes Figure 1: Left: γ = p/n = 0.2; right: γ = 2. The data matrix X has iid Gaussian entries. The coefficient β has distribution β ∼ N(0, Ip/p), while the noise ε ∼ N(0, Ip). Figure 2: For the error bar, we take n = 1000, p = 90, K = 5, and average over 90 different sub-datasets. For the test error, we train on 1000 training datapoints and fit on 9000 test datapoints. Figure 3: Primal orthogonal sketching with n = 500, γ = 5, λ = 1.5, α = 3, σ = 1. Left: MSE of primal sketching normalized by the MSE of ridge regression. The error bar is the standard deviation over 10 repetitions. Figure 4: Right: Gaussian dual sketch when there is no noise. γ = 0.4, α = 1, λ = 1 (both for original and sketching). Standard error over 50 experiments. Figure 7: Left: we generate a training set (n = 1000, p = 700, γ = 0.7, α = σ = 1) and a test set (ntest = 500) from the same distribution. We split the training set into K = 5 equally sized folds and do cross-validation.
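The Figure 1 simulation setup quoted above (iid Gaussian design, β ∼ N(0, Ip/p), Gaussian noise, aspect ratio γ = p/n) can be sketched as below. The n·λ scaling inside the ridge solve and the seed are our assumptions for illustration; the paper may normalize the penalty differently.

```python
import numpy as np

rng = np.random.default_rng(0)

# Aspect ratio gamma = p/n = 0.2, as in the left panel of Figure 1.
n, p, lam = 1000, 200, 1.0

X = rng.standard_normal((n, p))            # iid Gaussian design
beta = rng.standard_normal(p) / np.sqrt(p)  # beta ~ N(0, I_p / p)
eps = rng.standard_normal(n)               # Gaussian noise
y = X @ beta + eps

# Ridge estimator: (X'X + n*lam*I)^{-1} X'y (penalty scaling assumed).
beta_hat = np.linalg.solve(X.T @ X + n * lam * np.eye(p), X.T @ y)

# Estimation error of the ridge fit relative to the true coefficients.
mse = np.mean((beta_hat - beta) ** 2)
```

Averaging `mse` over independent repetitions would produce the kind of error bars reported for Figures 3 and 4.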