Path following algorithms for $\ell_2$-regularized $M$-estimation with approximation guarantee
Authors: Yunzhang Zhu, Renxiong Liu
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical results also corroborate our theoretical analysis. |
| Researcher Affiliation | Collaboration | Yunzhang Zhu, Department of Statistics, Ohio State University, Columbus, OH 43015, zhu.219@osu.edu; Renxiong Liu, Statistics and Data Science team, Nokia Bell Labs, Murray Hill, NJ 07974, renxiong.liu@nokia-bell-labs.com |
| Pseudocode | Yes | Algorithm 1: A general path following algorithm. Input: $\epsilon > 0$, $C_0 \geq 1/4$, $c_1 \geq 1$, $c_2 > 0$, $0 < \alpha_{\max} \leq 1$, and $t_{\max} \in (0, \infty]$. Output: grid points $\{t_k\}_{k=1}^N$ and an approximated solution path $\theta(t)$. 1: Initialize k = 1. 2: Compute $\alpha_1$ using (12). Starting from 0, iteratively calculate $\theta_1$ by minimizing $f_{t_1}(\theta)$ until (8) is satisfied for k = 1. 3: while (14) is not satisfied do 4: Compute $\alpha_{k+1}$ using (13); update $t_{k+1} = t_k + \alpha_{k+1}$. 5: Starting from $\theta_k$, iteratively compute $\theta_{k+1}$ by minimizing $f_{t_{k+1}}(\theta)$ until (8) is satisfied. 6: Update k = k + 1. 7: Interpolation: construct a solution path $\theta(t)$ through linear interpolation of $\{\theta_k\}_{k=1}^N$ using (3). (A hedged code sketch of this loop follows the table.) |
| Open Source Code | No | The paper mentions implementing methods in R using the Rcpp package and using LIBSVM, but does not state that its own source code for the methodology is openly available or provide a link. |
| Open Datasets | Yes | Example 3: Real data example. This example fits an ℓ2-regularized logistic regression using the a9a dataset from LIBSVM [Chang and Lin, 2011]. |
| Dataset Splits | No | The paper mentions 'training data' and 'simulated datasets' but does not provide specific details on the training/validation/test splits (e.g., percentages, sample counts, or methodology). |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used for the experiments. |
| Software Dependencies | No | All methods considered in the numerical studies are implemented in R using the Rcpp package [Eddelbuettel et al., 2011; Eddelbuettel, 2013]. While R and Rcpp are mentioned, specific version numbers for these software components are not provided. |
| Experiment Setup | Yes | Example 1: Ridge regression. This example considers ridge regression over a simulated dataset. In this case, the empirical loss function is $L_n(\theta) = \|Y - X\theta\|_2^2/(2n)$, where $X \in \mathbb{R}^{n \times p}$ and $Y \in \mathbb{R}^n$ denote the design matrix and the response vector. In the simulation, the data $(X, Y)$ are generated from the usual linear regression model $Y = X\theta^* + \epsilon$, where $\epsilon \sim N(0, I_{n \times n})$, $\theta^* = (1/\sqrt{p}, \ldots, 1/\sqrt{p})^\top$, and rows of $X$ are IID samples from $N(0, I_{p \times p})$. Throughout this example, we consider $n = 1000$ and $p = 500, 10000$. ... Example 2: $\ell_2$-regularized logistic regression. ... We simulate the data $(X, Y)$ from a linear discriminant analysis (LDA) model. More specifically, we sample the components of $Y$ independently from a Bernoulli distribution with $P(Y_i = +1) = 1/2$, $i = 1, 2, \ldots, n$. Conditioned on $Y_i$, the $X_i$'s are then independently drawn from $N(Y_i \mu, \sigma^2 I_{p \times p})$, where $\mu \in \mathbb{R}^p$ and $\sigma^2 > 0$. ... Here we choose $\mu = (1/\sqrt{p}, \ldots, 1/\sqrt{p})^\top$ and $\sigma^2 = 1$ so that the Bayes risk is $\Phi(-1)$, which is approximately 15%. Similar to Example 1, two different problem dimensions are considered: $n = 1000$ and $p = 500, 10000$. ... Throughout, we let $t_{\max} = 10$. (A hedged data-generation sketch follows the table.) |
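The pseudocode above references the paper's equations (8) and (12)–(14), which the table does not reproduce. As a reading aid, here is a minimal Python sketch of the warm-started path-following loop, specialized to ridge regression under the assumed parametrization $f_t(\theta) = \|Y - X\theta\|_2^2/(2n) + \|\theta\|_2^2/(2t)$; the grid rule, inner stopping test, and termination test below are simple placeholders standing in for the paper's (12)–(13), (8), and (14), not the actual rules.

```python
# Hypothetical sketch of Algorithm 1's warm-started path following,
# specialized to ridge regression. NOT the paper's implementation:
# the grid update, inner stopping test, and termination test are
# placeholders for the unreproduced equations (12)-(13), (8), and (14).
import numpy as np

def path_following(X, Y, eps=1e-6, t1=0.01, alpha=0.1, t_max=10.0):
    """Trace grid points t_k and warm-started minimizers theta_k of
    f_t(theta) = ||Y - X theta||^2/(2n) + ||theta||^2/(2t)."""
    n, p = X.shape
    lipschitz_base = np.linalg.norm(X, 2) ** 2 / n  # sigma_max(X)^2 / n
    theta = np.zeros(p)                             # step 2: start from 0
    grid, path = [], []
    t = t1
    while t <= t_max:                               # placeholder for rule (14)
        lam = 1.0 / t                               # ridge weight at this grid point
        step = 1.0 / (lipschitz_base + lam)         # 1/L gradient step for f_t
        for _ in range(10_000):                     # inner solver, warm-started
            grad = X.T @ (X @ theta - Y) / n + lam * theta
            if np.linalg.norm(grad) <= eps:         # placeholder for rule (8)
                break
            theta -= step * grad
        grid.append(t)
        path.append(theta.copy())
        t *= 1.0 + alpha                            # placeholder for t_{k+1} = t_k + alpha_{k+1}
    return np.asarray(grid), np.asarray(path)
```

Step 7 of Algorithm 1 would then linearly interpolate between consecutive rows of `path` to extend $\theta(t)$ to off-grid values of $t$.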
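The simulated designs quoted in Examples 1 and 2 can likewise be regenerated directly from the stated distributions. A self-contained sketch, assuming a fixed seed and the smaller reported setting $(n, p) = (1000, 500)$:

```python
# Hedged reproduction of the simulated data in Examples 1 and 2,
# following the distributions quoted above. The seed is an assumption.
import numpy as np

rng = np.random.default_rng(0)
n, p = 1000, 500                                 # smaller of the two reported settings

# Example 1: linear model Y = X theta* + eps with Gaussian design and noise.
theta_star = np.full(p, 1 / np.sqrt(p))          # theta* = (1/sqrt(p), ..., 1/sqrt(p))
X = rng.standard_normal((n, p))                  # rows IID N(0, I_p)
Y = X @ theta_star + rng.standard_normal(n)      # eps ~ N(0, I_n)

# Example 2: LDA model with mu = (1/sqrt(p), ..., 1/sqrt(p)) and sigma^2 = 1.
labels = rng.choice([-1.0, 1.0], size=n)         # P(Y_i = +1) = 1/2
X_lda = labels[:, None] * theta_star + rng.standard_normal((n, p))  # X_i | Y_i ~ N(Y_i mu, I_p)
```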