Stochastic Variance Reduced Optimization for Nonconvex Sparse Learning
Authors: Xingguo Li, Tuo Zhao, Raman Arora, Han Liu, Jarvis Haupt
ICML 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments demonstrate the efficiency of our method in terms of both parameter estimation and computational performance. We compare the empirical performance of the SVR-GHT algorithm with two other candidate algorithms: GHT proposed in (Jain et al., 2014) and SGHT proposed in (Nguyen et al., 2014) on both synthetic data and real data. |
| Researcher Affiliation | Academia | Department of Electrical and Computer Engineering, University of Minnesota, Minneapolis, MN 55455; Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218; Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08544 |
| Pseudocode | Yes | We summarize the proposed stochastic variance reduced gradient hard thresholding (SVR-GHT) algorithm in Algorithm 1. Algorithm 1: Stochastic Variance Reduced Gradient Hard Thresholding Algorithm. (A runnable sketch of this loop appears after the table.) |
| Open Source Code | No | The paper does not provide any concrete access information (e.g., specific repository link, explicit code release statement) for the source code of the methodology described. |
| Open Datasets | Yes | We consider a sparse linear regression problem. We generate each row of the design matrix $A_i \in \mathbb{R}^d$ independently from $N(0, \Sigma)$. The response vector is generated from the linear model $y = Aw + \epsilon$, where $w \in \mathbb{R}^d$ is the regression coefficient vector, and $\epsilon \in \mathbb{R}^n$ is generated from $N(0, \sigma^2 I)$ with $\sigma = 1$. We adopt a subset of the RCV1 dataset with 9625 documents and 29992 distinct words, including the classes C15, ECAT, GCAT, and MCAT (Cai & He, 2012). |
| Dataset Splits | No | The paper does not explicitly provide information about a validation dataset split (e.g., percentages, sample counts, or methodology for a distinct validation set). |
| Hardware Specification | No | The paper does not provide any specific hardware details (e.g., GPU/CPU models, memory amounts, or computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | We set $n = 10000$, $d = 25000$, $k^* = 200$, and $k = 500$. For simplicity, we choose $m = n$ throughout our experiments. We illustrate step sizes $\eta = 1/256$, $1/512$, and $1/1024$. We choose $k = 200$ and $m = n$ for both settings of all classes. The optimal step size $\eta$ for each algorithm is chosen from the sequence $\{1/2^5, 1/2^6, \ldots, 1/2^{14}\}$. (A sketch of this setup follows the table.) |
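
For concreteness, here is a minimal NumPy sketch of a variance-reduced gradient hard thresholding loop in the spirit of Algorithm 1, specialized to the squared loss. The function and variable names (`svr_ght`, `hard_threshold`, the defaults) are ours, not the authors'; the paper's Algorithm 1 should be consulted for details such as the exact reference-point update.

```python
import numpy as np

def hard_threshold(w, k):
    """Keep the k largest-magnitude entries of w, zero out the rest."""
    idx = np.argsort(np.abs(w))[:-k]  # indices of all but the top-k entries
    out = w.copy()
    out[idx] = 0.0
    return out

def svr_ght(A, y, k, eta, n_outer=30, m=None, batch=1, seed=0):
    """SVRG-style stochastic gradient descent with hard thresholding for
    sparse least squares: min_w (1/2n) * ||y - A w||^2  s.t.  ||w||_0 <= k.
    A sketch under our assumptions, not the authors' reference code."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    m = n if m is None else m                      # the experiments use m = n
    w_tilde = np.zeros(d)                          # outer-loop reference point
    for _ in range(n_outer):
        full_grad = A.T @ (A @ w_tilde - y) / n    # full gradient at reference
        w = w_tilde.copy()
        for _ in range(m):
            i = rng.integers(n, size=batch)        # sampled mini-batch indices
            # variance-reduced stochastic gradient
            g = (A[i].T @ (A[i] @ w - y[i])
                 - A[i].T @ (A[i] @ w_tilde - y[i])) / batch + full_grad
            w = hard_threshold(w - eta * g, k)     # gradient step + projection
        # update reference with the last inner iterate; some SVRG variants
        # instead pick a randomly chosen inner iterate
        w_tilde = w
    return w_tilde
```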
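
The synthetic setup quoted in the Experiment Setup row (rows of $A$ drawn from $N(0, \Sigma)$, noise $\sigma = 1$, step-size grid $\{1/2^5, \ldots, 1/2^{14}\}$) can be sketched as below. The dimensions are scaled down so the snippet runs quickly, $\Sigma = I$ is assumed for brevity, and the coefficient values on the support are a hypothetical choice.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sizes scaled down for illustration; the paper's setup uses
# n = 10000, d = 25000, k* = 200, k = 500.
n, d, k_star = 500, 1000, 20
A = rng.standard_normal((n, d))        # rows ~ N(0, I); the paper allows general Sigma
w_star = np.zeros(d)
support = rng.choice(d, size=k_star, replace=False)
w_star[support] = rng.standard_normal(k_star)  # hypothetical coefficient values
y = A @ w_star + rng.standard_normal(n)        # noise sigma = 1, as quoted

# Step-size grid {1/2^5, ..., 1/2^14} from which the optimal eta is chosen.
etas = [1.0 / 2 ** j for j in range(5, 15)]
```

Each `eta` in `etas` would then be passed to the `svr_ght` sketch above, keeping the step size that yields the smallest estimation error $\|\hat{w} - w^*\|_2$, mirroring how the paper selects the optimal step size per algorithm.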