Dual Iterative Hard Thresholding: From Non-convex Sparse Minimization to Non-smooth Concave Maximization
Authors: Bo Liu, Xiao-Tong Yuan, Lezi Wang, Qingshan Liu, Dimitris N. Metaxas
ICML 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical results demonstrate the superiority of the dual IHT algorithms over state-of-the-art primal IHT-style algorithms in model estimation accuracy and computational efficiency. This section is dedicated to demonstrating the accuracy and efficiency of the proposed algorithms. We first show the model estimation performance of DIHT when applied to sparse ridge regression models on synthetic datasets. Then we evaluate the efficiency of DIHT/SDIHT on sparse ℓ2-regularized Huber loss and Hinge loss minimization tasks using real-world datasets. |
| Researcher Affiliation | Academia | 1Department of CS, Rutgers University, Piscataway, NJ, 08854, USA. 2B-DAT Lab, Nanjing University of Information Science & Technology, Nanjing, Jiangsu, 210044, China. |
| Pseudocode | Yes | Algorithm 1: Dual Iterative Hard Thresholding (DIHT). Input: training set {x_i, y_i}_{i=1..N}, regularization strength parameter λ, cardinality constraint k, step size η. Initialization: w^(0) = 0, α^(0)_1 = ... = α^(0)_N = 0. For t = 1, 2, ..., T: (S1) Dual projected super-gradient ascent: for all i ∈ {1, ..., N}, α^(t)_i = P_F(α^(t−1)_i + η^(t−1) g^(t−1)_i) (Eq. 6), where g^(t−1)_i = (1/N)(x_i^⊤ w^(t−1) − l_i^{*′}(α^(t−1)_i)) is the super-gradient and P_F(·) is the Euclidean projection operator with respect to the feasible set F. (S2) Primal hard thresholding: w^(t) = H_k((1/(λN)) Σ_{i=1}^N α^(t)_i x_i) (Eq. 7). Output: w^(T). (An illustrative sketch of this loop appears below the table.) |
| Open Source Code | No | The paper does not provide any specific links or explicit statements about the availability of open-source code for the described methodology. |
| Open Datasets | Yes | Two binary benchmark datasets from the LibSVM data repository (https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html), RCV1 (d = 47,236) and News20 (d = 1,355,191), are used for algorithm efficiency evaluation and comparison. |
| Dataset Splits | Yes | We use 50 data copies as a validation set to select the parameter λ from {10^-6, ..., 10^2}, and the percentage of successful support recovery is evaluated on the other 100 data copies. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper mentions the LibSVM data repository as a source for datasets but does not provide specific software dependencies with version numbers for the libraries or solvers used in its implementation. |
| Experiment Setup | Yes | The parameter update step size of all the considered algorithms is tuned by grid search. The parameter γ is set to 0.25. For the two stochastic algorithms, SDIHT and SVRGHT, we randomly partition the training data into \|B\| = 10 mini-batches (a hedged sketch of this mini-batch update appears below the table). |
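
To make the two-step loop quoted in the Pseudocode row concrete, below is a minimal NumPy sketch, not the authors' implementation, specialized to the sparse ℓ2-regularized least-squares (ridge) setting of the paper's synthetic experiments. It assumes the standard SDCA-style sign convention for the conjugate of the squared loss, under which the dual super-gradient reduces to (1/N)(y_i − α_i − x_i^⊤w) and the feasible set F is all of ℝ, so the projection P_F is a no-op. The function names `diht_ridge` and `hard_threshold` and the conservative step-size heuristic are illustrative choices; the paper instead tunes the step size by grid search.

```python
import numpy as np

def hard_threshold(z, k):
    """H_k(z): keep the k largest-magnitude entries of z and zero out the rest."""
    w = np.zeros_like(z)
    if k > 0:
        keep = np.argsort(np.abs(z))[-k:]
        w[keep] = z[keep]
    return w

def diht_ridge(X, y, lam, k, eta=None, T=2000):
    """Illustrative DIHT loop for sparse l2-regularized least squares.

    Primal:  min_{||w||_0 <= k}  (1/(2N)) sum_i (x_i^T w - y_i)^2 + (lam/2)||w||^2
    (S1) dual projected super-gradient ascent on alpha (for squared loss the
         feasible set F is all of R, so the projection is the identity);
    (S2) primal hard thresholding of (1/(lam*N)) sum_i alpha_i x_i.
    """
    N, d = X.shape
    if eta is None:
        # conservative step size: inverse of a bound on the Lipschitz constant
        # of the squared-loss dual gradient, (1/N)(1 + sigma_max(X)^2/(lam*N))
        eta = N / (1.0 + np.linalg.norm(X, 2) ** 2 / (lam * N))
    alpha = np.zeros(N)
    w = np.zeros(d)
    for _ in range(T):
        # (S1) dual super-gradient; for squared loss it is (1/N)(y_i - alpha_i - x_i^T w)
        g = (y - alpha - X @ w) / N
        alpha = alpha + eta * g
        # (S2) hard-threshold the dense primal candidate back to k-sparsity
        w = hard_threshold(X.T @ alpha / (lam * N), k)
    return w, alpha

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N, d, k = 200, 50, 5
    w_true = np.zeros(d)
    w_true[:k] = rng.choice([-1.0, 1.0], size=k)   # well-separated planted signal
    X = rng.normal(size=(N, d))
    y = X @ w_true + 0.01 * rng.normal(size=N)
    w_hat, _ = diht_ridge(X, y, lam=1e-2, k=k)
    print("recovered support:", sorted(np.flatnonzero(w_hat)))
```

The hard-thresholding step is what keeps every primal iterate exactly k-sparse; the dual ascent step only ever touches the N-dimensional dual vector, which is the source of the efficiency advantage the paper reports for high-dimensional data.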
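The Experiment Setup row mentions partitioning the training data into |B| = 10 mini-batches for the stochastic algorithms. The hedged sketch below, which reuses `np` and `hard_threshold` from the block above, shows one plausible mini-batch restriction of step (S1) in the spirit of SDIHT: only the dual variables of one randomly drawn partition block are updated per iteration. It recomputes the primal candidate from scratch rather than maintaining the incrementally updated intermediate iterate of the paper's Algorithm 2, and `sdiht_ridge`, `n_batches`, and `seed` are illustrative names and parameters, not the paper's.

```python
def sdiht_ridge(X, y, lam, k, n_batches=10, eta=None, T=3000, seed=0):
    # Reuses numpy (np) and hard_threshold() from the DIHT sketch above.
    rng = np.random.default_rng(seed)
    N, d = X.shape
    if eta is None:
        # same conservative step-size heuristic as in the DIHT sketch
        eta = N / (1.0 + np.linalg.norm(X, 2) ** 2 / (lam * N))
    # fixed random partition of the training indices into |B| = n_batches blocks
    batches = np.array_split(rng.permutation(N), n_batches)
    alpha = np.zeros(N)
    w = np.zeros(d)
    for _ in range(T):
        B = batches[rng.integers(n_batches)]       # sample one mini-batch block
        g_B = (y[B] - alpha[B] - X[B] @ w) / N     # super-gradients on the block only
        alpha[B] = alpha[B] + eta * g_B            # (S1) restricted to the block
        w = hard_threshold(X.T @ alpha / (lam * N), k)   # (S2) as in DIHT
    return w, alpha
```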