Sparse Learning for Stochastic Composite Optimization

Authors: Weizhong Zhang, Lijun Zhang, Yao Hu, Rong Jin, Deng Cai, Xiaofei He

AAAI 2014 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on both synthetic and real-world data sets show that the proposed methods are more effective in recovering the sparse solution and have a convergence rate comparable to the state-of-the-art SCO algorithms for sparse learning.
Researcher Affiliation | Academia | State Key Lab of CAD&CG, College of Computer Science, Zhejiang University, Hangzhou, China; Dept. of Computer Science & Eng., Michigan State University, East Lansing, MI, U.S.A.
Pseudocode | Yes | Algorithm 1 (Sparse Learning based on Existing SCO Methods) and Algorithm 2 (Sparse Learning based on the Last Solution)
Open Source Code | No | The paper does not provide any statement or link regarding the release of open-source code for the methodology described.
Open Datasets | Yes | To further demonstrate the effectiveness of our methods, we conduct an experiment on the well-known MNIST dataset because it is easy to visualize the learned prediction model. Following (Chen, Lin, and Pena 2012), we consider solving a sparse linear regression problem.
Dataset Splits | No | The number N of training examples is set to be 50,000. Each digit has roughly 6,000 training examples and 1,000 testing examples. The paper describes train/test splits, but not a validation split.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory, or processor types used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers).
Experiment Setup | Yes | We set λ = 0.1, ρ = 0.1, d = 100, and vary σe in the range [1, 2, 3, ..., 10] in our experiments. The number N of training examples is set to be 50,000. In addition, we set α = 0.1 for α-SGD and the two proposed methods. In our experiment, we fix ρ = 0.01, and vary λ from 0.02 to 0.05. Parameter α is set to be 0.1 for α-SGD and the proposed algorithms.
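
For reference, the synthetic setup quoted in the table can be sketched as follows. This is a minimal illustration, assuming a least-squares loss with an ℓ1 (λ) plus ℓ2 (ρ) composite regularizer, a d-dimensional ground truth with 10 non-zero entries (an illustrative choice, not stated in the quoted text), Gaussian noise with standard deviation σe, and plain proximal stochastic gradient descent as a stand-in optimizer; it is not a reproduction of the paper's Algorithm 1 or Algorithm 2.

import numpy as np

# Hypothetical reconstruction of the synthetic experiment (assumptions noted above).
rng = np.random.default_rng(0)

d, N = 100, 50_000                   # dimension and number of training examples (from the table)
lam, rho, sigma_e = 0.1, 0.1, 1.0    # lambda, rho, and one value of sigma_e from the stated range

# Sparse ground truth: assume 10 non-zero coordinates (illustrative choice).
w_star = np.zeros(d)
support = rng.choice(d, size=10, replace=False)
w_star[support] = rng.normal(size=10)

# Stream of training examples: x ~ N(0, I_d), y = x^T w_star + Gaussian noise.
X = rng.normal(size=(N, d))
y = X @ w_star + sigma_e * rng.normal(size=N)

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

# Proximal SGD: gradient step on the smooth part (squared loss + (rho/2)||w||^2),
# then soft-thresholding to handle the l1 term.
w = np.zeros(d)
for t, (x_t, y_t) in enumerate(zip(X, y), start=1):
    eta = 1.0 / np.sqrt(t)                      # decaying step size
    grad = (x_t @ w - y_t) * x_t + rho * w      # stochastic gradient of the smooth part
    w = soft_threshold(w - eta * grad, eta * lam)

print("non-zeros recovered:", np.count_nonzero(np.abs(w) > 1e-3),
      "| true support size:", len(support))

The soft-thresholding step is what yields exact zeros in the iterate, which is the sparsity-recovery behavior the paper's experiments evaluate.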