Sparse Learning for Stochastic Composite Optimization
Authors: Weizhong Zhang, Lijun Zhang, Yao Hu, Rong Jin, Deng Cai, Xiaofei He
AAAI 2014 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on both synthetic and real-world data sets show that the proposed methods are more effective in recovering the sparse solution and have a convergence rate comparable to that of state-of-the-art SCO algorithms for sparse learning. |
| Researcher Affiliation | Academia | State Key Lab of CAD&CG, College of Computer Science, Zhejiang University, Hangzhou, China; Dept. of Computer Science & Eng., Michigan State University, East Lansing, MI, U.S.A. |
| Pseudocode | Yes | Algorithm 1 (Sparse Learning based on Existing SCO Methods) and Algorithm 2 (Sparse Learning based on the Last Solution); a generic sketch of this style of composite update appears below the table. |
| Open Source Code | No | The paper does not provide any statement or link regarding the release of open-source code for the methodology described. |
| Open Datasets | Yes | Experiments on Real-world Dataset: To further demonstrate the effectiveness of our methods, we conduct an experiment on the well-known MNIST dataset because it is easy to visualize the learned prediction model. Following (Chen, Lin, and Pena 2012), we consider solving a sparse linear regression problem. |
| Dataset Splits | No | The number N of training examples is set to be 50,000. Each digit has roughly 6,000 training examples and 1,000 testing examples. The paper describes train/test splits, but not a validation split. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory, or processor types used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers). |
| Experiment Setup | Yes | We set λ = 0.1, ρ = 0.1, d = 100, and vary σe in the range [1, 2, 3, ..., 10] in our experiments. The number N of training examples is set to be 50,000. In addition, we set α = 0.1 for α-SGD and the two proposed methods. In our experiment, we fix ρ = 0.01, and vary λ from 0.02 to 0.05. Parameter α is set to be 0.1 for α-SGD and the proposed algorithms. (A hypothetical reconstruction of the synthetic setup is sketched below the table.) |
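
The paper's pseudocode is not reproduced on this page. As background for the Pseudocode row, the sketch below shows a generic stochastic proximal-gradient update for a composite objective f(w) + λ‖w‖₁, returning the last iterate rather than the average (iterate averaging, common in classical SCO analyses, tends to destroy sparsity, which is the issue the paper's "Last Solution" variant targets). This is only an illustration of the general technique, not the paper's Algorithm 1 or 2; the names `soft_threshold` and `prox_sgd_l1` and the O(1/√t) step size are assumptions.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (elementwise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def prox_sgd_l1(X, y, lam, n_iters=50_000, eta0=0.1, seed=0):
    """Generic stochastic proximal gradient for
    (1/2)(x^T w - y)^2 + lam * ||w||_1 (a sketch, not the paper's method).

    Returns the last iterate, which stays sparse thanks to the prox step.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for t in range(1, n_iters + 1):
        i = rng.integers(n)                          # sample one training example
        g = (X[i] @ w - y[i]) * X[i]                 # stochastic gradient of the squared loss
        eta = eta0 / np.sqrt(t)                      # assumed O(1/sqrt(t)) step size
        w = soft_threshold(w - eta * g, eta * lam)   # composite (proximal) update
    return w
```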
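
For the Experiment Setup row, the following is a hypothetical reconstruction of the synthetic setting using the reported scales (d = 100, N = 50,000, λ = 0.1, σe varied over [1, 10]). The Gaussian design and the number of nonzeros in the ground-truth model are placeholders, since the table does not specify them, and the ρ and α parameters are algorithm-specific knobs not modeled here. It reuses `prox_sgd_l1` from the sketch above.

```python
import numpy as np

def make_synthetic(d=100, N=50_000, sigma_e=1.0, nnz=10, seed=0):
    """Hypothetical generator for the synthetic sparse-regression experiment.

    d, N, and sigma_e follow the Experiment Setup row; the Gaussian design
    and nnz (ground-truth nonzeros) are assumptions.
    """
    rng = np.random.default_rng(seed)
    w_star = np.zeros(d)
    support = rng.choice(d, size=nnz, replace=False)  # random sparse support
    w_star[support] = rng.standard_normal(nnz)
    X = rng.standard_normal((N, d))
    y = X @ w_star + sigma_e * rng.standard_normal(N)  # noisy linear responses
    return X, y, w_star

# Example run combining both sketches, with lambda = 0.1 as reported:
X, y, w_star = make_synthetic(sigma_e=3.0)
w_hat = prox_sgd_l1(X, y, lam=0.1)
print("nonzeros recovered:", np.count_nonzero(w_hat), "of", np.count_nonzero(w_star))
```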