Gradient Descent with Proximal Average for Nonconvex and Composite Regularization

Authors: Wenliang Zhong, James Kwok

AAAI 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results on a number of synthetic and real-world data sets demonstrate the effectiveness and efficiency of the proposed optimization algorithm, and also the improved classification performance resulting from the nonconvex regularizers."
Researcher Affiliation | Academia | "Leon Wenliang Zhong, James T. Kwok, Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong, {wzhong, jamesk}@cse.ust.hk"
Pseudocode | No | The paper describes the proposed algorithms using mathematical equations and textual descriptions, but does not include a formal pseudocode block or an explicitly labeled "Algorithm" section.
Open Source Code | No | The paper does not contain an explicit statement or link indicating that the source code for the described methodology is open source or publicly available.
Open Datasets | Yes | "Experiments are performed on the 20newsgroup data set, which contains 16,242 samples with 100 binary features (words)." The footnoted source is http://www.cs.nyu.edu/~roweis/data.html.
Dataset Splits | Yes | "We use 1% of the data for training, 80% for testing, and the rest for validation." / "40% of the data are randomly chosen for training, another 20% for validation, and the rest for testing."
Hardware Specification | Yes | "Experiments are performed on a PC with Intel i7-2600K CPU and 32GB memory."
Software Dependencies | No | The paper states that "All the algorithms are implemented in MATLAB, except for the proximal step in SCP which is based on the C++ code in the SLEP package (Liu, Ji, and Ye 2009)", but it does not specify version numbers for MATLAB, the C++ toolchain, or the SLEP package used in the implementation.
Experiment Setup | Yes | "For (22), we vary (K, n) in {(5, 500), (10, 1000), (20, 2000), (30, 3000)}, and set λ = K/10, θ = 0.1. For (23), we set K = 10, n = 1000, and vary (λ, θ) in {(0.1, 0.1), (1, 10), (10, 10), (100, 100)}." The stepsize η is set to 1/(2Lℓ), where Lℓ is the largest eigenvalue of (1/n)SᵀS; ηmax = 100/Lℓ and ηmin = 0.01/Lℓ. Condition (15) is checked with f (rather than f̂) and L = 10^5.
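A minimal sketch of how the step-size constants in this setup could be computed: Lℓ is the largest eigenvalue of (1/n)SᵀS, and η, ηmax, ηmin are derived from it. The data matrix S below is synthetic and purely illustrative; only the formulas follow the quoted setup.

```python
import numpy as np

# Synthetic stand-in for the data matrix S (n samples x d features);
# the paper's S is the actual training design matrix.
rng = np.random.default_rng(0)
n, d = 1000, 50
S = rng.standard_normal((n, d))

# L_ell: largest eigenvalue of (1/n) S^T S.
# eigvalsh returns eigenvalues of a symmetric matrix in ascending order.
L_ell = np.linalg.eigvalsh(S.T @ S / n)[-1]

# Step sizes as described in the experiment setup.
eta = 1.0 / (2.0 * L_ell)      # stepsize η = 1/(2 L_ell)
eta_max = 100.0 / L_ell        # η_max = 100 / L_ell
eta_min = 0.01 / L_ell         # η_min = 0.01 / L_ell
print(L_ell, eta_min, eta, eta_max)
```

With a random Gaussian S, (1/n)SᵀS is positive definite almost surely, so Lℓ > 0 and the three step sizes are well defined and ordered ηmin < η < ηmax.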