Stochastic Optimization for Non-convex Inf-Projection Problems
Authors: Yan Yan, Yi Xu, Lijun Zhang, Xiaoyu Wang, Tianbao Yang
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments to verify the efficacy of the inf-projection formulation and proposed stochastic algorithms in comparison to the stochastic algorithms for solving minmax formulation (9). We perform two experiments on four datasets, i.e., a9a, RCV1, covtype and URL from the libsvm website, whose number of examples are n = 32561, 581012, 697641 and 2396130, respectively (Table 2). For each dataset, we randomly sample 80% as training data and the rest as testing data. |
| Researcher Affiliation | Collaboration | 1University of Iowa 2DAMO Academy, Alibaba Group 3Nanjing University 4The Chinese University of Hong Kong (Shenzhen). |
| Pseudocode | Yes | Algorithm 1 MSPG, Algorithm 2 St-SPG, Algorithm 3 SPG |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described, such as a specific repository link, an explicit code release statement, or code in supplementary materials. |
| Open Datasets | Yes | We perform two experiments on four datasets, i.e., a9a, RCV1, covtype and URL from the libsvm website, whose number of examples are n = 32561, 581012, 697641 and 2396130, respectively (Table 2). |
| Dataset Splits | No | The paper states, 'For each dataset, we randomly sample 80% as training data and the rest as testing data,' but does not provide specific details on a separate validation split. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment. |
| Experiment Setup | Yes | We tune hyper-parameters from a reasonable range, i.e., for St-SPG, λ ∈ {10^{-5:2}} and γ, µ ∈ {10^{-3:3}}. For BMD and BMD-eff, we tune the step size ηP ∈ {10^{-8:-1.5}} for updating P, the step size ηθ ∈ {10^{-5:3}} for updating θ, and ρ ∈ {n · 10^{-3:3}}, and fix δ = 10^{-5}. For MSPG, we tune λ ∈ {10^{-5:2}} and the step size parameter c in Proposition 1 from {10^{-5:2}}. Hyper-parameters of PGSMD and PGSMD-eff, including ηP, ηθ, ρ and δ, are selected from the same ranges as in the first experiment. The weak-convexity parameter ρ_wc is chosen from {10^{-5:5}}. |
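The experiment-setup row above describes an 80/20 random train/test split and log-scale hyper-parameter grids of the form {10^{a:b}} (powers of ten from 10^a to 10^b). A minimal sketch of both, assuming integer exponent steps and using illustrative helper names (`random_split`, `log_grid`) that do not appear in the paper:

```python
# Hedged sketch (not the authors' released code): reproduces the 80/20
# random split and the log-scale hyper-parameter grids from the table,
# assuming "{10^{a:b}}" means all integer powers of ten from 10^a to 10^b.
import itertools
import random

def random_split(n_examples, train_frac=0.8, seed=0):
    """Randomly sample train_frac of indices as training data; the rest is test."""
    idx = list(range(n_examples))
    random.Random(seed).shuffle(idx)
    cut = int(train_frac * n_examples)
    return idx[:cut], idx[cut:]

def log_grid(lo, hi):
    """Integer powers of ten 10^lo .. 10^hi, inclusive."""
    return [10.0 ** e for e in range(lo, hi + 1)]

# St-SPG grid from the table: lambda in {10^{-5:2}}, gamma and mu in {10^{-3:3}}.
st_spg_grid = list(itertools.product(log_grid(-5, 2),   # lambda: 8 values
                                     log_grid(-3, 3),   # gamma:  7 values
                                     log_grid(-3, 3)))  # mu:     7 values

# a9a has n = 32561 examples (Table 2 of the paper).
train_idx, test_idx = random_split(32561)
print(len(train_idx), len(test_idx), len(st_spg_grid))
```

Tuning would then loop over `st_spg_grid`, run St-SPG on the training indices, and evaluate on the held-out test indices; the paper reports no separate validation split.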