Stochastic Optimization with Arbitrary Recurrent Data Sampling
Authors: William Powell, Hanbaek Lyu
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally validate our results for the tasks of non-negative matrix factorization and logistic regression. We find that our method is robust to data heterogeneity as it produces stable iterate trajectories while still maintaining fast convergence (see Sec. 4.2). |
| Researcher Affiliation | Academia | 1Department of Mathematics, University of Wisconsin-Madison, WI, USA. |
| Pseudocode | Yes | Algorithm 1 Incremental Majorization Minimization with Dynamic Proximal Regularization... Algorithm 2 Incremental Majorization Minimization with Diminishing Radius |
| Open Source Code | No | The paper does not provide access to source code for the methodology it describes. |
| Open Datasets | Yes | We consider a randomly drawn collection of 5000 images from the MNIST (Deng, 2012) dataset |
| Dataset Splits | No | The paper describes how the dataset was structured for the experiments (e.g., divided into groups, batched into nodes) but does not provide explicit training, validation, and test dataset split percentages or counts needed for reproduction. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | We include here a list of hyperparameters used for the NMF experiments. For AdaGrad we used constant step-size parameter η = 0.5. For both RMISO-DPR and RMISO-CPR we set ρ = 2500 for the random walk and ρ = 50 for cyclic sampling. For the diminishing-radius version RMISO-DR we set r_n = 1/(n log(n+1)). ... The hyperparameters for the logistic regression experiments were chosen as follows. For MCSAG and RMISO/MISO we took L = 2/5. ... We ran SGD with a decaying step size of the form α_n = α/n^γ where α = 0.1 and γ = 0.5. For SGD-HB and AdaGrad we used step sizes α = 0.05 and SGD-HB momentum parameter β = 0.9. |
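The quoted step-size and radius schedules can be written out as a minimal sketch. This assumes the decaying SGD step size α_n = α/n^γ and reads the garbled diminishing-radius expression as r_n = 1/(n log(n+1)); both forms are interpretations of the quoted text, not code from the paper.

```python
import math

def sgd_step_size(n: int, alpha: float = 0.1, gamma: float = 0.5) -> float:
    """Decaying SGD step size alpha_n = alpha / n**gamma (quoted values alpha=0.1, gamma=0.5)."""
    return alpha / n ** gamma

def rmiso_dr_radius(n: int) -> float:
    """Diminishing radius for RMISO-DR, read as r_n = 1 / (n * log(n + 1)).

    The exact fractional form is an assumption recovered from the flattened
    expression "rn = 1 n log(n+1)" in the quoted setup description.
    """
    return 1.0 / (n * math.log(n + 1))

# Both schedules decay monotonically in the iteration index n.
for n in (1, 10, 100):
    print(n, sgd_step_size(n), rmiso_dr_radius(n))
```

At n = 100 the SGD step size is 0.1/√100 = 0.01, matching the stated α and γ.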