Efficient Continual Finite-Sum Minimization
Authors: Ioannis Mavrothalassitis, Stratis Skoulakis, Leello Tadesse Dadi, Volkan Cevher
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 6 EXPERIMENTS We experimentally evaluate the methods (SGD, SGD-sparse, Katyusha, SVRG and CSVRG) on a ridge regression task. Given some dataset $(a_i, b_i)_{i=1}^n \subseteq \mathbb{R}^d \times \mathbb{R}$, at each stage $i \in [n]$, we consider the finite-sum objective $g_i(x) := \sum_{j=1}^{i} (a_j^\top x - b_j)^2 / i + \lambda \lVert x \rVert_2^2$ with $\lambda = 10^{-3}$. We choose the latter setting so as to be able to compute the exact optimal solution at each stage $i \in [n]$. For our experiments we use the datasets found in the LIBSVM package Chang and Lin (2011), for which we report our results. (A minimal code sketch of this objective appears after the table.) |
| Researcher Affiliation | Academia | Ioannis Mavrothalassitis, Stratis Skoulakis, Leello Tadesse Dadi, Volkan Cevher; LIONS, École Polytechnique Fédérale de Lausanne; {ioannis.mavrothalassitis, efstratios.skoulakis, leello.dadi, volkan.cevher}@epfl.ch |
| Pseudocode | Yes | Algorithm 1 CSVRG |
| Open Source Code | Yes | Reproducibility Statement In the appendix we have included the formal proofs of all the theorems and lemmas provided in the main part of the paper. In order to facilitate the reproducibility of our theoretical results, for each theorem we have created a separate self-contained section presenting its proof. Concerning the experimental evaluations of our work, we provide the code used in our experiments as well as the selected parameters in each of the presented methods. |
| Open Datasets | Yes | For our experiments we use the datasets found in the LIBSVM package Chang and Lin (2011), for which we report our results. ... In this section we include additional experimental evaluations of CSVRG, SGD and SVRG for the continual finite-sum setting in the context of 2-Layer Neural Networks and the MNIST dataset. (A loading sketch for LIBSVM-format data follows the table.) |
| Dataset Splits | No | No explicit statement about train/validation/test splits, specific percentages, or cross-validation setup. The paper mentions processing data points in 'stages' where new data arrives over time. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory specifications) are mentioned for running the experiments. The paper focuses on algorithmic complexity and experimental results on datasets. |
| Software Dependencies | No | No specific software dependencies with version numbers are provided. While it mentions the use of 'LIBSVM package' and implies deep learning frameworks for neural networks, it does not specify versions (e.g., 'PyTorch 1.9'). |
| Experiment Setup | Yes | For our experiments we use the datasets found in the LIBSVM package Chang and Lin (2011), for which we report our results. At each stage $i \in [n]$, we reveal a new data point $(a_i, b_i)$. In all of our experiments we run CSVRG with $\alpha = 0.3$ and $T_i = 100$. The inner iterations of SGD and SGD-sparse are appropriately selected so that their overall FO calls match the FO calls of CSVRG. At each stage $i \in [n]$ we run Katyusha Allen-Zhu (2017) and SVRG Johnson and Zhang (2013) on the prefix-sum function $g_i(x)$ with 10 outer iterations and 100 inner iterations. In Appendix K.2 we present further experimental evaluations as well as the exact parameters used for each method. (A hedged SVRG sketch for this setup follows the table.) |
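The stage objective quoted in the Research Type row translates directly into code. The following is a minimal sketch, assuming dense NumPy arrays; the function name and variables are illustrative and not the authors' implementation.

```python
import numpy as np

def stage_objective(x, A, b, i, lam=1e-3):
    """g_i(x) = (1/i) * sum_{j<=i} (a_j^T x - b_j)^2 + lam * ||x||_2^2."""
    residuals = A[:i] @ x - b[:i]            # a_j^T x - b_j for j = 1, ..., i
    return residuals @ residuals / i + lam * (x @ x)
```

Note that only the prefix `A[:i]` enters the sum, so revealing the next data point $(a_{i+1}, b_{i+1})$ simply extends the prefix, which is what makes the setting continual.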
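The LIBSVM datasets referenced in the Open Datasets row are distributed in svmlight text format. One common way to load them in Python is via scikit-learn, sketched below; the file path is a placeholder, and the use of scikit-learn as the loader is an assumption, not something stated in the paper.

```python
from sklearn.datasets import load_svmlight_file

# Path is hypothetical; substitute any LIBSVM-format file (Chang and Lin, 2011).
X_sparse, y = load_svmlight_file("path/to/libsvm_dataset")
A = X_sparse.toarray()  # feature vectors a_1, ..., a_n as rows
b = y                   # regression targets b_1, ..., b_n
```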
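Per the Experiment Setup row, the SVRG baseline is run on the prefix-sum function $g_i$ with 10 outer and 100 inner iterations. Below is a hedged sketch of plain SVRG (Johnson and Zhang, 2013) under that schedule; the step size `eta` and the choice to fold the $\lambda \lVert x \rVert_2^2$ term into each component gradient are assumptions, since the paper defers exact parameters to Appendix K.2.

```python
import numpy as np

def svrg_prefix(A, b, i, lam=1e-3, outer=10, inner=100, eta=0.1, seed=0):
    """Plain SVRG on g_i(x) = (1/i) * sum_{j<=i} (a_j^T x - b_j)^2 + lam * ||x||^2."""
    rng = np.random.default_rng(seed)
    x = np.zeros(A.shape[1])

    def grad_j(z, j):
        # Gradient of the j-th component (a_j^T z - b_j)^2 + lam * ||z||^2.
        return 2.0 * (A[j] @ z - b[j]) * A[j] + 2.0 * lam * z

    for _ in range(outer):
        snap = x.copy()
        # Full gradient of g_i at the snapshot point.
        full = (2.0 / i) * A[:i].T @ (A[:i] @ snap - b[:i]) + 2.0 * lam * snap
        for _ in range(inner):
            j = rng.integers(i)  # uniform component from the current prefix
            x -= eta * (grad_j(x, j) - grad_j(snap, j) + full)
    return x
```

CSVRG itself (Algorithm 1 in the paper) is not reproduced here; this sketch only mirrors the SVRG baseline configuration quoted above.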