Adaptive Proximal Gradient Methods Are Universal Without Approximation

Authors: Konstantinos Oikonomidis, Emanuel Laude, Puya Latafat, Andreas Themelis, Panagiotis Patrinos

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Numerical experiments compare against baseline methods on diverse machine-learning tasks covering both the locally and the globally Hölder setting. In numerical simulations we show that adaPG^q, q ∈ [1, 2], performs well on a collection of locally and globally Hölder smooth problems, such as classification with Hölder-smooth SVMs and a p-norm version of Lasso.
Researcher Affiliation | Academia | ¹Department of Electrical Engineering (ESAT-STADIUS), KU Leuven, Kasteelpark Arenberg 10, 3001 Leuven, Belgium. ²Faculty of Information Science and Electrical Engineering (ISEE), Kyushu University, 744 Motooka, Nishi-ku 819-0395, Fukuoka, Japan. Correspondence to: Konstantinos Oikonomidis <konstantinos.oikonomidis@kuleuven.be>.
Pseudocode | Yes | Algorithm 1 (adaPG^q, q ∈ [1, 2]): a universal adaptive proximal gradient method.
Require: starting point x₋₁ ∈ ℝⁿ, stepsizes γ₀ ≥ γ₋₁ > 0, parameter q ∈ [1, 2].
Initialize: x₀ = prox_{γ₀g}(x₋₁ − γ₀∇f(x₋₁)).
Repeat for k = 0, 1, … until convergence:
1: With ℓ_k, L_k given in (6), define the stepsize (with notation [z]₊ = max{z, 0} and convention 1/0 = +∞) as
   γ_{k+1} = γ_k min{ √(1/q + γ_k/γ_{k−1}), 1/√(q²[γ_k²L_k² − (2 − q)γ_kℓ_k + 1 − q]₊) }
2: x_{k+1} = prox_{γ_{k+1}g}(x_k − γ_{k+1}∇f(x_k))
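A minimal NumPy sketch of one way to implement this iteration. It assumes the usual adaPG secant estimates for the quantities in (6), namely ℓ_k = ⟨∇f(x_k) − ∇f(x_{k−1}), x_k − x_{k−1}⟩/‖x_k − x_{k−1}‖² and L_k = ‖∇f(x_k) − ∇f(x_{k−1})‖/‖x_k − x_{k−1}‖; the stepsize rule follows the reconstruction in Algorithm 1, and the Lasso-style test problem and all function names are illustrative, not the authors' implementation:

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t*||.||_1 (illustrative choice of g)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def adapg(grad_f, prox_g, x_init, gamma0=1e-3, q=1.5, max_iter=500, tol=1e-10):
    """Sketch of Algorithm 1: adaptive proximal gradient with local
    curvature estimates; stepsize rule as reconstructed from the paper."""
    x_prev = x_init
    g_prev = grad_f(x_prev)
    gamma_old, gamma = gamma0, gamma0            # take gamma_{-1} = gamma_0 for simplicity
    x = prox_g(x_prev - gamma * g_prev, gamma)   # x_0
    for _ in range(max_iter):
        g = grad_f(x)
        dx, dg = x - x_prev, g - g_prev
        nx = np.linalg.norm(dx)
        if nx < tol:
            break
        L = np.linalg.norm(dg) / nx              # local Lipschitz estimate L_k
        ell = float(dg @ dx) / nx**2             # local curvature estimate l_k
        inner = gamma**2 * L**2 - (2 - q) * gamma * ell + 1 - q
        t1 = np.sqrt(1.0 / q + gamma / gamma_old)
        # [.]_+ with the convention 1/0 = +inf: the second bound is vacuous
        # whenever the bracket is nonpositive.
        t2 = 1.0 / np.sqrt(q**2 * inner) if inner > 0 else np.inf
        gamma_old, gamma = gamma, gamma * min(t1, t2)
        x_prev, g_prev = x, g
        x = prox_g(x - gamma * g, gamma)         # forward-backward step
    return x
```

On a standard Lasso instance (f(x) = ½‖Ax − b‖², g = λ‖x‖₁) the stepsize needs no Lipschitz constant up front: it grows geometrically while the local estimates permit and is capped automatically once curvature is detected.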
Open Source Code | Yes | The implementation for reproducing the experiments is publicly available: https://github.com/EmanuelLaude/universal-adaptive-proximal-gradient
Open Datasets | Yes | In this subsection we assess the performance of the different algorithms on the task of classification using a p-norm version of the SVM model. For that purpose we consider a subset of the LibSVM binary classification benchmark that consists of a collection of datasets with varying numbers m of examples a_i, feature dimensions n, and sparsity.
Dataset Splits | No | The paper mentions using datasets like the LibSVM benchmark and generating random instances, but does not explicitly provide details about training, validation, and test splits (e.g., percentages, sample counts, or specific split files) for reproducibility.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments.
Software Dependencies | No | The paper mentions comparing with NUPG, F-NUPG, and AC-FGM methods and providing an implementation of their method, but it does not specify software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x).
Experiment Setup | Yes | Regarding the positive sequences (γ_k)_k, (β_k)_k, (τ_k)_k, we use the update rule described in (Li & Lan, 2023, Cor. 3) and, as such, in Corollary 2 of the same paper. We choose β₁ = 0 and β_k = β = 1 − √3/2 for all k ≥ 2, ensure that γ₁ ∈ [β/(4(1 − β)c₁), 1/(3c₁)], and set γ₂ = β/(2c₁) and γ_{k+1} = min{((τ_{k−1} + 1)/(τ_k + 1))γ_k, β(τ_{k−1} + 1)/(4c_k)} for all k ≥ 2. Finally, τ₁ = 0, τ₂ = 2, and τ_{k+1} = τ_k + α/2 + 2(1 − α)γ_k c_k/(βτ_k) for k ≥ 2 and some α ∈ [0, 1]. We chose α = 0, since this configuration consistently outperformed the others in our experiments.
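A short Python sketch of this AC-FGM parameter schedule. The constant β = 1 − √3/2, the γ-update, and the τ-recursion are hedged readings of the paper's (partially garbled) description of (Li & Lan, 2023, Cor. 3), and c_k stands for the local curvature estimates of that method, supplied here as placeholder inputs:

```python
import math

def acfgm_schedule(c, alpha=0.0, n=10):
    """Parameter sequences (gamma_k), (tau_k) for the AC-FGM baseline.
    c is a list of placeholder curvature estimates c_1, c_2, ...
    (accessed 1-indexed via c[k-1]); the gamma-update is a hedged
    reconstruction, not a verified transcription of Li & Lan's rule."""
    beta = 1.0 - math.sqrt(3.0) / 2.0     # assumed reading: beta = 1 - sqrt(3)/2
    gamma = {1: 1.0 / (3.0 * c[0]),       # any gamma_1 in [beta/(4(1-beta)c_1), 1/(3c_1)]
             2: beta / (2.0 * c[0])}
    tau = {1: 0.0, 2: 2.0}
    for k in range(2, n):
        # tau_{k+1} = tau_k + alpha/2 + 2(1-alpha) gamma_k c_k / (beta tau_k)
        tau[k + 1] = tau[k] + alpha / 2.0 \
            + 2.0 * (1.0 - alpha) * gamma[k] * c[k - 1] / (beta * tau[k])
        # gamma_{k+1} = min{ ((tau_{k-1}+1)/(tau_k+1)) gamma_k,
        #                    beta (tau_{k-1}+1) / (4 c_k) }
        gamma[k + 1] = min((tau[k - 1] + 1.0) / (tau[k] + 1.0) * gamma[k],
                           beta * (tau[k - 1] + 1.0) / (4.0 * c[k - 1]))
    return gamma, tau
```

With α = 0 (the configuration the authors report as best), the α-terms drop out and the τ-sequence grows purely from the accumulated stepsize-times-curvature products.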