Spectral Preconditioning for Gradient Methods on Graded Non-convex Functions

Authors: Nikita Doikov, Sebastian U. Stich, Martin Jaggi

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our theory is validated by numerical experiments executed on multiple practical machine learning problems." Section 8 (Experiments): "We present illustrative numerical experiments on several machine learning problems. See Section A in the appendix for the details of our experiments and for extra plots."
Researcher Affiliation | Academia | Machine Learning and Optimization Laboratory (MLO), EPFL, Lausanne, Switzerland; CISPA Helmholtz Center for Information Security, Saarbrücken, Germany
Pseudocode | Yes | Algorithm 1: Adaptive Gradient Method with Spectral Preconditioning (an illustrative sketch of one such step follows the table)
Open Source Code | No | The paper does not contain any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | "In the following experiments, we train a convex logistic regression model on several machine learning datasets, using the gradient method with spectral preconditioning. We also compare its performance with quasi-Newton methods: BFGS and the limited memory BFGS (L-BFGS) (Nocedal & Wright, 2006). The results are shown in Fig. 6..." Datasets: https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/ (a baseline-comparison sketch follows the table)
Dataset Splits | No | The paper names the datasets used but does not specify training/validation/test splits (percentages, counts, or predefined splits) for its experiments.
Hardware Specification | No | The paper does not describe the specific hardware (e.g., GPU/CPU models, memory) used to run the experiments.
Software Dependencies | No | The paper does not list specific software dependencies or versions (e.g., Python 3.x, PyTorch 1.x) used for the experiments.
Experiment Setup | Yes | "For the spectral preconditioning, we fix the regularization parameter, according to our theory, at iteration $k \geq 0$: $\alpha_k = \sqrt{L \, f(X_k, Y_k)} + \beta_k$, where we fix $L := 1$ and $\beta_k$ is fitted using a simple adaptive search... Namely, we start with an initial value of $\beta_0 := 0.05$." (a sketch of one possible adaptive search follows the table)
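
Since the report only quotes the name of Algorithm 1, here is a minimal sketch of what one spectrally preconditioned gradient step could look like. It assumes (an assumption, not the paper's verified pseudocode) that the preconditioner is built from the absolute value of the Hessian's spectrum plus the regularization $\alpha_k I$ quoted in the Experiment Setup row; the names `spectral_precond_step`, `grad_fn`, and `hess_fn` are illustrative.

```python
import numpy as np

def spectral_precond_step(x, grad_fn, hess_fn, alpha):
    """One illustrative spectrally preconditioned gradient step.

    Assumption (not taken from the paper's Algorithm 1): the
    preconditioner is |H| + alpha*I, where |H| replaces each Hessian
    eigenvalue by its absolute value, so the system to solve is
    positive definite even at non-convex points.
    """
    g = grad_fn(x)
    H = hess_fn(x)
    eigvals, eigvecs = np.linalg.eigh(H)                     # H = V diag(lam) V^T
    abs_H = eigvecs @ np.diag(np.abs(eigvals)) @ eigvecs.T   # |H|
    step = np.linalg.solve(abs_H + alpha * np.eye(len(x)), g)
    return x - step
```

Taking the absolute value of the spectrum is what makes this a "spectral" preconditioner in the sketch: it keeps the Hessian's curvature scale while discarding the sign, so the step direction remains a descent direction on non-convex regions.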
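The datasets come from the linked LIBSVM collection, and the paper compares against BFGS/L-BFGS. A baseline comparison of the kind described could be set up as below, assuming scikit-learn's LIBSVM-format loader and SciPy's L-BFGS implementation; the file name `a9a.txt` is a placeholder, since the report does not list the exact datasets used.

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.datasets import load_svmlight_file

# Load a LIBSVM-format file downloaded from the collection linked above.
# The file name is illustrative, not taken from the paper.
X, y = load_svmlight_file("a9a.txt")
X = X.toarray()
y = np.where(y > 0, 1.0, 0.0)  # map labels {-1, +1} -> {0, 1}

def logistic_loss(w):
    z = X @ w
    # Numerically stable logistic loss: log(1 + e^z) - y*z, averaged.
    return np.sum(np.logaddexp(0.0, z) - y * z) / len(y)

def logistic_grad(w):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))  # sigmoid probabilities
    return X.T @ (p - y) / len(y)

w0 = np.zeros(X.shape[1])
res = minimize(logistic_loss, w0, jac=logistic_grad, method="L-BFGS-B")
print(res.fun, res.nit)
```

This reproduces only the quasi-Newton baseline side of the comparison; the spectrally preconditioned method itself would plug into the same loss and gradient.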
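The quoted setup says only that $\beta_k$ "is fitted using a simple adaptive search" starting from $\beta_0 := 0.05$. One common pattern consistent with that description, though a guess rather than the paper's documented rule, is to grow $\beta$ until a candidate step decreases the objective and shrink it afterwards:

```python
def adaptive_beta_step(x, beta, objective, try_step,
                       inc=2.0, dec=0.5, max_tries=50):
    """Illustrative 'simple adaptive search' for the offset beta_k.

    Assumption: grow beta until the candidate step decreases the
    objective, then shrink it for the next iteration. This
    doubling/halving rule is a common pattern, not the paper's
    documented procedure.
    """
    f_x = objective(x)
    for _ in range(max_tries):
        x_new = try_step(x, beta)        # e.g. one preconditioned step with
                                         # alpha = sqrt(L * f(x)) + beta, L = 1
        if objective(x_new) < f_x:       # accept on decrease
            return x_new, max(beta * dec, 1e-12)
        beta *= inc                      # step too aggressive: regularize more
    return x, beta                       # give up; keep the current iterate
```

Per the quote, such a search would start from `beta = 0.05` and carry the adapted value across iterations.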