reproducibilityindex.ai

SEGA: Variance Reduction via Gradient Sketching

Authors: Filip Hanzely, Konstantin Mishchenko, Peter Richtarik

NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this section we perform numerical experiments to illustrate the potential of SEGA. Firstly, in Section 5.1, we compare it to projected gradient descent (PGD) algorithm. Then in Section 5.2, we study the performance of zeroth-order SEGA (when sketched gradients are being estimated through function value evaluations) and compare it to the analogous zeroth-order method. Lastly, in Section 5.3 we verify the claim from Remark 3.6 that in some applications, particular sketches and metric might lead to a signiﬁcantly faster convergence. In the experiments where theory-supported stepsizes were used, we obtained them by precomputing strong convexity and smoothness measures.
Researcher Affiliation	Academia	Filip Hanzely1 Konstantin Mishchenko1 Peter Richt arik1,2,3 1 King Abdullah University of Science and Technology, 2University of Edinburgh, 3Moscow Institute of Physics and Technology
Pseudocode	Yes	Algorithm 1: SEGA: Sk Etched Gr Adient Method; Algorithm 2: ASEGA: Accelerated SEGA
Open Source Code	No	The paper does not contain any explicit statements or links indicating that the source code for the methodology described is publicly available.
Open Datasets	No	For f, we choose 4 different quadratics, see Table 2 (appendix). We stress that these are synthetic problems generated for the purpose of illustrating the potential of our method against a natural baseline. For illustration, we choose f to be a quadratic problem based on Table 2 and compare both Gaussian and coordinate sketches. The paper uses synthetic problems or refers to a table for problem parameters, but does not provide access information or citations for any publicly available or open datasets.
Dataset Splits	No	The paper describes synthetic problems and experimental setups but does not provide specific dataset split information (percentages, sample counts, or methodology) needed for reproduction.
Hardware Specification	No	The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies	No	The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment.
Experiment Setup	Yes	Stepsizes 1/λmax(M) and 1/(nλmax(M)) were used for PGD and SEGA, respectively. Theory supported stepsizes were chosen for both methods. Stepsize chosen according to theory.