A Simple Practical Accelerated Method for Finite Sums

Authors: Aaron Defazio

NeurIPS 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We tested our algorithm, which we call Point-SAGA, against SAGA [Defazio et al., 2014a], SDCA [Shalev-Shwartz and Zhang, 2013a], Pegasos/SGD [Shalev-Shwartz et al., 2011], and the catalyst acceleration scheme [Lin et al., 2015]. We selected a set of commonly used datasets from the LIBSVM repository [Chang and Lin, 2011]. Figure 1 shows the function value sub-optimality for each dataset-subset combination.
Researcher Affiliation | Industry | Aaron Defazio, Ambiata, Sydney, Australia
Pseudocode | Yes | Algorithm 1: Pick some starting point x^0 and step size γ. Initialize each g_i^0 = f_i'(x^0), where f_i'(x^0) is any gradient/subgradient at x^0. Then at step k+1: 1. Pick index j from 1 to n uniformly at random. 2. Update x: z_j^k = x^k + γ(g_j^k - (1/n) Σ_i g_i^k), then x^{k+1} = prox_j^γ(z_j^k). (A Python sketch of this update appears after the table.)
Open Source Code | Yes | Open source code to exactly replicate the experimental results is available at https://github.com/adefazio/point-saga.
Open Datasets | Yes | We selected a set of commonly used datasets from the LIBSVM repository [Chang and Lin, 2011]. (A loading sketch appears after the table.)
Dataset Splits | No | The paper mentions using LIBSVM datasets and subsampling, but it does not specify explicit training, validation, or test splits (e.g., percentages or sample counts), nor does it refer to predefined standard splits for reproducibility.
Hardware Specification | No | The paper does not provide hardware details such as GPU/CPU models, processor types, or memory amounts used for running its experiments.
Software Dependencies | No | The paper mentions a 'Cython implementation' but does not specify version numbers for Cython or any other key software dependencies (e.g., Python or NumPy versions) needed for replication.
Experiment Setup | Yes | The step-size parameter for each method (κ for catalyst-SDCA) was chosen using a grid search over powers of 2. The step size that gives the lowest error at the final epoch is used for each method. The L2 regularization constant for each problem was set by hand to ensure f was not in the big data regime n > L/µ; as noted above, all the methods perform essentially the same when n > L/µ. The constant used is noted beneath each plot. (A sketch of this step-size tuning protocol appears after the table.)
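
The update in the Pseudocode row can be illustrated concretely. Below is a minimal NumPy sketch of the Point-SAGA iteration, specialized to L2-regularized least squares (ridge regression) because the proximal operator prox_j^γ then has a closed form via the Sherman-Morrison identity. The names point_saga and prox_ridge and the ridge-regression setting are assumptions made for this illustration; the paper's released code at https://github.com/adefazio/point-saga is a separate Cython implementation covering the losses used in the experiments.

```python
import numpy as np

def prox_ridge(z, a, b, mu, gamma):
    # Closed-form prox of f_j(x) = 0.5*(a @ x - b)**2 + 0.5*mu*||x||^2,
    # i.e. argmin_x 0.5*||x - z||^2 + gamma*f_j(x), via Sherman-Morrison.
    c = z + gamma * b * a
    denom = 1.0 + gamma * mu
    scale = gamma * (a @ c) / (denom + gamma * (a @ a))
    return (c - scale * a) / denom

def point_saga(A, b, mu, gamma, epochs=20, seed=0):
    # Sketch of the quoted Algorithm 1 for ridge regression on data (A, b).
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    # Gradient table: g[i] holds a gradient of f_i at the starting point x^0.
    g = A * (A @ x - b)[:, None] + mu * x
    g_avg = g.mean(axis=0)
    for _ in range(epochs * n):
        j = rng.integers(n)
        z = x + gamma * (g[j] - g_avg)                 # z_j^k
        x_new = prox_ridge(z, A[j], b[j], mu, gamma)   # x^{k+1} = prox_j^gamma(z_j^k)
        g_new = (z - x_new) / gamma                    # refreshed table entry for f_j
        g_avg += (g_new - g[j]) / n                    # keep the running average in sync
        g[j] = g_new
        x = x_new
    return x
```

The gradient-table refresh g_new = (z - x_new) / gamma follows from the optimality condition of the prox step, so no extra gradient evaluation is needed; that bookkeeping step is not visible in the truncated pseudocode quote above.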
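For the Open Datasets row, LIBSVM-format files can be read with scikit-learn's load_svmlight_file. This loader and the file name below are assumptions made for illustration, not necessarily what the paper's code uses.

```python
from sklearn.datasets import load_svmlight_file

# File path is a placeholder; datasets are downloadable from the LIBSVM
# repository cited above [Chang and Lin, 2011].
X, y = load_svmlight_file("dataset.libsvm")  # sparse CSR feature matrix, label vector
```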
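The step-size tuning protocol in the Experiment Setup row (try powers of 2, keep the setting with the lowest final-epoch error) can be sketched as follows; run_method and the range of powers are illustrative assumptions.

```python
import numpy as np

def tune_step_size(run_method, powers=range(-12, 4)):
    # run_method(gamma) is assumed to train one method with step size gamma
    # and return its function-value error at the final epoch.
    best_gamma, best_err = None, np.inf
    for p in powers:
        gamma = 2.0 ** p
        err = run_method(gamma)
        if err < best_err:
            best_gamma, best_err = gamma, err
    return best_gamma, best_err
```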