Finito: A faster, permutable incremental gradient method for big data problems
Authors: Aaron Defazio, Justin Domke, Tiberio Caetano
ICML 2014 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We give empirical results showing state of the art performance. In this section we compare Finito, SAG, SDCA and LBFGS. |
| Researcher Affiliation | Academia | NICTA and Australian National University |
| Pseudocode | Yes | 2.2. The Finito algorithm. The step for iteration k is as follows: 1. Update w using the step: $w^{(k)} = \bar{\phi}^{(k)} - \frac{1}{\alpha s n} \sum_i f_i'(\phi_i^{(k)})$. 2. Pick an index j uniformly at random, or using without-replacement sampling as discussed in Section 3. 3. Set $\phi_j^{(k+1)} = w^{(k)}$ in the table and leave the other variables the same ($\phi_i^{(k+1)} = \phi_i^{(k)}$ for $i \neq j$). 4. Calculate and store $f_j'(\phi_j^{(k+1)})$ in the table. (A hedged code sketch of these steps appears below the table.) |
| Open Source Code | No | The paper does not provide any specific links or statements about the availability of source code for the described methodology. |
| Open Datasets | Yes | For classification, we tested on the ijcnn1 and covtype datasets [1], as well as MNIST [2], classifying 0-4 against 5-9. For regression, we choose the two datasets from the UCI repository: the million song year regression dataset, and the slice-localization dataset. [1] http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html [2] http://yann.lecun.com/exdb/mnist/ |
| Dataset Splits | No | The paper mentions 'The training portion of the datasets are of size...' for each dataset, but does not provide specific details on how the data was split into training, validation, and test sets, or the percentages/counts for each split. |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies or version numbers used in the experiments. |
| Experiment Setup | Yes | For Finito, we find that using α = 2 is the fastest rate when the big data condition holds for any β > 1. This is the step suggested by our theory when β = 2. Interestingly, reducing α to 1 does not improve the convergence rate; we see no further improvement in our experiments. For Finito, during the first pass, since we do not have derivatives for each φi yet, we simply sum over the k terms seen so far, $w^{(k)} = \frac{1}{k}\sum_i \phi_i^{(k)} - \frac{1}{\alpha s k} \sum_i f_i'(\phi_i^{(k)})$, where we process data points in index order for the first pass only. (A hedged sketch of this first-pass initialisation also appears below the table.) |
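
The four steps quoted in the Pseudocode row map directly onto a short loop. Below is a minimal NumPy sketch of that update, not the authors' implementation: the names `finito_sketch` and `grad_fi` are illustrative assumptions, `s` is simply the constant appearing in the quoted step size 1/(αsn), and the table is initialised here by evaluating every gradient at the starting point rather than via the paper's first-pass scheme (sketched separately below).

```python
import numpy as np

def finito_sketch(grad_fi, w0, n, s, alpha=2.0, iters=10_000, seed=0):
    """Minimal sketch of the quoted Finito steps 1-4 (not the authors' code).

    grad_fi(i, w): assumed callback returning f_i'(w) for the i-th term.
    s: the constant in the quoted step size 1/(alpha * s * n).
    """
    rng = np.random.default_rng(seed)
    w0 = np.asarray(w0, dtype=float)
    phi = np.tile(w0, (n, 1))                              # table of phi_i
    grads = np.stack([grad_fi(i, w0) for i in range(n)])   # stored f_i'(phi_i)

    for _ in range(iters):
        # Step 1: w = mean(phi_i) - (1 / (alpha * s * n)) * sum_i f_i'(phi_i)
        w = phi.mean(axis=0) - grads.sum(axis=0) / (alpha * s * n)
        # Step 2: pick an index j uniformly at random
        j = int(rng.integers(n))
        # Step 3: set phi_j = w, leaving all other phi_i unchanged
        phi[j] = w
        # Step 4: recompute and store f_j'(phi_j)
        grads[j] = grad_fi(j, w)
    return w
```

A practical implementation would maintain running sums of the φ_i and of the stored gradients so that step 1 costs O(d) per iteration instead of the O(nd) recomputation used above for clarity.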
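
The Experiment Setup row describes how the tables are bootstrapped: on the first pass the data points are visited in index order and the step sums only over the k terms whose derivatives have been stored so far. The snippet below is one hedged reading of that description; the exact ordering of the store and step operations within the pass, and the function name `finito_first_pass`, are assumptions.

```python
import numpy as np

def finito_first_pass(grad_fi, w0, n, s, alpha=2.0):
    """One reading of the quoted first-pass initialisation: visit points in
    index order, summing only over the k terms whose derivatives are stored."""
    w = np.asarray(w0, dtype=float)
    phi = np.zeros((n, w.shape[0]))
    grads = np.zeros_like(phi)
    for k in range(n):
        phi[k] = w                   # the newly visited point takes the current iterate
        grads[k] = grad_fi(k, w)     # store its derivative f_k'(phi_k)
        # partial-sum step over the k+1 terms seen so far
        w = phi[:k + 1].mean(axis=0) - grads[:k + 1].sum(axis=0) / (alpha * s * (k + 1))
    return w, phi, grads             # tables ready for the regular Finito updates
```

After this pass every φ_i and stored gradient is populated, and the regular steps 1-4 (with α = 2, the value the authors report as fastest when the big data condition holds) can take over.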