Variance-Reduced and Projection-Free Stochastic Optimization

Authors: Elad Hazan, Haipeng Luo

ICML 2016

Reproducibility Variable | Result | LLM Response

Research Type | Experimental | "The theoretical improvement is also observed in experiments on real-world datasets for a multiclass classification application." "To support our theoretical results, we also conducted experiments on three large real-world datasets for a multiclass classification application."

Researcher Affiliation | Academia | "Elad Hazan (EHAZAN@CS.PRINCETON.EDU), Princeton University, Princeton, NJ 08540, USA; Haipeng Luo (HAIPENGL@CS.PRINCETON.EDU), Princeton University, Princeton, NJ 08540, USA"

Pseudocode | Yes | "Algorithm 1: Stochastic Variance-Reduced Frank-Wolfe (SVRF)"; "Algorithm 2: STOchastic variance-Reduced Conditional gradient sliding (STORC)"

Open Source Code | No | The paper does not provide concrete access to source code for the methodology described.

Open Datasets | Yes | "Three datasets are selected from the LIBSVM repository with a relatively large number of features, categories, and examples, summarized in Table 3." (LIBSVM repository: https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/)

Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or a detailed splitting methodology).

Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, or memory amounts) for the machines used to run its experiments.

Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiments.

Experiment Setup | Yes | "For most of the parameters in these algorithms, we roughly follow what the theory suggests. For example, the size of the mini-batch of stochastic gradients at round k is set to k², k³, and k respectively for SFW, SCGS, and SVRF, and is fixed to 100 for the other three. The number of iterations between taking two snapshots for the variance-reduced methods (SVRG, SVRF, and STORC) is fixed to 50. The learning rate is set to the typical decaying sequence c/k for SGD and a constant c′ for SVRG, as the original work suggests, for some best-tuned c and c′."
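The experiment-setup row describes SVRF's parameter schedule: a mini-batch of stochastic gradients of size k at round k, and a full-gradient snapshot refreshed every 50 iterations. A minimal sketch of a stochastic variance-reduced Frank-Wolfe loop under those two settings follows; the least-squares objective, the probability-simplex constraint, the 2/(k+2) step size, and all names here are illustrative assumptions, not the authors' exact configuration.

```python
import numpy as np

def svrf(A, b, T=150, snapshot_every=50, seed=0):
    """Sketch of stochastic variance-reduced Frank-Wolfe (SVRF-style).

    Minimizes (1/n) * ||A x - b||^2 over the probability simplex.
    The mini-batch size k and the 50-iteration snapshot interval follow
    the experiment setup quoted above; everything else is illustrative.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.full(d, 1.0 / d)                    # start at the simplex center
    snap_x = x.copy()
    snap_grad = A.T @ (A @ snap_x - b) / n     # full gradient at the snapshot

    def minibatch_grad(point, idx):
        # Stochastic gradient of the least-squares loss on a mini-batch.
        Ai = A[idx]
        return Ai.T @ (Ai @ point - b[idx]) / len(idx)

    for k in range(1, T + 1):
        if k % snapshot_every == 0:            # refresh the snapshot
            snap_x = x.copy()
            snap_grad = A.T @ (A @ snap_x - b) / n
        idx = rng.integers(0, n, size=min(k, n))   # mini-batch of size k
        # Variance-reduced gradient estimate: g(x) - g(snapshot) + full grad.
        g = minibatch_grad(x, idx) - minibatch_grad(snap_x, idx) + snap_grad
        # Linear minimization oracle on the simplex: a single vertex.
        v = np.zeros(d)
        v[np.argmin(g)] = 1.0
        gamma = 2.0 / (k + 2)                  # standard Frank-Wolfe step size
        x = (1 - gamma) * x + gamma * v        # convex combination stays feasible
    return x
```

Because each update is a convex combination of feasible points, the iterate never needs a projection, which is the point of the projection-free (Frank-Wolfe) family; the variance-reduced estimate is what lets the mini-batch size grow only linearly in k here, versus k² for plain stochastic Frank-Wolfe in the quoted setup.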