SEBOOST - Boosting Stochastic Learning Using Subspace Optimization Techniques

Authors: Elad Richardson, Rom Herskovitz, Boris Ginsburg, Michael Zibulevsky

NeurIPS 2016

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | The method was evaluated on several deep learning tasks, demonstrating significant improvements in performance.
Researcher Affiliation | Collaboration | Technion, Israel Institute of Technology; Nvidia Inc.
Pseudocode | Yes | Algorithm 1 (The SEBOOST algorithm) and Algorithm 2 (Controlling anchors in SEBOOST).
Open Source Code | Yes | "Our algorithm was implemented and evaluated using the Torch7 framework [1], and is publicly available" at https://github.com/eladrich/seboost
Open Datasets | Yes | The MNIST dataset was used, with 60,000 training images of size 28 × 28 and 10,000 test images. For classification, the standard CIFAR-10 benchmark was used. A further dataset was divided into 18,000 training examples and 2,000 test examples.
Dataset Splits | No | The paper specifies training and test splits for the datasets but does not explicitly mention a separate validation set or describe a validation procedure (e.g., cross-validation).
Hardware Specification | No | The paper does not specify the hardware used for the experiments (e.g., CPU or GPU models); it only refers to "actual processor time" in general terms.
Software Dependencies | No | The paper mentions using "the Torch7 framework [1]" but does not specify a version number for Torch7 or for any other software libraries or dependencies.
Experiment Setup | Yes | The main hyper-parameters that were altered during the experiments were: lr_method, the learning rate of the baseline method; M, the maximal number of old directions; and ℓ, the number of baseline steps between each subspace optimization. For all experiments the weight decay was set to 0.0001 and the momentum was fixed at 0.9 for SGD and NAG. Unless stated otherwise, the number of function evaluations for CG was set to 20. The baseline method used a mini-batch of size 100, while the subspace optimization was applied with a mini-batch of size 1000. (A hedged code sketch of this setup follows the table.)
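To make the reported setup concrete, below is a minimal NumPy/SciPy sketch of a SEBOOST-style outer loop on a toy least-squares problem: ℓ baseline SGD-with-momentum steps between subspace optimizations, at most M stored directions, a larger mini-batch and CG for the subspace step. The toy objective, the variable names (ell, subspace_batch, anchor), and the use of SciPy's CG optimizer are assumptions made for this illustration; it is not the authors' Torch7 implementation (see https://github.com/eladrich/seboost for that).

```python
# Illustrative SEBOOST-style loop on a toy least-squares problem.
# Hyper-parameters follow the paper's reported setup; the rest is an assumption.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
A, b = rng.normal(size=(1000, 50)), rng.normal(size=1000)  # toy regression data
x = np.zeros(50)                                           # model parameters

def loss_grad(x, idx):
    """Mini-batch loss 0.5*mean(r^2) and its gradient."""
    r = A[idx] @ x - b[idx]
    return 0.5 * np.mean(r ** 2), A[idx].T @ r / len(idx)

# Hyper-parameters as reported in the experiment setup.
lr, momentum, weight_decay = 0.01, 0.9, 1e-4
ell, M = 100, 10            # baseline steps per subspace step, max old directions
batch, subspace_batch = 100, 1000
cg_evals = 20

velocity = np.zeros_like(x)
directions = []             # history of previous outer directions
anchor = x.copy()           # point where the last subspace step finished

for step in range(1, 2001):
    # Baseline SGD-with-momentum step on a small mini-batch.
    idx = rng.choice(len(b), size=batch, replace=False)
    _, g = loss_grad(x, idx)
    velocity = momentum * velocity - lr * (g + weight_decay * x)
    x = x + velocity

    if step % ell == 0:
        # New direction: movement accumulated since the last anchor.
        directions.append(x - anchor)
        directions = directions[-M:]            # keep at most M directions
        P = np.stack(directions, axis=1)        # subspace basis, shape (n, k)

        # Subspace optimization over coefficients alpha, on a larger mini-batch.
        big_idx = rng.choice(len(b), size=subspace_batch, replace=False)
        def sub_obj(alpha):
            xs = x + P @ alpha
            f, g_full = loss_grad(xs, big_idx)
            return f, P.T @ g_full              # value and subspace gradient

        res = minimize(sub_obj, np.zeros(P.shape[1]), jac=True,
                       method='CG', options={'maxiter': cg_evals})
        x = x + P @ res.x                       # apply the subspace step
        anchor = x.copy()
```

The actual method additionally manages anchor directions (Algorithm 2 in the paper) and supports other baselines such as NAG; those details are omitted from this sketch.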