Learning Deep ResNet Blocks Sequentially using Boosting Theory

Authors: Furong Huang, Jordan Ash, John Langford, Robert Schapire

ICML 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimentally, we compare BoostResNet with e2e BP over two types of feed-forward ResNets, multilayer perceptron residual network (MLP-ResNet) and convolutional neural network residual network (CNN-ResNet), on multiple datasets. BoostResNet shows substantial computational performance improvements and accuracy improvement under the MLP-ResNet architecture. Under CNN-ResNet, faster convergence for BoostResNet is observed.
Researcher Affiliation | Collaboration | Furong Huang (1), Jordan T. Ash (2), John Langford (3), Robert E. Schapire (3); (1) Department of Computer Science, University of Maryland; (2) Department of Computer Science, Princeton University; (3) Microsoft Research.
Pseudocode | Yes | Algorithm 1 BoostResNet: telescoping sum boosting for binary-class classification; Algorithm 2 BoostResNet: oracle implementation for training a ResNet block
Open Source Code | No | The paper states that experiments were programmed in the 'Torch deep learning framework for Lua' but does not provide a link or an explicit statement about making the source code for the proposed method publicly available.
Open Datasets | Yes | We compare our proposed BoostResNet algorithm with e2e BP training a ResNet on the MNIST (LeCun et al., 1998), street view house numbers (SVHN) (Netzer et al., 2011), and CIFAR-10 (Krizhevsky & Hinton, 2009) benchmark datasets.
Dataset Splits | Yes | Hyperparameters are selected via random search for highest accuracy on a validation set.
Hardware Specification | Yes | Our experiments are programmed in the Torch deep learning framework for Lua and executed on NVIDIA Tesla P100 GPUs.
Software Dependencies | No | The paper mentions using the 'Torch deep learning framework for Lua' and that 'All models are trained using the Adam variant of SGD', but it does not specify version numbers for these software components or libraries.
Experiment Setup | Yes | To list the hyperparameters we use in our BoostResNet training after searching over candidate hyperparameters, we optimize the learning rate to be 0.004 with a 9 × 10^-5 learning rate decay. The gamma threshold is optimized to be 0.001 and the initial gamma value on SVHN is 0.75. On the CIFAR-10 dataset, the main advantage of BoostResNet over e2e BP is the speed of training. BoostResNet refined with e2e BP obtains comparable results with e2e BP. This is because we are using a suboptimal architecture of ResNet which overfits the CIFAR-10 dataset. AdaBoost, on the other hand, is known to be resistant to overfitting. In BoostResNet training, we optimize the learning rate to be 0.014 with a 3.46 × 10^-5 learning rate decay. The gamma threshold is optimized to be 0.007 and the initial gamma value on CIFAR-10 is 0.93.
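
The Pseudocode row refers to Algorithm 1, which builds the network one residual block at a time, and the Experiment Setup row quotes a "gamma threshold" hyperparameter tied to each block's weak-learning edge. The sketch below is only a schematic of sequential block training with a gamma-based stopping rule, one plausible reading of how that threshold is used; it is not a reimplementation of the paper's Algorithm 1, whose telescoping-sum weighting is omitted, and make_block, train_block, and estimate_gamma are hypothetical placeholders.

```python
def boost_resnet_schematic(make_block, train_block, estimate_gamma,
                           gamma_threshold, max_blocks):
    """Grow a ResNet block by block, stopping when the estimated edge is too small.

    Schematic only: the paper's Algorithm 1 additionally maintains example
    weights and a telescoping-sum hypothesis, which are not modeled here.
    """
    blocks = []
    for t in range(max_blocks):
        block = make_block(t)                  # new residual block (placeholder)
        train_block(block, blocks)             # fit it on top of earlier blocks
        gamma = estimate_gamma(block, blocks)  # estimated weak-learning edge
        if gamma < gamma_threshold:            # edge too small: stop adding blocks
            break
        blocks.append(block)
    return blocks
```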
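The Dataset Splits row notes that hyperparameters are selected via random search for the highest accuracy on a validation set. A minimal, generic sketch of such a search follows; the search ranges and the train_and_evaluate callback are hypothetical, since the paper does not list its candidate ranges.

```python
import math
import random

# Hypothetical search space; the paper does not report its candidate ranges.
SEARCH_SPACE = {
    "learning_rate":   (1e-4, 1e-1),  # sampled log-uniformly
    "lr_decay":        (1e-6, 1e-3),  # sampled log-uniformly
    "gamma_threshold": (1e-4, 1e-1),  # sampled log-uniformly
}

def sample_config():
    """Draw one random configuration from the (assumed) search space."""
    return {
        name: math.exp(random.uniform(math.log(lo), math.log(hi)))
        for name, (lo, hi) in SEARCH_SPACE.items()
    }

def random_search(train_and_evaluate, n_trials=50):
    """Keep the configuration with the highest validation accuracy.

    train_and_evaluate(cfg) is a placeholder that trains a model with cfg
    and returns its accuracy on a held-out validation split.
    """
    best_cfg, best_acc = None, float("-inf")
    for _ in range(n_trials):
        cfg = sample_config()
        acc = train_and_evaluate(cfg)
        if acc > best_acc:
            best_cfg, best_acc = cfg, acc
    return best_cfg, best_acc
```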
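The Experiment Setup row quotes concrete learning rates, decays, and gamma values for SVHN and CIFAR-10, and the Software Dependencies row notes that all models are trained with the Adam variant of SGD. A minimal sketch of wiring those reported values into an Adam optimizer is shown below; it uses PyTorch purely for illustration (the original experiments ran in Torch for Lua), model is a placeholder, and the 1/(1 + decay * step) schedule mirrors Torch's learningRateDecay convention as an assumption, not the paper's stated rule.

```python
import torch

# Hyperparameters quoted in the paper for BoostResNet training.
HPARAMS = {
    "svhn":    {"lr": 0.004, "lr_decay": 9e-5,    "gamma_threshold": 0.001, "gamma_init": 0.75},
    "cifar10": {"lr": 0.014, "lr_decay": 3.46e-5, "gamma_threshold": 0.007, "gamma_init": 0.93},
}

def make_optimizer(model: torch.nn.Module, dataset: str):
    """Adam with the reported learning rate and a 1/(1 + decay * step) schedule.

    The decay rule is assumed; the paper only reports the decay value itself.
    """
    hp = HPARAMS[dataset]
    optimizer = torch.optim.Adam(model.parameters(), lr=hp["lr"])
    scheduler = torch.optim.lr_scheduler.LambdaLR(
        optimizer, lr_lambda=lambda step: 1.0 / (1.0 + hp["lr_decay"] * step)
    )
    return optimizer, scheduler
```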