Prune and Tune Ensembles: Low-Cost Ensemble Learning with Sparse Independent Subnetworks

Authors: Tim Whitaker, Darrell Whitley (pp. 8638-8646)

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We benchmark our approach against state of the art low-cost ensemble methods and display marked improvement in both accuracy and uncertainty estimation on CIFAR-10 and CIFAR-100.
Researcher Affiliation | Academia | Tim Whitaker, Darrell Whitley; Department of Computer Science, Colorado State University, Fort Collins, CO 80525; timothy.whitaker@colostate.edu, whitley@cs.colostate.edu
Pseudocode | No | The paper contains mathematical formulas (e.g., for CD(A, B) and ηt) but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper states 'Datasets and models are open source and linked in the appendix.' but does not provide a specific link or explicit statement about the availability of the authors' own source code for the described methodology.
Open Datasets | Yes | Datasets: We use the computer vision datasets, CIFAR-10 and CIFAR-100 (Krizhevsky 2012).
Dataset Splits | No | The paper explicitly states the split into '50,000 training images and 10,000 testing images' for CIFAR, but does not provide details of a validation split.
Hardware Specification | Yes | Hardware: All models are trained on a single Nvidia GTX-1080-Ti GPU.
Software Dependencies | No | The paper mentions software components such as the ADAM optimizer, but does not specify version numbers for any software dependencies required to replicate the experiments.
Experiment Setup | Yes | We use Stochastic Gradient Descent (SGD) with Nesterov momentum for our evaluation of random/anti-random pruning and constant/cyclic tuning schedules. We use an initial learning rate of η1 = 0.1 for 50% of the training budget, which decays linearly to η2 = 0.001 at 90% of the training budget. The learning rate is kept constant at η2 = 0.001 for the final 10% of training. Children are tuned with either a constant learning rate of η = 0.01 for 5 epochs or with a one-cycle schedule that ramps up from η1 = 0.001 to η2 = 0.1 at 10% of tuning, then decays to η3 = 1e-7 at the end of 5 epochs. For all other experiments, we use ADAM with a learning rate of η = 0.001.
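To make the quoted training recipe concrete, the sketch below implements the two learning-rate schedules as plain Python functions and shows one way to wire the parent schedule into SGD with Nesterov momentum via PyTorch's LambdaLR. It is a minimal sketch based only on the quoted text: the function names, the momentum value of 0.9, the placeholder model, and the linear shape of the one-cycle ramp-down are illustrative assumptions, since the authors' own training code is not released.

```python
import torch

def parent_lr(progress, eta1=0.1, eta2=0.001):
    """Parent-network schedule per the quote: eta1 for the first 50% of the
    training budget, linear decay to eta2 by 90%, then constant at eta2.
    `progress` is the fraction of the total training budget completed (0..1)."""
    if progress <= 0.5:
        return eta1
    if progress <= 0.9:
        frac = (progress - 0.5) / 0.4          # 0 at 50% of budget, 1 at 90%
        return eta1 + frac * (eta2 - eta1)     # linear interpolation
    return eta2

def child_one_cycle_lr(progress, eta1=0.001, eta2=0.1, eta3=1e-7):
    """One-cycle tuning schedule for a child network over 5 epochs: ramp from
    eta1 up to eta2 at 10% of tuning, then decay to eta3 by the end.
    (Linear decay is an assumption; the quote does not specify the shape.)"""
    if progress <= 0.1:
        return eta1 + (progress / 0.1) * (eta2 - eta1)
    return eta2 + ((progress - 0.1) / 0.9) * (eta3 - eta2)

if __name__ == "__main__":
    model = torch.nn.Linear(10, 10)            # placeholder model for illustration
    total_steps = 1000                         # assumed training budget in steps
    # SGD with Nesterov momentum; momentum=0.9 is an assumed value.
    opt = torch.optim.SGD(model.parameters(), lr=1.0, momentum=0.9, nesterov=True)
    # base lr is 1.0 so the lambda's return value becomes the actual learning rate.
    sched = torch.optim.lr_scheduler.LambdaLR(
        opt, lr_lambda=lambda step: parent_lr(step / total_steps))
    for step in range(total_steps):
        # forward pass and loss.backward() would go here
        opt.step()
        sched.step()
```

Because the schedule is expressed as a function of the completed fraction of the budget, calling parent_lr(step / total_steps) reproduces the 50% and 90% breakpoints from the quote regardless of the total number of training steps; the same pattern applies to child_one_cycle_lr over the 5 tuning epochs.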