Prune and Tune Ensembles: Low-Cost Ensemble Learning with Sparse Independent Subnetworks
Authors: Tim Whitaker, Darrell Whitley (pp. 8638-8646)
AAAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We benchmark our approach against state of the art low-cost ensemble methods and display marked improvement in both accuracy and uncertainty estimation on CIFAR-10 and CIFAR-100. |
| Researcher Affiliation | Academia | Tim Whitaker, Darrell Whitley Department of Computer Science Colorado State University Fort Collins, CO 80525 timothy.whitaker@colostate.edu, whitley@cs.colostate.edu |
| Pseudocode | No | The paper contains mathematical formulas (e.g., for CD(A, B) and the learning-rate schedule η_t) but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper states 'Datasets and models are open source and linked in the appendix.' but does not provide a specific link to, or an explicit statement about the availability of, the authors' own source code for the described method. |
| Open Datasets | Yes | Datasets: We use the computer vision datasets, CIFAR10 and CIFAR-100 (Krizhevsky 2012). |
| Dataset Splits | No | The paper explicitly states the split into '50,000 training images and 10,000 testing images' for CIFAR, but does not provide specific details for a validation set split. |
| Hardware Specification | Yes | Hardware: All models are trained on a single Nvidia GTX-1080-Ti GPU. |
| Software Dependencies | No | The paper mentions software components such as the ADAM optimizer, but does not specify version numbers for any software dependencies required to replicate the experiments. |
| Experiment Setup | Yes | We use Stochastic Gradient Descent (SGD) with Nesterov momentum for our evaluation of random/anti-random pruning and constant/cyclic tuning schedule. We use an initial learning rate of η1 = 0.1 for 50% of the training budget which decays linearly to η2 = 0.001 at 90% of the training budget. The learning rate is kept constant at η2 = 0.001 for the final 10% of training. Children are tuned with either a constant learning rate of η = 0.01 for 5 epochs or with a one-cycle schedule that ramps up from η1 = 0.001 to η2 = 0.1 at 10% of tuning, then decaying to η3 = 1e-7 at the end of 5 epochs. For all other experiments, we use ADAM with a learning rate of η = 0.001. (The learning-rate schedules are sketched in code below.) |
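
To make the Experiment Setup row concrete, below is a minimal PyTorch sketch of the two schedules it quotes: the parent training schedule (SGD with Nesterov momentum, η held at 0.1 for the first 50% of the budget, decayed linearly to 0.001 by 90%, then held constant) and the one-cycle child tuning schedule (0.001 up to 0.1 at 10% of tuning, then down to roughly 1e-7 over 5 epochs). The model, momentum value (0.9), step counts, and `steps_per_epoch` are placeholder assumptions; this is a sketch based on the quoted description, not the authors' released code.

```python
# Hedged sketch of the learning-rate schedules quoted above (PyTorch).
# The model, step counts, and momentum value are placeholders.
import torch
from torch.optim.lr_scheduler import LambdaLR, OneCycleLR


def parent_lr_factor(step, total_steps, eta1=0.1, eta2=0.001):
    """LR multiplier (relative to eta1) for the parent schedule:
    hold eta1 for 50% of training, decay linearly to eta2 by 90%, then hold eta2."""
    t = step / total_steps
    if t < 0.5:
        return 1.0
    if t < 0.9:
        frac = (t - 0.5) / 0.4
        return (eta1 + frac * (eta2 - eta1)) / eta1
    return eta2 / eta1


model = torch.nn.Linear(10, 10)          # placeholder for the parent network
total_steps = 1000                       # placeholder training budget (in steps)
parent_opt = torch.optim.SGD(model.parameters(), lr=0.1,
                             momentum=0.9, nesterov=True)
parent_sched = LambdaLR(parent_opt,
                        lr_lambda=lambda s: parent_lr_factor(s, total_steps))

# One-cycle tuning schedule for a pruned child: starts at 0.001, peaks at 0.1
# after 10% of tuning, and decays to roughly 1e-7 by the end of 5 epochs.
child_opt = torch.optim.SGD(model.parameters(), lr=0.001,
                            momentum=0.9, nesterov=True)
child_sched = OneCycleLR(child_opt, max_lr=0.1, epochs=5, steps_per_epoch=100,
                         pct_start=0.1,
                         div_factor=100,        # initial lr = 0.1 / 100 = 0.001
                         final_div_factor=1e4)  # final lr  = 0.001 / 1e4 = 1e-7

for step in range(total_steps):
    # ... forward pass, loss.backward(), parent_opt.step() would go here ...
    parent_sched.step()
```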