Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Swapout: Learning an ensemble of deep architectures

Authors: Saurabh Singh, Derek Hoiem, David Forsyth

NeurIPS 2016 | Venue PDF | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We experiment extensively on the CIFAR-10 dataset and demonstrate that a model trained with swapout outperforms a comparable ResNet model. Further, a 32 layer wider model matches the performance of a 1001 layer ResNet on both CIFAR-10 and CIFAR-100 datasets. |
| Researcher Affiliation | Academia | Department of Computer Science University of Illinois, Urbana-Champaign EMAIL |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for the methodology described. |
| Open Datasets | Yes | We experiment extensively on the CIFAR-10 dataset and demonstrate that a model trained with swapout outperforms a comparable ResNet model. Further, a 32 layer wider model matches the performance of a 1001 layer ResNet on both CIFAR-10 and CIFAR-100 datasets. |
| Dataset Splits | No | The paper mentions training on CIFAR-10 and CIFAR-100 and shows error rates, but it does not explicitly detail the training/validation/test splits, only that "Standard augmentation of left-right flips and random translations of up to four pixels is used." and "All the images in a mini-batch use the same crop." |
| Hardware Specification | No | The paper mentions |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | Training: We train using SGD with a batch size of 128, momentum of 0.9 and weight decay of 0.0001. Unless otherwise specified, we train all the models for a total 256 epochs. Starting from an initial learning rate of 0.1, we drop it by a factor of 10 after 192 epochs and then again after 224 epochs. |
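
The quoted Experiment Setup can be written down as a small configuration plus a step learning-rate schedule. The sketch below is illustrative only and is not the authors' code: the names `TrainConfig` and `learning_rate` are hypothetical, and it assumes epochs are counted from zero, so that "after 192 epochs" means the first drop takes effect at epoch index 192.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters quoted in the Experiment Setup row (names are hypothetical)."""
    batch_size: int = 128
    momentum: float = 0.9
    weight_decay: float = 1e-4
    total_epochs: int = 256
    base_lr: float = 0.1
    lr_drop_epochs: tuple = (192, 224)  # assumed to be zero-based epoch indices
    lr_drop_factor: float = 10.0


def learning_rate(epoch: int, cfg: TrainConfig = TrainConfig()) -> float:
    """Return the step-schedule learning rate for a zero-based epoch index."""
    if not 0 <= epoch < cfg.total_epochs:
        raise ValueError(f"epoch must be in [0, {cfg.total_epochs}), got {epoch}")
    # Count how many drop boundaries have been reached by this epoch.
    drops = sum(1 for boundary in cfg.lr_drop_epochs if epoch >= boundary)
    return cfg.base_lr / (cfg.lr_drop_factor ** drops)


if __name__ == "__main__":
    # Print the rate at the start of training and on either side of each drop.
    for epoch in (0, 191, 192, 223, 224, 255):
        print(epoch, learning_rate(epoch))
```

Under these assumptions the schedule has three phases: 192 epochs at the initial rate, then 32 epochs at one tenth of it, then the final 32 epochs at one hundredth of it. The remaining fields (`batch_size`, `momentum`, `weight_decay`) would be passed to whichever SGD implementation is used; the page does not report the framework or its version.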