Residual Networks Behave Like Ensembles of Relatively Shallow Networks

Authors: Andreas Veit, Michael J. Wilber, Serge Belongie

Venue: NeurIPS 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "All experiments are performed at test time on CIFAR-10 [12]. Experiments on ImageNet [2] show comparable results. We train residual networks with the standard training strategy, dataset augmentation, and learning rate policy [6]."
Researcher Affiliation | Academia | "Andreas Veit, Michael Wilber, Serge Belongie, Department of Computer Science & Cornell Tech, Cornell University, {av443, mjw285, sjb344}@cornell.edu"
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any concrete access information (a link, an explicit statement of release, or a mention in supplementary materials) for source code implementing the described methodology.
Open Datasets | Yes | "All experiments are performed at test time on CIFAR-10 [12]. Experiments on ImageNet [2] show comparable results."
Dataset Splits | No | The paper mentions using CIFAR-10 and ImageNet but does not explicitly specify train/validation/test splits (percentages, sample counts, or citations to predefined splits) beyond implicitly relying on the standard datasets.
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used to run its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiments.
Experiment Setup | No | The paper mentions using the "standard training strategy, dataset augmentation, and learning rate policy" and gives the depth of each network (e.g., a "110-layer (54-module) residual network"), but it does not provide concrete hyperparameter values or detailed training configurations such as specific learning rates, batch sizes, or optimizer settings.
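Neither the paper nor this report supplies reference code, so the sketch below is for orientation only. It shows, in PyTorch, the residual module y = x + f(x) that the paper analyzes, stacked 54 times to mirror the "110-layer (54-module) residual network" mentioned above. The channel width, kernel sizes, and constant-resolution layout are illustrative assumptions (the actual CIFAR-10 ResNet-110 changes width and resolution across three stages), not details taken from the paper.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """One residual module: y = relu(x + f(x)).

    Illustrative sketch only; widths and layer ordering are assumptions,
    not the authors' implementation.
    """
    def __init__(self, channels: int):
        super().__init__()
        self.f = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        # The identity skip lets every module be bypassed, which is the
        # structure the paper "unravels" into an ensemble of shallow paths.
        return self.relu(x + self.f(x))

# 54 modules stacked at constant width/resolution (simplified layout).
blocks = nn.Sequential(*[ResidualBlock(16) for _ in range(54)])
x = torch.randn(1, 16, 32, 32)  # a CIFAR-10-sized input
print(blocks(x).shape)          # torch.Size([1, 16, 32, 32])
```

Because each of the 54 modules can either contribute its residual branch or be skipped via the identity connection, the stack implicitly contains on the order of 2^54 forward paths, most of them much shallower than 110 layers, which is the ensemble view the paper's title refers to.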