Stochastic Multiple Choice Learning for Training Diverse Deep Ensembles
Authors: Stefan Lee, Senthil Purushwalkam Shiva Prakash, Michael Cogswell, Viresh Ranjan, David Crandall, Dhruv Batra
NeurIPS 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our approach achieves lower oracle error compared to existing methods on a wide range of tasks and deep architectures. We also show qualitatively that the diverse solutions produced often provide interpretable representations of task ambiguity. |
| Researcher Affiliation | Academia | Stefan Lee (Virginia Tech, steflee@vt.edu); Senthil Purushwalkam (Carnegie Mellon University, spurushw@andrew.cmu.edu); Michael Cogswell (Virginia Tech, cogswell@vt.edu); Viresh Ranjan (Virginia Tech, rviresh@vt.edu); David Crandall (Indiana University, djcran@indiana.edu); Dhruv Batra (Virginia Tech, dbatra@vt.edu) |
| Pseudocode | Yes | Figure 2: The MCL approach of [8] (Alg. 1) requires costly retraining while our sMCL method (Alg. 2) works within standard SGD solvers, training all ensemble members under a joint loss. (Algorithms 1 and 2 are present in the paper; see the loss sketch after the table.) |
| Open Source Code | No | The paper mentions utilizing publicly available implementations of other models (e.g., neuraltalk2, Caffe) and describes sMCL as a layer to introduce, but it does not explicitly state that the authors' implementation code for sMCL is open source or provide a link to it. |
| Open Datasets | Yes | We begin our experiments with sMCL on the CIFAR10 [17] dataset...We use the fully convolutional network (FCN) architecture presented by Long et al. [20]...We train on the Pascal VOC 2011 training set...We adopt the model and training procedure of Karpathy et al. [14], utilizing their publicly available implementation neuraltalk2. The model...We train and test on the MSCOCO dataset [18], using the same splits as [14]. |
| Dataset Splits | Yes | We train on the Pascal VOC 2011 training set augmented with extra segmentations provided in [10] and we test on a subset of the VOC 2011 validation set...We train and test on the MSCOCO dataset [18], using the same splits as [14]. |
| Hardware Specification | No | The acknowledgments mention 'NVIDIA GPU donation' and 'Computing resources used by this work are supported', but they do not specify exact GPU models, CPU details, or other specific hardware configurations used for the experiments. |
| Software Dependencies | No | The paper mentions using 'Caffe deep learning framework [13]' and the 'publicly available implementation neuraltalk2' but does not specify version numbers for these or any other software libraries or dependencies. |
| Experiment Setup | Yes | For these experiments, the reference model is trained using a batch size of 350 for 5,000 iterations with a momentum of 0.9, weight decay of 0.004, and an initial learning rate of 0.001 which drops to 0.0001 after 4000 iterations...We initialize our sMCL models from a standard ensemble trained for 50 epochs at a learning rate of 10^-3. The sMCL ensemble is then fine-tuned for another 15 epochs at a reduced learning rate of 10^-5...We train each ensemble for 70k iterations with the parameters of the CNN fixed. (See the solver sketch after the table for a concrete rendering of this schedule.) |
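
The joint loss referenced in the Pseudocode row (Alg. 2) is simple to express in a modern autodiff framework: each example contributes only the loss of its best-performing ensemble member, so gradients flow to that member alone and specialization emerges during ordinary SGD. Below is a minimal sketch of this winner-take-gradient "oracle" loss. The paper's implementation is a Caffe layer, so the use of PyTorch here, along with the function name `smcl_loss`, is our illustrative assumption.

```python
import torch
import torch.nn.functional as F

def smcl_loss(logits_per_member, targets):
    """Sketch of the sMCL oracle loss (assumed PyTorch rendering).

    logits_per_member: list of [batch, num_classes] tensors, one per
        ensemble member.
    targets: [batch] tensor of ground-truth class indices.
    """
    # Per-member, per-example cross-entropy losses: shape [k, batch].
    losses = torch.stack([
        F.cross_entropy(logits, targets, reduction="none")
        for logits in logits_per_member
    ])
    # Oracle assignment: each example is charged only to its lowest-loss
    # member, so the backward pass updates that member alone.
    oracle_loss, _ = losses.min(dim=0)
    return oracle_loss.mean()
```

Because the minimum is taken inside the loss, a single standard solver trains all members jointly, which is exactly the property the Pseudocode row contrasts with the costly retraining loop of MCL [8].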
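
For concreteness, the CIFAR10 reference-model schedule quoted in the Experiment Setup row maps onto a standard solver configuration roughly as follows. This is a hedged sketch: the paper trained with Caffe, and the PyTorch optimizer, scheduler, and the placeholder `model` below are our assumptions; only the numeric hyperparameters come from the paper.

```python
import torch

# Placeholder network standing in for the paper's CIFAR10 CNN.
model = torch.nn.Linear(32 * 32 * 3, 10)

# Quoted schedule: batch size 350 for 5,000 iterations, momentum 0.9,
# weight decay 0.004, lr 0.001 dropping to 0.0001 after iteration 4,000.
optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.001,
    momentum=0.9,
    weight_decay=0.004,
)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[4000], gamma=0.1  # 0.001 -> 0.0001
)

# Training-loop skeleton: step the scheduler once per iteration so the
# milestone is counted in iterations, matching the quoted schedule.
for iteration in range(5000):
    optimizer.step()   # (forward pass and loss.backward() would precede this)
    scheduler.step()
```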