Scalable Bayesian Optimization Using Deep Neural Networks

Authors: Jasper Snoek, Oren Rippel, Kevin Swersky, Ryan Kiros, Nadathur Satish, Narayanan Sundaram, Md. Mostofa Ali Patwary, Prabhat, Ryan P. Adams

ICML 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the effectiveness of DNGO on a number of difficult problems, including benchmark problems for Bayesian optimization, convolutional neural networks for object recognition, and multi-modal neural language models for image caption generation. We find hyperparameter settings that achieve results competitive with the state of the art: error rates of 6.37% and 27.4% on CIFAR-10 and CIFAR-100 respectively, and BLEU scores of 25.1 and 26.7 on the Microsoft COCO 2014 dataset using a single model and a 3-model ensemble. (A minimal sketch of the DNGO surrogate appears after the table.)
Researcher Affiliation | Collaboration | Jasper Snoek (JSNOEK@SEAS.HARVARD.EDU), Oren Rippel (RIPPEL@MATH.MIT.EDU), Kevin Swersky (KSWERSKY@CS.TORONTO.EDU), Ryan Kiros (RKIROS@CS.TORONTO.EDU), Nadathur Satish (NADATHUR.RAJAGOPALAN.SATISH@INTEL.COM), Narayanan Sundaram (NARAYANAN.SUNDARAM@INTEL.COM), Md. Mostofa Ali Patwary (MOSTOFA.ALI.PATWARY@INTEL.COM), Prabhat (PRABHAT@LBL.GOV), Ryan P. Adams (RPA@SEAS.HARVARD.EDU); Harvard University, School of Engineering and Applied Sciences; Massachusetts Institute of Technology, Department of Mathematics; University of Toronto, Department of Computer Science; Intel Labs, Parallel Computing Lab; NERSC, Lawrence Berkeley National Laboratory
Pseudocode | No | No explicit pseudocode or algorithm blocks were found in the paper.
Open Source Code | Yes | Available at https://github.com/orippel/micmat
Open Datasets | Yes | We optimize the hyperparameters of the log-bilinear model (LBL) from Kiros et al. (2014) to maximize the BLEU score of a validation set from the recently released COCO dataset (Lin et al., 2014). [...] We tune the hyperparameters of a deep convolutional neural network on the CIFAR-10 and CIFAR-100 datasets.
Dataset Splits | Yes | We optimize the hyperparameters of the log-bilinear model (LBL) from Kiros et al. (2014) to maximize the BLEU score of a validation set from the recently released COCO dataset (Lin et al., 2014). [...] We optimized these over a validation set of 10,000 examples drawn from the training set, running each network for 200 epochs. (See the split sketch after the table.)
Hardware Specification | Yes | We performed the optimization on a cluster of Intel® Xeon Phi™ coprocessors, with 40 jobs running in parallel using a kernel library that has been highly optimized for efficient computation on the Intel® Xeon Phi™ coprocessor. [...] The image caption generation computations in this paper were run on the Odyssey cluster supported by the FAS Division of Science, Research Computing Group at Harvard University.
Software Dependencies | No | The paper mentions a 'kernel library' but does not specify software dependencies with version numbers (e.g., Python version, specific deep learning frameworks and their versions).
Experiment Setup | Yes | We optimize learning parameters such as learning rate, momentum and batch size; regularization parameters like dropout and weight decay for word and image representations; and architectural parameters such as the context size, whether to use the additive or multiplicative version, the size of the word embeddings and the multi-modal representation size. [...] For this architecture, we tuned the momentum, learning rate, ℓ2 weight decay coefficients, dropout rates, standard deviations of the random i.i.d. Gaussian weight initializations, and corruption bounds for various data augmentations: global perturbations of hue, saturation and value, random scalings, input pixel dropout and random horizontal reflections. (See the search-space sketch after the table.)
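The DNGO surrogate described in the paper replaces the Gaussian process with a neural network whose last hidden layer provides basis functions for Bayesian linear regression on the output. The following is a minimal NumPy sketch of that final-layer regression only; the function names, the fixed precisions alpha and beta, and the random features standing in for a trained network are illustrative assumptions, not the authors' implementation (which marginalizes such hyperparameters).

```python
import numpy as np

def blr_posterior(Phi, y, alpha=1.0, beta=100.0):
    """Posterior over output-layer weights for fixed basis features Phi (N x D).

    alpha: prior precision on the weights; beta: observation-noise precision.
    Both are illustrative constants here.
    """
    D = Phi.shape[1]
    S = np.linalg.inv(alpha * np.eye(D) + beta * Phi.T @ Phi)  # posterior covariance
    m = beta * S @ Phi.T @ y                                   # posterior mean
    return m, S

def blr_predict(phi_star, m, S, beta=100.0):
    """Predictive mean and variance at a new basis vector phi_star (D,)."""
    mean = phi_star @ m
    var = 1.0 / beta + phi_star @ S @ phi_star
    return mean, var

# Toy usage with random "learned" features standing in for the network's last layer.
rng = np.random.default_rng(0)
Phi = rng.normal(size=(50, 10))   # 50 evaluated configurations, 10 basis functions
y = rng.normal(size=50)           # their (standardized) validation losses
m, S = blr_posterior(Phi, y)
mu, var = blr_predict(rng.normal(size=10), m, S)
```

The predictive mean and variance returned here are what an acquisition function such as expected improvement would consume when proposing the next configuration.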
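The Dataset Splits row quotes a validation set of 10,000 examples drawn from the training set. A split along those lines could be reproduced as below; the index-shuffling mechanics, the 50,000-example training size (as in CIFAR), and the seed are assumptions, not the authors' code.

```python
import numpy as np

def held_out_split(n_train=50000, n_val=10000, seed=0):
    """Draw a validation set of n_val examples from n_train training indices.

    Mirrors the split described in the paper; seed and mechanics are assumptions.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_train)
    return idx[n_val:], idx[:n_val]   # (training indices, validation indices)

train_idx, val_idx = held_out_split()
```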
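The Experiment Setup row lists the hyperparameters tuned for the CIFAR convolutional networks; a search space over them might be declared as below. The parameter names and bounds are illustrative assumptions only, since the quoted text does not give the exact ranges the authors optimized over.

```python
# Hypothetical bounds for the CIFAR convolutional-network search; names and
# ranges are illustrative assumptions, not the paper's exact specification.
cnn_search_space = {
    "learning_rate":       (1e-4, 1e-1),   # typically searched on a log scale
    "momentum":            (0.5, 0.99),
    "weight_decay_l2":     (1e-6, 1e-2),
    "dropout_rate":        (0.0, 0.7),
    "init_weight_std":     (1e-3, 1e-1),   # std of i.i.d. Gaussian initialization
    "hue_perturbation":    (0.0, 0.5),     # data-augmentation corruption bounds
    "saturation_scale":    (0.0, 0.5),
    "value_scale":         (0.0, 0.5),
    "random_scaling":      (0.0, 0.3),
    "input_pixel_dropout": (0.0, 0.3),
}
```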