Input Warping for Bayesian Optimization of Non-Stationary Functions
Authors: Jasper Snoek, Kevin Swersky, Richard Zemel, Ryan P. Adams
ICML 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The paper reports an empirical study showing that modeling non-stationarity is extremely important and yields significant empirical improvements in the performance of Bayesian optimization (a hedged sketch of the input-warping idea is given below the table). |
| Researcher Affiliation | Academia | Jasper Snoek (jsnoek@seas.harvard.edu), School of Engineering and Applied Sciences, Harvard University; Kevin Swersky (kswersky@cs.toronto.edu), Department of Computer Science, University of Toronto; Richard Zemel (zemel@cs.toronto.edu), Department of Computer Science, University of Toronto and Canadian Institute for Advanced Research; Ryan P. Adams (rpa@seas.harvard.edu), School of Engineering and Applied Sciences, Harvard University |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found. |
| Open Source Code | No | The paper mentions using the third-party package 'Deepnet' but does not give access to the source code for the methodology it describes. |
| Open Datasets | Yes | The experiments are run "on a subset of the popular CIFAR-10 data set (Krizhevsky, 2009)", a publicly available benchmark. |
| Dataset Splits | No | The paper mentions datasets and training examples but does not provide specific details on training, validation, and test splits (e.g., percentages, exact counts, or specific split files). |
| Hardware Specification | No | No specific hardware details (such as exact GPU/CPU models or memory) used for running the experiments are provided. |
| Software Dependencies | No | The paper mentions software packages like 'Deepnet' and 'Spearmint' but does not provide specific version numbers for these or other ancillary software components. |
| Experiment Setup | Yes | The deep network consists of three convolutional layers and two fully connected layers and we optimize over two learning rates, one for each layer type, six dropout regularization rates, six weight norm constraints, the number of hidden units per layer, a convolutional kernel size and a pooling size for a total of 21 hyperparameters. A hypothetical sketch of such a search space is given below the table. |
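
For context on the method named in the title, the following is a minimal sketch of the input-warping idea: each input dimension (scaled to the unit hypercube) is passed through a Beta CDF, and an ordinary stationary GP kernel is then applied to the warped inputs. The warping parameters and kernel choice here are illustrative assumptions, not the authors' released code (none is available, per the table above); the paper itself places priors on the warping parameters and integrates them out.

```python
import numpy as np
from scipy.stats import beta
from scipy.spatial.distance import cdist

def warp_inputs(X, alphas, betas):
    """Warp each input dimension through a Beta CDF.

    X is assumed to lie in the unit hypercube; alphas/betas are per-dimension
    warping parameters (illustrative fixed values here, rather than being
    inferred as in the paper).
    """
    return np.column_stack([
        beta.cdf(X[:, d], alphas[d], betas[d]) for d in range(X.shape[1])
    ])

def stationary_se_kernel(X1, X2, lengthscale=0.2, amplitude=1.0):
    """Standard squared-exponential kernel, applied to (warped) inputs."""
    sqdist = cdist(X1, X2, metric="sqeuclidean")
    return amplitude * np.exp(-0.5 * sqdist / lengthscale**2)

# Example: warp 1-D observations, then build the GP covariance on warped inputs.
rng = np.random.default_rng(0)
X = rng.uniform(size=(10, 1))                    # observations in [0, 1]
Xw = warp_inputs(X, alphas=[2.0], betas=[5.0])   # illustrative warping parameters
K = stationary_se_kernel(Xw, Xw)                 # stationary kernel on warped inputs
print(K.shape)  # (10, 10)
```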
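The Experiment Setup row describes a 21-dimensional hyperparameter space (two learning rates, six dropout rates, six weight-norm constraints, hidden-unit counts for the five layers, a convolutional kernel size, and a pooling size). The snippet below is a hypothetical declaration of such a space as a plain Python data structure; the parameter names, ranges, and scales are assumptions for illustration, since the paper does not publish a configuration file.

```python
# Hypothetical declaration of the 21-dimensional search space described above.
# Names, ranges, and scales are illustrative assumptions, not the paper's values.
search_space = {
    # two learning rates, one per layer type
    "lr_conv": {"type": "float", "min": 1e-5, "max": 1e-1, "scale": "log"},
    "lr_fc":   {"type": "float", "min": 1e-5, "max": 1e-1, "scale": "log"},
}

# six dropout rates and six weight-norm constraints (the paper lists six of each)
for i in range(6):
    search_space[f"dropout_{i}"]     = {"type": "float", "min": 0.0, "max": 0.8}
    search_space[f"weight_norm_{i}"] = {"type": "float", "min": 0.1, "max": 5.0}

# hidden-unit counts for the three convolutional and two fully connected layers
for name in ["conv1", "conv2", "conv3", "fc1", "fc2"]:
    search_space[f"units_{name}"] = {"type": "int", "min": 32, "max": 1024}

# convolutional kernel size and pooling size
search_space["kernel_size"] = {"type": "int", "min": 2, "max": 7}
search_space["pool_size"]   = {"type": "int", "min": 2, "max": 5}

assert len(search_space) == 21  # 2 + 12 + 5 + 2
```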