Meta-learning Hyperparameter Performance Prediction with Neural Processes

Authors: Ying Wei, Peilin Zhao, Junzhou Huang

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on extensive OpenML datasets and three computer vision datasets demonstrate that the proposed algorithm achieves state-of-the-art performance with at least one order of magnitude fewer trials.
Researcher Affiliation | Collaboration | (1) Department of Computer Science, City University of Hong Kong, Hong Kong; (2) Tencent AI Lab, Shenzhen, China.
Pseudocode | Yes | Algorithm 1 Transferable Neural Processes (TNP) for Hyperparameter Optimization
Open Source Code | No | The paper does not contain any statement or link indicating that the source code for the methodology is openly available.
Open Datasets | Yes | First of all, we consider the OpenML (Vanschoren et al., 2014) platform which contains a large number of datasets covering a wide range of applications. Besides OpenML, we also investigate the effectiveness of TNP on three popular computer vision datasets, including CIFAR-10 (Krizhevsky & Hinton, 2009), MNIST (LeCun et al., 1995), and SVHN (Netzer et al., 2011).
Dataset Splits | Yes | The training, validation, and test sets of each dataset are exactly the same as OpenML provides. ... We take the last 10,000, 10,000, and 6,000 training instances as the validation set for CIFAR-10, MNIST, and SVHN, respectively. (A split sketch in this spirit follows the table.)
Hardware Specification | No | The paper mentions 'CPU overhead time' and 'GPU' in a general sense but does not provide specific details such as model numbers, processor types, or memory specifications for the hardware used in experiments.
Software Dependencies | No | The paper mentions using 'Adam' for optimization, but it does not specify version numbers for any software components, libraries, or programming languages used.
Experiment Setup | Yes | The encoder, the decoder, and the attention embedding function g are all implemented as a two-layer multilayer perceptron with r = 128 hidden units. ... We set the batch size, the number of gradient steps k, the learning rate α for Adam, and the meta update rate ϵ to be 64, 10, 1e-5, and 0.01, respectively. We summarize all hyperparameter settings of TNP in Appendix C.1. (A setup sketch follows the table.)
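
For concreteness, the following is a minimal sketch of the validation split quoted in the Dataset Splits row, assuming torchvision-style dataset loaders; it is not the authors' code, and the data root and transform are placeholders.

```python
# Hypothetical reconstruction of the quoted split rule: the last 10,000 (CIFAR-10),
# 10,000 (MNIST), and 6,000 (SVHN) training instances serve as the validation set.
from torch.utils.data import Subset
from torchvision import datasets, transforms

def split_train_val(dataset, n_val):
    """Hold out the last n_val training instances as the validation set."""
    n_train = len(dataset) - n_val
    return Subset(dataset, range(n_train)), Subset(dataset, range(n_train, len(dataset)))

to_tensor = transforms.ToTensor()  # placeholder transform
cifar = datasets.CIFAR10("data", train=True, download=True, transform=to_tensor)
mnist = datasets.MNIST("data", train=True, download=True, transform=to_tensor)
svhn = datasets.SVHN("data", split="train", download=True, transform=to_tensor)

cifar_train, cifar_val = split_train_val(cifar, 10_000)
mnist_train, mnist_val = split_train_val(mnist, 10_000)
svhn_train, svhn_val = split_train_val(svhn, 6_000)
```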
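
Likewise, a hedged sketch of the quoted experiment setup in PyTorch: the encoder, the decoder, and the attention embedding function g as two-layer MLPs with r = 128 hidden units, trained with Adam at learning rate 1e-5. The input/output dimensionalities, the ReLU nonlinearity, and the variable names are assumptions not specified in the excerpt above, not the authors' implementation.

```python
# Hypothetical setup sketch under the assumptions stated above.
import torch
import torch.nn as nn

R = 128  # hidden units r

def two_layer_mlp(in_dim, out_dim, hidden=R):
    # Two-layer MLP; the ReLU nonlinearity is an assumption.
    return nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                         nn.Linear(hidden, out_dim))

in_dim, out_dim = 16, 2                        # placeholder dimensionalities
encoder = two_layer_mlp(in_dim, R)             # context encoder
decoder = two_layer_mlp(R + in_dim, out_dim)   # predictive decoder
g = two_layer_mlp(in_dim, R)                   # attention embedding function g

params = list(encoder.parameters()) + list(decoder.parameters()) + list(g.parameters())
optimizer = torch.optim.Adam(params, lr=1e-5)  # learning rate alpha = 1e-5

BATCH_SIZE = 64   # batch size
K_STEPS = 10      # number of gradient steps k
META_EPS = 0.01   # meta update rate epsilon
```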