Learning to Rank Learning Curves

Authors: Martin Wistuba, Tejaswini Pedapati

ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We compare our ranking method with respect to a ranking measure against different methods on five different image classification and four tabular regression datasets. We also show that our method is capable of significantly accelerating neural architecture search (NAS) and hyperparameter optimization. Furthermore, we conduct several ablation studies to provide a better motivation of our model and its behavior.
Researcher Affiliation | Industry | IBM Research. Correspondence to: Martin Wistuba <martin.wistuba@ibm.com>.
Pseudocode | Yes | Algorithm 1: Early Termination Method
Open Source Code | No | The paper does not contain an explicit statement about making the source code for the described methodology publicly available, nor does it provide a link to a code repository.
Open Datasets | Yes | We compare our method to similar methods on five different datasets: CIFAR-10, CIFAR-100, Fashion-MNIST, Quickdraw, and SVHN. ... To create the meta-knowledge, we choose 200 architectures per dataset at random from the NASNet search space (Zoph et al., 2018)... For the experiments in Section 4.6 we rely on the tabular benchmark (Klein & Hutter, 2019).
Dataset Splits | Yes | We use the original train/test splits if available. Quickdraw has a total of 50 million data points and 345 classes. To reduce the training time, we select a subset of this dataset. We use 100 different randomly selected classes and choose 300 examples per class for the training split and 100 per class for the test split. 5,000 random data points of the training dataset serve as validation split for all datasets.
Hardware Specification | No | The paper mentions 'GPU hours' for computational effort (e.g., '20 GPU hours were searched', '36 GPU hours'), but does not specify any particular GPU model or other hardware specifications used for running the experiments.
Software Dependencies | No | The paper mentions software components and algorithms like Adam and CNNs, but does not specify version numbers for any programming languages, libraries, or frameworks used in the experiments (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | Each architecture is trained for 100 epochs with stochastic gradient descent and a cosine learning rate schedule without restart (Loshchilov & Hutter, 2017). ... All parameters of the layers in f are trained jointly by means of Adam (Kingma & Ba, 2015) by minimizing L = α·L_ce + (1 − α)·L_rec (Equation 4), a weighted linear combination of the ranking loss (Equation 3) and the reconstruction loss, with α = 0.8. ... For our experiment we set δ = 0.45, which means that if the predicted probability that the new model is better than the best one is below 45%, the run is terminated early.
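
The Quickdraw subsetting described in the "Dataset Splits" row is concrete enough to illustrate. Below is a minimal NumPy sketch: the counts (100 random classes, 300 train and 100 test examples per class, 5,000 held-out validation points) come from the paper, while the function names, in-memory array layout, and fixed random seed are assumptions.

```python
# Hypothetical sketch of the Quickdraw subsetting from the "Dataset Splits"
# row. Only the class/example counts come from the paper; names, seed, and
# array layout are assumed for illustration.
import numpy as np

rng = np.random.default_rng(0)  # seed is an assumption; the paper states none

def subsample_quickdraw(labels, n_classes=100,
                        n_train_per_class=300, n_test_per_class=100):
    """Pick 100 random classes, then 300 train / 100 test examples per class."""
    classes = rng.choice(np.unique(labels), size=n_classes, replace=False)
    train_idx, test_idx = [], []
    for c in classes:
        idx = rng.permutation(np.flatnonzero(labels == c))
        train_idx.extend(idx[:n_train_per_class])
        test_idx.extend(idx[n_train_per_class:n_train_per_class + n_test_per_class])
    return np.array(train_idx), np.array(test_idx)

def hold_out_validation(train_idx, n_val=5000):
    """5,000 random training points serve as the validation split."""
    idx = rng.permutation(train_idx)
    return idx[n_val:], idx[:n_val]  # (train, validation)
```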
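
The "Experiment Setup" row also pins down the training objective for the ranking model f: Adam minimizes L = α·L_ce + (1 − α)·L_rec with α = 0.8. The PyTorch sketch below shows that combination; the stand-in encoder, the pairwise surrogate used in place of the paper's ranking loss (Equation 3), and the MSE form of the reconstruction loss are all assumptions, since the paper's exact architecture and loss definitions are not reproduced in this summary.

```python
# Minimal sketch of Equation (4): L = α·L_ce + (1 − α)·L_rec, α = 0.8,
# minimized with Adam. The model and both loss forms are stand-ins.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CurveModel(nn.Module):
    """Placeholder for f; the paper's actual architecture is not shown here."""
    def __init__(self, curve_len=100, hidden=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(curve_len, hidden), nn.ReLU())
        self.score_head = nn.Linear(hidden, 1)       # ranking score
        self.decoder = nn.Linear(hidden, curve_len)  # curve reconstruction

    def forward(self, curves):
        h = self.encoder(curves)
        return self.score_head(h).squeeze(-1), self.decoder(h)

alpha = 0.8  # weighting from the paper
model = CurveModel()
optimizer = torch.optim.Adam(model.parameters())

def pairwise_ranking_loss(s_i, s_j):
    # Hypothetical pairwise surrogate; the paper's Equation (3) is not
    # reproduced in this summary.
    return F.binary_cross_entropy_with_logits(s_i - s_j, torch.ones_like(s_i))

def training_step(curves_i, curves_j):
    """curves_i should outrank curves_j in the ground-truth ordering."""
    s_i, rec_i = model(curves_i)
    s_j, rec_j = model(curves_j)
    l_rank = pairwise_ranking_loss(s_i, s_j)
    l_rec = F.mse_loss(rec_i, curves_i) + F.mse_loss(rec_j, curves_j)
    loss = alpha * l_rank + (1 - alpha) * l_rec  # Equation (4)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Only the weighting α = 0.8 and the choice of Adam are taken from the paper; swapping in the paper's actual Equation (3) would change only pairwise_ranking_loss.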
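
Finally, the "Pseudocode" row (Algorithm 1: Early Termination Method) together with the δ = 0.45 threshold from the "Experiment Setup" row implies a simple control flow, sketched here. train_one_epoch and predict_prob_better are hypothetical callables standing in for the candidate's training loop and the ranking model's pairwise probability estimate; this is a hedged reading of the rule, not the paper's Algorithm 1 verbatim.

```python
# Sketch of the early-termination rule: stop a run as soon as the predicted
# probability that the new model beats the incumbent drops below δ = 0.45.
# Both callables are hypothetical stand-ins.
DELTA = 0.45      # threshold from the paper
MAX_EPOCHS = 100  # each architecture is trained for 100 epochs

def train_with_early_termination(train_one_epoch, predict_prob_better, best_curve):
    curve = []  # partial learning curve (e.g., validation accuracy per epoch)
    for _ in range(MAX_EPOCHS):
        curve.append(train_one_epoch())
        if predict_prob_better(curve, best_curve) < DELTA:
            return curve, True   # terminated early
    return curve, False          # trained to completion
```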