reproducibilityindex.ai

Active Testing: Sample-Efficient Model Evaluation

Authors: Jannik Kossen, Sebastian Farquhar, Yarin Gal, Tom Rainforth

ICML 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We demonstrate its effectiveness on models including Wide ResNets and Gaussian processes on datasets including Fashion-MNIST and CIFAR-100.
Researcher Affiliation	Academia	¹OATML, Department of Computer Science, ²Department of Statistics, Oxford. Correspondence to: Jannik Kossen <jannik.kossen@cs.ox.ac.uk>.
Pseudocode	Yes	Algorithm 1: Active Testing. Input: Model f trained on data D_train
Open Source Code	Yes	Full details as well as additional results are provided in the appendix, and we release code for reproducing the results at github.com/jlko/active-testing.
Open Datasets	Yes	We demonstrate its effectiveness on models including Wide ResNets and Gaussian processes on datasets including Fashion-MNIST and CIFAR-100. ... on the MNIST dataset (LeCun et al., 1998) ... Fashion-MNIST (Xiao et al., 2017) ... CIFAR-10 (Krizhevsky et al., 2009).
Dataset Splits	No	The paper refers to 'training data' and 'test data' throughout but does not specify a separate 'validation' split or its proportions. Although standard datasets often come with predefined splits, the paper does not explicitly state which validation split its experiments used.
Hardware Specification	No	The paper does not explicitly describe the hardware used to run its experiments (e.g., specific GPU or CPU models).
Software Dependencies	No	The paper mentions software such as PyTorch (Paszke et al., 2019) and scikit-learn (Pedregosa et al., 2011) but does not provide version numbers for these or any other software dependencies.
Experiment Setup	No	The paper describes the models, surrogates, and datasets used but does not provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs) or the detailed training configurations required for direct reproduction.