Active Surrogate Estimators: An Active Learning Approach to Label-Efficient Model Evaluation

Authors: Jannik Kossen, Sebastian Farquhar, Yarin Gal, Thomas Rainforth

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We next study the performance of ASEs for active testing in comparison to relevant baselines. Concretely, we compare to naive MC and the current state-of-the-art LURE-based active testing approach by Kossen et al. [38]." (Both baseline estimators are sketched after the table.)
Researcher Affiliation | Collaboration | Jannik Kossen (1), Sebastian Farquhar (1,3), Yarin Gal (1), Tom Rainforth (2); (1) OATML, Department of Computer Science, University of Oxford; (2) Department of Statistics, University of Oxford; (3) DeepMind
Pseudocode | Yes | "Algorithm 1: Adaptive Refinement of ASEs" (a hedged sketch of such a refinement loop follows the table)
Open Source Code | Yes | "We release code for ASEs in the supplement."
Open Datasets | Yes | "Concretely, we are given a fixed model trained on 2000 digits of MNIST [42]... In each case, we train the model to be evaluated, f, on a training set containing 40 000 points, and then use an evaluation pool of size N = 2000."
Dataset Splits | No | The paper specifies training and test/evaluation set sizes but does not explicitly detail a separate validation set size or a splitting methodology for validation data.
Hardware Specification | Yes | "All experiments were run on a local machine with 8 NVIDIA 2080 Ti GPUs, 2 Intel Xeon E5-2630 v4 CPUs, and 256 GB of RAM."
Software Dependencies | Yes | "Experiments were run using PyTorch 1.10.0 [52] and Python 3.9 [66]."
Experiment Setup | No | The main text states: "We give full details on the experiments in the appendix. In particular, B and C contain additional results and figures, and D gives further details on the computation of XWED and the baselines." Specific numerical hyperparameters are thus detailed in the appendix, not the main text.
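
For context on the baselines named in the Research Type row, the estimators below are our reconstruction from the active testing literature (Farquhar et al.; Kossen et al. [38]), not formulas quoted from this paper; in particular, the exact form of the LURE weights v_m should be treated as an assumption.

```latex
% Naive Monte Carlo: average loss of the fixed model f over M uniformly
% sampled labelled points i_1, ..., i_M from an evaluation pool of size N.
\hat{R}_{\mathrm{MC}} = \frac{1}{M} \sum_{m=1}^{M} L\bigl(f(x_{i_m}),\, y_{i_m}\bigr)

% LURE: points are sampled actively with proposal probabilities q(i_m);
% the weights v_m correct for the non-uniform sampling so that the
% estimator remains unbiased.
\hat{R}_{\mathrm{LURE}} = \frac{1}{M} \sum_{m=1}^{M} v_m\, L\bigl(f(x_{i_m}),\, y_{i_m}\bigr),
\qquad
v_m = 1 + \frac{N - M}{N - m} \left( \frac{1}{(N - m + 1)\, q(i_m)} - 1 \right)
```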
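
The Pseudocode row cites Algorithm 1 (Adaptive Refinement of ASEs) without reproducing it. The sketch below is a minimal, hypothetical rendering of such a loop, assuming the ASE takes the form of an expected loss under a surrogate's label distribution averaged over the pool; the disagreement-based acquisition is a stand-in for the paper's XWED criterion, and all names in the script are ours, not the authors'.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy setup: a fixed model f trained on separate data, and an unlabelled
# evaluation pool of N points whose true labels are hidden from the estimator.
N, d = 2000, 10
true_w = rng.normal(size=d)
X_train = rng.normal(size=(200, d))
y_train = (X_train @ true_w > 0).astype(int)
f = LogisticRegression(max_iter=1000).fit(X_train, y_train)  # model to evaluate

X_pool = rng.normal(size=(N, d))
y_pool = (X_pool @ true_w > 0).astype(int)  # oracle labels, revealed one at a time

def cross_entropy(p, y):
    """Per-point binary log loss of predicted probabilities p against labels y."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def ase_estimate(surrogate):
    """ASE sketch: expected loss of f under the surrogate's label
    distribution, averaged over the entire pool (no extra labels needed)."""
    p_f = f.predict_proba(X_pool)[:, 1]
    p_s = surrogate.predict_proba(X_pool)[:, 1]
    return np.mean(p_s * cross_entropy(p_f, 1) + (1 - p_s) * cross_entropy(p_f, 0))

# Seed set containing both classes so the surrogate can be fit at all.
labelled = [int(np.argmax(y_pool == 0)), int(np.argmax(y_pool == 1))]

for _ in range(100):  # label budget
    # 1. Refit the surrogate on all labels acquired so far.
    surrogate = LogisticRegression(max_iter=1000).fit(X_pool[labelled], y_pool[labelled])

    # 2. Acquire the next label. The paper's acquisition is XWED; this
    #    disagreement score between f and the surrogate is a stand-in.
    scores = np.abs(f.predict_proba(X_pool)[:, 1] - surrogate.predict_proba(X_pool)[:, 1])
    scores[labelled] = -np.inf  # never re-acquire an already-labelled point
    labelled.append(int(np.argmax(scores)))

surrogate = LogisticRegression(max_iter=1000).fit(X_pool[labelled], y_pool[labelled])
print(f"ASE estimate of f's pool risk: {ase_estimate(surrogate):.4f}")
p_f = f.predict_proba(X_pool)[:, 1]
print(f"True pool risk of f:           {np.mean(cross_entropy(p_f, y_pool)):.4f}")
```

Each iteration refits the surrogate on every label acquired so far and then spends one more label where the surrogate and the evaluated model disagree most; the ASE itself needs no further labels, since it averages the surrogate-expected loss over the whole pool.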