Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
BatchBALD: Efficient and Diverse Batch Acquisition for Deep Bayesian Active Learning
Authors: Andreas Kirsch, Joost van Amersfoort, Yarin Gal
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our experiments, we start by showing how a naive application of the BALD algorithm to an image dataset can lead to poor results in a dataset with many (near) duplicate data points, and show that BatchBALD solves this problem in a grounded way while obtaining favourable results (figure 2). |
| Researcher Affiliation | Academia | Andreas Kirsch, Joost van Amersfoort, Yarin Gal; OATML, Department of Computer Science, University of Oxford; EMAIL |
| Pseudocode | Yes | Algorithm 1: Greedy BatchBALD 1 - 1/e-approximate algorithm |
| Open Source Code | Yes | We provide an open-source implementation (footnote 2: https://github.com/BlackHC/BatchBALD). |
| Open Datasets | Yes | We then illustrate BatchBALD's effectiveness on standard AL datasets: MNIST and EMNIST. EMNIST [6] is an extension of MNIST that also includes letters, for a total of 47 classes, and has a twice as large training set. |
| Dataset Splits | Yes | As the labelled dataset is small in the beginning, it is important to avoid overfitting. We do this by using early stopping after 3 epochs of declining accuracy on the validation set. We pick the model with the highest validation accuracy. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments, such as GPU models or CPU specifications. |
| Software Dependencies | No | The paper mentions software like PyTorch and Adam optimizer but does not specify their version numbers, which is required for a reproducible description of software dependencies. |
| Experiment Setup | Yes | Throughout our experiments, we use the Adam [22] optimiser with learning rate 0.001 and betas 0.9/0.999. All our results report the median of 6 trials, with lower and upper quartiles. We use 100 MC dropout samples. We use 10 MC dropout samples. We use 50 MC dropout samples. |
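
The training and reporting rules quoted in the Dataset Splits and Experiment Setup rows can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `early_stopping_epoch` and `summarize_trials` are hypothetical, "3 epochs of declining accuracy" is read here as three consecutive epoch-to-epoch drops in validation accuracy, and the quartile method (inclusive interpolation) is a choice the source does not specify. Only the Adam settings, the patience of 3, and the 6-trial median with quartiles come from the table.

```python
from statistics import quantiles

# Optimiser settings quoted in the Experiment Setup row.
ADAM_LR = 0.001
ADAM_BETAS = (0.9, 0.999)


def early_stopping_epoch(val_accuracies, patience=3):
    """Return (best_epoch, best_accuracy) under early stopping.

    Assumption: training stops once validation accuracy has declined,
    relative to the previous epoch, for `patience` consecutive epochs.
    The model with the highest validation accuracy seen so far is kept.
    """
    if not val_accuracies:
        raise ValueError("need at least one validation accuracy")
    best_epoch, best_acc = 0, val_accuracies[0]
    declines = 0
    for epoch in range(1, len(val_accuracies)):
        acc = val_accuracies[epoch]
        if acc > best_acc:
            best_epoch, best_acc = epoch, acc
        if acc < val_accuracies[epoch - 1]:
            declines += 1
            if declines >= patience:
                break  # stop training; later epochs are never run
        else:
            declines = 0
    return best_epoch, best_acc


def summarize_trials(values):
    """Return (lower quartile, median, upper quartile) across trials.

    Assumption: quartiles use inclusive linear interpolation.
    """
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    return q1, med, q3


# Hypothetical validation curve: three declines in a row end training.
best = early_stopping_epoch([0.80, 0.85, 0.84, 0.83, 0.82, 0.99])

# Hypothetical final accuracies from 6 trials.
summary = summarize_trials([0.90, 0.91, 0.92, 0.93, 0.94, 0.95])
```

With PyTorch, which the table notes the paper mentions without a version number, the quoted optimiser settings would typically be passed as `torch.optim.Adam(model.parameters(), lr=ADAM_LR, betas=ADAM_BETAS)`; the sketch keeps them as plain constants so that it runs with the standard library alone.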