Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026].
Deep Active Learning with Adaptive Acquisition
Authors: Manuel Haussmann, Fred Hamprecht, Melih Kandemir
IJCAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our method on three benchmark vision data sets from different domains and complexities: MNIST for images of handwritten digits, Fashion MNIST for greyscale images of clothes, and CIFAR-10 for colored natural images. [...] We summarize the results in Table 1. |
| Researcher Affiliation | Collaboration | (1) HCI/IWR, Heidelberg University, Germany; (2) Bosch Center for Artificial Intelligence, Renningen, Germany |
| Pseudocode | Yes | Algorithm 1: The RAL training procedure |
| Open Source Code | Yes | see github.com/manuelhaussmann/ral for a reference pytorch implementation of the proposed model. |
| Open Datasets | Yes | We evaluate our method on three benchmark vision data sets from different domains and complexities: MNIST for images of handwritten digits, Fashion MNIST for greyscale images of clothes, and CIFAR-10 for colored natural images. |
| Dataset Splits | No | The straight-forward reward would be the performance of the updated predictor on a separate validation set. This, however, clashes with the constraint imposed on us by the active learning scenario. ... Hence, we abandon this option altogether. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions 'pytorch implementation' in a footnote, implying the use of PyTorch, but does not provide specific version numbers for any software dependencies like PyTorch, Python, or CUDA. |
| Experiment Setup | Yes | To evaluate the performance of the proposed pipeline, we take as the predictor a standard LeNet5 sized model (two convolutional layers of 20, 50 channels and two linear layers of 500, 10 neurons) and as the guide a policy net consisting of two layers with 500 hidden neurons. [...] The predictor is trained for 30 epochs between labeling rounds (labeling five points per round), while the policy net gets one update step after each round. [...] In each experiment the state is constructed by ranking the unlabeled data according to their predictive entropy and then taking every twentieth point until M = 50 points. [...] We stop after having collected 400 points starting from an initial set of 50 data points. |
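
The state construction and labeling schedule quoted in the Experiment Setup row can be sketched as follows. This is a minimal illustration in plain Python, not the authors' reference implementation: the function names (`predictive_entropy`, `build_state`, `labeling_rounds`) are hypothetical, and the ranking direction (most uncertain first) and the inclusion of the top-ranked point are assumptions the quoted text does not settle.

```python
import math


def predictive_entropy(probs):
    """Shannon entropy (in nats) of one categorical predictive distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def build_state(pool_probs, stride=20, m=50):
    """Return indices of at most m unlabeled points that form the state.

    Points are ranked by predictive entropy and every stride-th point is kept.
    Assumption: ranking runs from most to least uncertain and starts at the
    top-ranked point; the quoted setup does not specify either detail.
    """
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: predictive_entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[::stride][:m]


def labeling_rounds(initial=50, budget=400, per_round=5):
    """Number of labeling rounds needed to grow the labeled set to budget."""
    return math.ceil((budget - initial) / per_round)
```

With the quoted settings (50 initial points, five labels per round, 400 points in total), `labeling_rounds()` gives 70 rounds. Note that `build_state` returns fewer than M = 50 indices when the unlabeled pool holds fewer than 981 points, since every twentieth ranked point is taken.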