Streaming Active Learning with Deep Neural Networks
Authors: Akanksha Saran, Safoora Yousefi, Akshay Krishnamurthy, John Langford, Jordan T. Ash
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the performance of VeSSAL against several baselines on three academic benchmark datasets and one real-world dataset. In addition to measuring performance for a variety of architectures and batch sizes, we evaluate our approach in terms of robustness to feature drift in the data stream, and in terms of its fidelity to the predefined query rate. |
| Researcher Affiliation | Collaboration | 1Microsoft Research NYC, 2Microsoft Bing. Correspondence to: Akanksha Saran <akankshasaran@utexas.edu>. |
| Pseudocode | Yes | Algorithm 1 Volume sampling for streaming active learning (VeSSAL) |
| Open Source Code | Yes | Code for the implementation of VeSSAL can be found at https://github.com/asaran/VeSSAL.git |
| Open Datasets | Yes | We evaluate all algorithms on three image benchmarks, namely SVHN (Netzer et al., 2011), MNIST (LeCun et al., 1998), and CIFAR10 (Krizhevsky, 2009), and one real-world dataset from Bohus et al. (2022). |
| Dataset Splits | No | The paper describes train and test splits (e.g., 80% train / 20% test for CLOW) and mentions using a 'held-out test set' for other datasets, but does not explicitly specify a validation set split across all experiments for reproducibility. |
| Hardware Specification | No | The paper mentions that 'each algorithm was given identical computational resources' but does not provide specific details regarding the hardware (e.g., GPU/CPU models, memory, or cloud instance types) used for the experiments. |
| Software Dependencies | No | The paper states that 'All methods are implemented in PyTorch (Paszke et al., 2017)' but does not report version numbers for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | Models are trained with the Adam optimizer (Kingma & Ba, 2014) with a fixed learning rate 0.001 until they reach > 99% training accuracy. We experiment with different budgets per round k ∈ {100, 1K, 10K} for the benchmark datasets and k ∈ {10, 100, 1K} for the CLOW dataset (since it has a total of 11K training samples). In all experiments we start with 100 labeled samples and acquire the rest of the labeled samples via active learning. All methods are implemented in PyTorch (Paszke et al., 2017). We set λ in Algorithm 1 to 0.01 in all VeSSAL experiments, which ensures a numerically stable inversion. |
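
The "Pseudocode" row above refers to Algorithm 1, volume sampling for streaming active learning. For intuition only, here is a minimal sketch of the volume-sampling idea: query a streaming point with probability proportional to how poorly its embedding is covered by previously queried points, with a λ-ridge for stable inversion as the paper describes. This is not the authors' implementation (see the linked repository for that); the class name `StreamingVolumeSampler`, the exponential-moving-average normalizer, and its 0.99/0.01 constants are our own assumptions.

```python
import numpy as np

class StreamingVolumeSampler:
    """Illustrative sketch of volume-based streaming sampling (not the
    official VeSSAL code): query a point with probability proportional
    to the volume its embedding adds to the queried set so far."""

    def __init__(self, dim, query_rate, lam=0.01):
        self.query_rate = query_rate        # target fraction of the stream to label
        self.cov_inv = np.eye(dim) / lam    # inverse of (lam * I); lam stabilizes inversion
        self.running_mean = 1.0             # assumed EMA normalizer for the query probability

    def should_query(self, g):
        # "Volume" score: large when g is poorly covered by past queried embeddings.
        score = float(g @ self.cov_inv @ g)
        # Track the mean score so that, on average, we query at query_rate.
        self.running_mean = 0.99 * self.running_mean + 0.01 * score
        p = min(1.0, self.query_rate * score / max(self.running_mean, 1e-12))
        queried = np.random.rand() < p
        if queried:
            # Rank-1 Sherman-Morrison update of the inverse covariance.
            v = self.cov_inv @ g
            self.cov_inv -= np.outer(v, v) / (1.0 + float(g @ v))
        return queried

# Usage: for each embedding g arriving from the stream,
#   sampler = StreamingVolumeSampler(dim=512, query_rate=0.1)
#   if sampler.should_query(g): request a label for the point.
```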
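The "Experiment Setup" row pins down the optimizer (Adam, learning rate 0.001) and the stopping criterion (> 99% training accuracy). A minimal PyTorch sketch of that training loop follows, assuming a generic classification model and data loader; the `max_epochs` cap is our own safeguard, not a detail from the paper.

```python
import torch
from torch import nn, optim

def train_until_fit(model, loader, device="cpu", target_acc=0.99, max_epochs=200):
    """Train with Adam at lr=0.001 (as in the paper's setup) until training
    accuracy exceeds target_acc. Model and loader specifics are placeholders."""
    opt = optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    model.to(device).train()
    for _ in range(max_epochs):  # max_epochs is an assumed cap, not from the paper
        correct, total = 0, 0
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            logits = model(x)
            loss_fn(logits, y).backward()
            opt.step()
            correct += (logits.argmax(dim=1) == y).sum().item()
            total += y.numel()
        if correct / total > target_acc:  # stop once training accuracy > 99%
            break
    return model
```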