Training-Free Neural Active Learning with Initialization-Robustness Guarantees

Authors: Apivich Hemachandra, Zhongxiang Dai, Jasraj Singh, See-Kiong Ng, Bryan Kian Hsiang Low

ICML 2023

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We empirically demonstrate that our EV-GP criterion is highly correlated with both initialization robustness and generalization performance, and show that it consistently outperforms baseline methods in terms of both desiderata, especially in situations with limited initial data or large batch sizes." |
| Researcher Affiliation | Academia | "1 Department of Computer Science, National University of Singapore, Republic of Singapore; 2 School of Computer Science and Engineering, Nanyang Technological University, Republic of Singapore. Correspondence to: Zhongxiang Dai <dzx@nus.edu.sg>." |
| Pseudocode | Yes | "Algorithm 1 EV-GP+MS" (an illustrative selection-criterion sketch follows the table) |
| Open Source Code | Yes | "The code for the experiments can be found at https://github.com/apivich-h/init-robust-al." |
| Open Datasets | Yes | "Meanwhile the real-life training data are taken from the UCI Machine Learning Repository (Dua & Graff). ... MNIST (Deng, 2012). ... EMNIST (Cohen et al., 2017). ... SVHN (Netzer et al., 2011). ... CIFAR100 (Krizhevsky, 2009)." |
| Dataset Splits | No | No explicit training/validation/test split percentages or sample counts are provided. The paper mentions splitting data into a 'pool' and 'test data' but does not detail validation splits or the training percentage of the pool. |
| Hardware Specification | No | No specific hardware details (e.g., CPU/GPU models, memory specifications) used for running the experiments are provided in the paper. |
| Software Dependencies | No | The paper mentions software such as JAX, Neural-Tangents, PyTorch, functools, and the Adam optimizer, but does not provide version numbers for these dependencies. |
| Experiment Setup | Yes | "In all of the regression experiments, the model used is a two-layer multi-layer perceptron with width of 512 and with bias. We set σ_W = 1.0 and σ_b = 0.1. The NNs are optimized using gradient descent with step size 0.01. ... For all the models, we train the models using stochastic gradient descent with learning rate of 0.1 and weight decay of 0.005. The models are trained with training batch size of 32 and are trained for 100 epochs." (see the training sketch after the table) |
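
The Pseudocode row refers to Algorithm 1 (EV-GP+MS). The excerpt above does not spell out the selection rule, so the following is only a minimal sketch in the same spirit: a naive greedy criterion that reduces the average GP posterior output variance under the NNGP kernel of the network at initialization, built with the `neural-tangents` library. It is not the authors' released implementation; names such as `mean_posterior_variance`, `greedy_variance_selection`, and `noise_var` are illustrative assumptions.

```python
import jax.numpy as jnp
from neural_tangents import stax

# Infinite-width NNGP kernel of a two-layer, width-512 ReLU MLP with bias,
# using sigma_W = 1.0 and sigma_b = 0.1 as quoted in the Experiment Setup row.
_, _, kernel_fn = stax.serial(
    stax.Dense(512, W_std=1.0, b_std=0.1),
    stax.Relu(),
    stax.Dense(1, W_std=1.0, b_std=0.1),
)

def mean_posterior_variance(k_pool, selected, noise_var=1e-3):
    """Average GP posterior variance over the pool given the selected indices."""
    idx = jnp.array(selected)
    k_ss = k_pool[idx][:, idx]            # kernel among selected points
    k_rs = k_pool[:, idx]                 # pool-vs-selected kernel
    solve = jnp.linalg.solve(k_ss + noise_var * jnp.eye(len(selected)), k_rs.T)
    post_var = jnp.diag(k_pool) - jnp.sum(k_rs * solve.T, axis=1)
    return jnp.mean(post_var)

def greedy_variance_selection(x_pool, batch_size, noise_var=1e-3):
    """Greedily add pool points that most reduce the average output variance."""
    k_pool = kernel_fn(x_pool, x_pool, 'nngp')  # NNGP covariance at initialization
    selected = []
    for _ in range(batch_size):
        scores = {
            i: mean_posterior_variance(k_pool, selected + [i], noise_var)
            for i in range(x_pool.shape[0]) if i not in selected
        }
        selected.append(min(scores, key=scores.get))
    return selected
```

Because the kernel depends only on the architecture and the pool inputs, a criterion of this form needs no network training, which is what makes the approach "training-free"; the O(pool size) candidate sweep per selection here is deliberately naive and would be batched or incrementally updated in practice.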
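The Experiment Setup row quotes the regression hyperparameters (two-layer MLP of width 512 with bias, σ_W = 1.0, σ_b = 0.1, gradient descent with step size 0.01). The sketch below reproduces that configuration in JAX, assuming the NTK parameterization used by `neural-tangents` (an assumption; the excerpt does not state the parameterization). Names such as `init_mlp` and `gd_step` are illustrative, not from the paper's code.

```python
import jax
import jax.numpy as jnp

WIDTH, SIGMA_W, SIGMA_B, STEP_SIZE = 512, 1.0, 0.1, 0.01  # values quoted above

def init_mlp(key, in_dim, out_dim=1):
    """Standard-normal weights; the NTK-style scaling is applied in the forward pass."""
    k1, k2 = jax.random.split(key)
    return {
        "w1": jax.random.normal(k1, (in_dim, WIDTH)),
        "b1": jnp.zeros(WIDTH),
        "w2": jax.random.normal(k2, (WIDTH, out_dim)),
        "b2": jnp.zeros(out_dim),
    }

def forward(params, x):
    h = jax.nn.relu(x @ params["w1"] * SIGMA_W / jnp.sqrt(x.shape[-1])
                    + SIGMA_B * params["b1"])
    return h @ params["w2"] * SIGMA_W / jnp.sqrt(WIDTH) + SIGMA_B * params["b2"]

def mse(params, x, y):
    return jnp.mean((forward(params, x) - y) ** 2)

@jax.jit
def gd_step(params, x, y):
    """One full-batch gradient-descent update with step size 0.01."""
    grads = jax.grad(mse)(params, x, y)
    return jax.tree_util.tree_map(lambda p, g: p - STEP_SIZE * g, params, grads)
```

For the classification models, the quoted excerpt instead specifies mini-batch SGD with learning rate 0.1, weight decay 0.005, batch size 32, and 100 epochs, which would replace the full-batch update above with a mini-batch loop.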