Bayesian Generative Active Deep Learning

Authors: Toan Tran, Thanh-Toan Do, Ian Reid, Gustavo Carneiro

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "we provide theoretical and empirical evidence (MNIST, CIFAR-{10, 100}, and SVHN) that our approach has more efficient training and better classification results than data augmentation and active learning"; "We run experiments which show that our proposed Bayesian generative active deep learning is advantageous in terms of training efficiency and classification performance, compared with data augmentation and active learning on MNIST, CIFAR-{10, 100} and SVHN."
Researcher Affiliation | Academia | "University of Adelaide, Australia; University of Liverpool. Correspondence to: Toan Tran <toan.m.tran@adelaide.edu.au>."
Pseudocode | Yes | "Algorithm 1 Bayesian Generative Active Learning" (a minimal sketch of this loop appears after the table).
Open Source Code | Yes | "code available at https://github.com/toantm/BGADL"
Open Datasets | Yes | "Our experiments are performed on MNIST (LeCun et al., 1998), CIFAR-10, CIFAR-100 (Krizhevsky et al., 2012), and SVHN (Netzer et al., 2011)."
Dataset Splits | No | The paper specifies initial training-set sizes and the number of samples selected per acquisition iteration from the unlabeled pool, and it mentions validation data, but it does not explicitly provide percentages, absolute counts, or predefined splits for a dedicated validation set. There is no mention of, for instance, an 80/10/10 split or fixed training/validation/test counts from which the data partitioning could be reproduced.
Hardware Specification | No | The paper does not provide specific details about the hardware used for the experiments, such as GPU models, CPU specifications, or memory.
Software Dependencies | No | The paper mentions using stochastic gradient descent and the Adam optimizer with specific parameters (learning rate, momentum, beta values) but does not specify version numbers for any software libraries, frameworks (e.g., TensorFlow, PyTorch), or the programming language used for the implementation.
Experiment Setup | Yes | "The training process was run with the following hyper-parameters: 1) the classifier c(x; θC) used stochastic gradient descent with (lr=0.01, momentum=0.9); 2) the encoder e(x; θE), generator g(z; θG) and discriminator d(x; θD) used Adam optimizer with (lr=0.0002, β1 = 0.5, β2 = 0.999); the mini-batch size is 100 for all cases. The sample acquisition setup for each data set is: 1) the number of samples in the initial training set is 1,000 for MNIST, 5,000 for CIFAR-10, 15,000 for CIFAR-100, and 10,000 for SVHN; 2) the number of acquisition iterations is 150 (50 for SVHN), where at each iteration 100 (500 for SVHN) samples are selected from 2,000 randomly selected samples of the unlabeled data set Dpool." (These settings are collected into a configuration sketch after the table.)
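As flagged in the Pseudocode row, the paper summarizes its method as Algorithm 1, Bayesian Generative Active Learning. The Python sketch below is a minimal, framework-free rendering of that loop built only from what this table reports: an initial labeled set, repeated acquisition of 100 samples out of 2,000 randomly drawn pool candidates, a generative step that synthesizes extra samples from the acquired ones, and retraining on the enlarged set. The functions acquisition_score, generate_like, and fit are hypothetical placeholders, not the paper's actual acquisition criterion, generative model, or training routine, and the toy arrays merely stand in for MNIST-sized inputs.

import numpy as np

rng = np.random.default_rng(0)

def acquisition_score(model, x):
    # Hypothetical placeholder: the paper uses a Bayesian acquisition criterion;
    # random scores are returned here only so the sketch runs end to end.
    return rng.random(len(x))

def generate_like(x_acquired, y_acquired):
    # Hypothetical placeholder for the generative step (the encoder/generator/
    # discriminator named in the Experiment Setup row): returns synthetic
    # samples modeled on the acquired batch.
    return x_acquired.copy(), y_acquired.copy()

def fit(model, x, y):
    # Hypothetical placeholder for training the classifier c(x; theta_C).
    return model

# Toy data standing in for a real benchmark (the MNIST numbers are used below).
x_all = rng.standard_normal((60000, 784), dtype=np.float32)
y_all = rng.integers(0, 10, size=60000)

n_initial, n_iters, n_candidates, n_acquire = 1000, 150, 2000, 100

labeled_idx = rng.choice(len(x_all), size=n_initial, replace=False)
pool_idx = np.setdiff1d(np.arange(len(x_all)), labeled_idx)
x_train, y_train = x_all[labeled_idx], y_all[labeled_idx]

model = None
for it in range(n_iters):
    model = fit(model, x_train, y_train)

    # Draw 2,000 random candidates from the unlabeled pool, score them,
    # and keep the 100 highest-scoring samples.
    candidates = rng.choice(pool_idx, size=n_candidates, replace=False)
    scores = acquisition_score(model, x_all[candidates])
    acquired = candidates[np.argsort(scores)[-n_acquire:]]

    # Generative augmentation: add both the acquired samples and synthetic
    # samples produced from them to the training set, then shrink the pool.
    x_gen, y_gen = generate_like(x_all[acquired], y_all[acquired])
    x_train = np.concatenate([x_train, x_all[acquired], x_gen])
    y_train = np.concatenate([y_train, y_all[acquired], y_gen])
    pool_idx = np.setdiff1d(pool_idx, acquired)

Swapping the placeholders for the paper's Bayesian acquisition function and its encoder/generator/discriminator, and changing the constants to the values quoted in the Experiment Setup row, gives the corresponding loops for CIFAR-10, CIFAR-100, and SVHN.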
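The Experiment Setup row quotes the optimizers and acquisition schedule directly; the snippet below only collects those numbers into code. PyTorch is an assumption made purely for illustration (the paper does not name its framework, as the Software Dependencies row notes), the nn.Linear modules are stand-ins for the real classifier, encoder, generator, and discriminator architectures, and giving each generative component its own Adam optimizer is likewise an assumption, since the excerpt states only the shared hyper-parameters.

from torch import nn, optim

# Stand-in modules; the actual architectures are defined in the paper and repository.
classifier = nn.Linear(784, 10)       # c(x; theta_C)
encoder = nn.Linear(784, 100)         # e(x; theta_E)
generator = nn.Linear(100, 784)       # g(z; theta_G)
discriminator = nn.Linear(784, 1)     # d(x; theta_D)

# Reported optimizers: SGD for the classifier, Adam for the generative components.
opt_c = optim.SGD(classifier.parameters(), lr=0.01, momentum=0.9)
adam_kwargs = dict(lr=0.0002, betas=(0.5, 0.999))
opt_e = optim.Adam(encoder.parameters(), **adam_kwargs)
opt_g = optim.Adam(generator.parameters(), **adam_kwargs)
opt_d = optim.Adam(discriminator.parameters(), **adam_kwargs)

batch_size = 100  # mini-batch size reported for all cases

# Sample-acquisition setup per data set, as quoted above:
# (initial labeled samples, acquisition iterations, samples acquired per iteration,
#  random pool candidates scored per iteration)
acquisition_setup = {
    "MNIST":     dict(n_initial=1000,  n_iters=150, n_acquire=100, n_candidates=2000),
    "CIFAR-10":  dict(n_initial=5000,  n_iters=150, n_acquire=100, n_candidates=2000),
    "CIFAR-100": dict(n_initial=15000, n_iters=150, n_acquire=100, n_candidates=2000),
    "SVHN":      dict(n_initial=10000, n_iters=50,  n_acquire=500, n_candidates=2000),
}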