Bayesian Generative Active Deep Learning
Authors: Toan Tran, Thanh-Toan Do, Ian Reid, Gustavo Carneiro
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | we provide theoretical and empirical evidence (MNIST, CIFAR-{10, 100}, and SVHN) that our approach has more efficient training and better classification results than data augmentation and active learning. We run experiments which show that our proposed Bayesian generative active deep learning is advantageous in terms of training efficiency and classification performance, compared with data augmentation and active learning on MNIST, CIFAR-{10, 100} and SVHN. |
| Researcher Affiliation | Academia | 1University of Adelaide, Australia 2University of Liverpool. Correspondence to: Toan Tran <toan.m.tran@adelaide.edu.au>. |
| Pseudocode | Yes | Algorithm 1 Bayesian Generative Active Learning |
| Open Source Code | Yes | code available at https://github.com/toantm/BGADL |
| Open Datasets | Yes | Our experiments are performed on MNIST (Le Cun et al., 1998), CIFAR-10, CIFAR-100 (Krizhevsky et al., 2012), and SVHN (Netzer et al., 2011). |
| Dataset Splits | No | The paper specifies initial training set sizes and the number of samples selected per acquisition iteration from an unlabeled pool, but it does not provide percentages, absolute counts, or predefined splits for a dedicated validation set. For instance, there is no '80/10/10 split' or fixed training/validation/test counts stated up front, so the exact data partitioning cannot be reproduced from the text alone. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments, such as GPU models, CPU specifications, or memory. |
| Software Dependencies | No | The paper mentions using 'stochastic gradient descent' and 'Adam optimizer' with specific parameters (learning rate, momentum, beta values) but does not specify version numbers for any software libraries, frameworks (e.g., TensorFlow, PyTorch), or programming languages used for implementation. |
| Experiment Setup | Yes | The training process was run with the following hyper-parameters: 1) the classifier c(x; θC) used stochastic gradient descent with (lr=0.01, momentum=0.9); 2) the encoder e(x; θE), generator g(z; θG) and discriminator d(x; θD) used Adam optimizer with (lr=0.0002, β1 = 0.5, β2 = 0.999); the mini-batch size is 100 for all cases. The sample acquisition setup for each data set is: 1) the number of samples in the initial training set is 1,000 for MNIST, 5,000 for CIFAR-10, 15,000 for CIFAR-100, and 10,000 for SVHN; 2) the number of acquisition iterations is 150 (50 for SVHN), where at each iteration 100 (500 for SVHN) samples are selected from 2,000 randomly selected samples of the unlabeled data set Dpool. |
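
The hyper-parameters quoted in the 'Experiment Setup' row map directly onto optimizer configurations. The sketch below shows that mapping in PyTorch; the framework choice and the placeholder module definitions are assumptions on our part, since the paper does not name the library or its version (see the 'Software Dependencies' row). Only the optimizer settings and the mini-batch size are taken from the paper.

```python
import torch
from torch import nn, optim

# Placeholder modules -- the actual BGADL architectures (classifier, VAE encoder,
# generator, discriminator) are defined in the paper and its released code; these
# stand-ins only illustrate the optimizer wiring reported in 'Experiment Setup'.
classifier = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 10))
encoder = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 100))
generator = nn.Sequential(nn.Linear(100, 32 * 32 * 3))
discriminator = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 1))

# Classifier c(x; theta_C): stochastic gradient descent, lr=0.01, momentum=0.9.
opt_c = optim.SGD(classifier.parameters(), lr=0.01, momentum=0.9)

# Encoder e(x; theta_E), generator g(z; theta_G), discriminator d(x; theta_D):
# Adam with lr=0.0002, beta1=0.5, beta2=0.999.
adam_kwargs = dict(lr=0.0002, betas=(0.5, 0.999))
opt_e = optim.Adam(encoder.parameters(), **adam_kwargs)
opt_g = optim.Adam(generator.parameters(), **adam_kwargs)
opt_d = optim.Adam(discriminator.parameters(), **adam_kwargs)

BATCH_SIZE = 100  # mini-batch size reported for all data sets
```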
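
The acquisition schedule in the same row can likewise be read as a simple outer loop over labelling rounds. The sketch below is a hedged outline only: the per-dataset constants are copied from the 'Experiment Setup' row, while `train_fn`, `score_fn` (e.g. a BALD-style informativeness score), and `augment_fn` (the generative augmentation step of Algorithm 1) are hypothetical callables standing in for the components the paper and its released code define.

```python
import random

# Per-dataset acquisition constants quoted in the 'Experiment Setup' row.
ACQ_SETUP = {
    "MNIST":     dict(initial=1_000,  iterations=150, acquire=100, pool_subset=2_000),
    "CIFAR-10":  dict(initial=5_000,  iterations=150, acquire=100, pool_subset=2_000),
    "CIFAR-100": dict(initial=15_000, iterations=150, acquire=100, pool_subset=2_000),
    "SVHN":      dict(initial=10_000, iterations=50,  acquire=500, pool_subset=2_000),
}

def run_acquisition(dataset, train_ids, pool_ids, train_fn, score_fn, augment_fn):
    """Hedged outline of the outer active-learning loop (not the paper's exact Algorithm 1).

    train_fn(train_ids)     -> trains the classifier and generative model (not shown here)
    score_fn(candidate_id)  -> informativeness score for one pool sample (assumed to be
                               a BALD-style acquisition score; the paper defines its own)
    augment_fn(acquired)    -> synthetic samples produced from the acquired samples
    """
    cfg = ACQ_SETUP[dataset]
    assert len(train_ids) == cfg["initial"], "initial labelled set size per the paper"
    for _ in range(cfg["iterations"]):
        train_fn(train_ids)
        # Score a random subset of the unlabelled pool, then acquire the top-k samples.
        candidates = random.sample(pool_ids, cfg["pool_subset"])
        ranked = sorted(candidates, key=score_fn, reverse=True)
        acquired = ranked[: cfg["acquire"]]
        pool_ids = [i for i in pool_ids if i not in set(acquired)]
        train_ids = train_ids + acquired + augment_fn(acquired)
    return train_ids
```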