SEL-BALD: Deep Bayesian Active Learning with Selective Labels
Authors: Ruijiang Gao, Mingzhang Yin, Maytal Saar-Tsechansky
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on both synthetic and real-world datasets to demonstrate the effectiveness of our proposed algorithms. |
| Researcher Affiliation | Academia | Ruijiang Gao, Naveen Jindal School of Management, University of Texas at Dallas, Richardson, TX 75082, ruijiang.gao@utdallas.edu; Mingzhang Yin, Warrington College of Business, University of Florida, Gainesville, FL 32611, mingzhang.yin@warrington.ufl.edu; Maytal Saar-Tsechansky, Information, Risk, and Operations Management, University of Texas at Austin, Austin, TX 78712, maytal@mail.utexas.edu |
| Pseudocode | Yes | Algorithm 1 Bayesian Active Learning for Selective Labeling with Instance Rejection (SEL-BALD) |
| Open Source Code | Yes | The code is available at https://github.com/ruijiang81/SEL-BALD. |
| Open Datasets | Yes | We conduct experiments on both synthetic and real-world datasets. ... For our case study, we use the Give-Me-some-Credit (GMC) dataset [Credit Fusion, 2011]. ... We also examine each active learning method on a high-dimensional dataset MNIST [Le Cun, 1998]. ... More Real-World Datasets: We compare the proposed methods with baselines on Fashion MNIST [Xiao et al., 2017], CIFAR-10 [Krizhevsky, 2009], Adult [Becker and Kohavi, 1996] and Mushroom [mus, 1981] datasets in Appendix E. |
| Dataset Splits | No | The paper mentions 'training set' and 'test set' sizes for the synthetic, GMC, and MNIST datasets (e.g., '3700 samples as the training set and around 1600 samples as the test set' for synthetic data), but it does not explicitly specify a separate 'validation' split with percentages or sample counts for hyperparameter tuning. |
| Hardware Specification | Yes | We run the experiments on a server with 3 Nvidia A100 graphics cards and AMD EPYC 7763 64-Core Processor. |
| Software Dependencies | No | The paper mentions software components like 'Bayesian neural network', 'MC-dropout', and 'Adam optimizer', but it does not provide specific version numbers for any of these or for broader frameworks (e.g., PyTorch, TensorFlow, Python version). |
| Experiment Setup | Yes | For the predictive model and human discretion model class, we use a Bayesian neural network and use MC-dropout [Gal et al., 2017] to approximate the posterior (we set the number of MC samples as 40 in all experiments). The model architecture is a 3-layer fully connected neural network with Leaky ReLU activation function. We use the Adam optimizer with a learning rate of 0.01. ... We set β = 0.75 for Joint-BALD-UCB in all experiments. ... The results are averaged over 3 runs with a query size of 10, 50 randomly examined instances initially, and a budget of 450. (GMC) ... The results are averaged over 3 runs with a query size of 20, 100 randomly examined instances initially, and a budget of 1000. (MNIST) |
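The experiment-setup row quotes MC-dropout with 40 posterior samples feeding a BALD-style acquisition. As an illustration only (not the authors' released code; the function name, array shapes, and the numpy implementation are assumptions), the BALD score can be computed from such MC-dropout samples as the mutual information between the prediction and the model parameters:

```python
import numpy as np

def bald_score(mc_probs):
    """Illustrative BALD acquisition from MC-dropout samples.

    mc_probs: array of shape (T, N, C) -- T stochastic forward passes
    (e.g. T = 40 as in the quoted setup), N candidate instances,
    C classes, each row a predictive probability vector.
    Returns a length-N array: entropy of the mean prediction minus
    the mean entropy of the sampled predictions.
    """
    eps = 1e-12  # numerical guard for log(0)
    mean_p = mc_probs.mean(axis=0)                              # (N, C)
    # Entropy of the averaged predictive distribution
    h_mean = -(mean_p * np.log(mean_p + eps)).sum(axis=-1)      # (N,)
    # Average entropy of each individual MC sample's distribution
    mean_h = -(mc_probs * np.log(mc_probs + eps)).sum(axis=-1).mean(axis=0)
    return h_mean - mean_h
```

Instances where the MC samples agree score near zero, while instances where the samples disagree (high epistemic uncertainty) score high and would be queried first.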