Batch Active Learning at Scale
Authors: Gui Citovsky, Giulia DeSalvo, Claudio Gentile, Lazaros Karydas, Anand Rajagopalan, Afshin Rostamizadeh, Sanjiv Kumar
NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct large scale experiments using a ResNet-101 model applied to the multi-label Open Images Dataset consisting of almost 10M images and 60M labels over 20K classes, to demonstrate the significant improvement Cluster-Margin provides over the baselines. In the best result, we find that Cluster-Margin requires only 40% of the labels needed by the next best method to achieve the same target performance. To compare against latest published results, we follow their experimental settings and conduct smaller scale experiments using a VGG16 model on multiclass CIFAR10, CIFAR100, and SVHN datasets, and show the Cluster-Margin algorithm's competitive performance. |
| Researcher Affiliation | Industry | Gui Citovsky, Giulia DeSalvo, Claudio Gentile, Lazaros Karydas, Anand Rajagopalan, Afshin Rostamizadeh, Sanjiv Kumar Google Research {gcitovsky,giuliad,cgentile,lkary,anandbr,rostami,sanjivk}@google.com |
| Pseudocode | Yes | Algorithm 1 Hierarchical Agglomerative Clustering (HAC) with Average-Linkage. Algorithm 2 The Cluster-Margin Algorithm. |
| Open Source Code | No | The paper does not contain an explicit statement about the release of its source code or a link to a code repository for the methodology described. |
| Open Datasets | Yes | We leverage the Open Images v6 image classification dataset [Krasin et al., 2017] to evaluate Cluster-Margin and other active learning methods in the very large batch-size setting, i.e. batch-sizes of 100K and 1M. Specifically, we consider CIFAR10, CIFAR100, and SVHN, which are datasets that contain 32-by-32 color images [Krizhevsky, 2009, Netzer et al., 2011]. |
| Dataset Splits | Yes | Table 1: Open Images Dataset v6 statistics by data split. Train: 9,011,219 images, 19,856,086 positives, 37,668,266 negatives. Validation: 41,620 images, 367,263 positives, 228,076 negatives. Test: 125,436 images, 1,110,124 positives, 689,759 negatives. |
| Hardware Specification | Yes | We train a ResNet-101 model implemented using tf-slim with batch SGD using 64 Cloud TPU v4s, each with two cores. |
| Software Dependencies | No | The paper mentions using 'tf-slim' and 'tf-keras library' but does not specify version numbers for these or any other software dependencies, which are necessary for full reproducibility. |
| Experiment Setup | Yes | We train a ResNet-101 model implemented using tf-slim with batch SGD using 64 Cloud TPU v4s, each with two cores. Each core is fed 48 examples per SGD iteration, resulting in an effective SGD batch of size 64 * 2 * 48 = 6144. The SGD optimizer decays the learning rate logarithmically after every 5 * 10^8 examples and uses an initial learning rate of 10^-4. We use batch SGD with learning rate fixed to 0.001 and SGD's batch size set to 100. |
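
The Pseudocode row above refers to Algorithm 1 (Hierarchical Agglomerative Clustering with average linkage) and Algorithm 2 (Cluster-Margin). Since the authors do not release code, the following is a minimal sketch of the Cluster-Margin selection step under stated assumptions: SciPy's average-linkage clustering stands in for the paper's own HAC implementation, the pool is represented by NumPy arrays of embeddings and predicted class probabilities, and the function name `cluster_margin_select` plus all parameter names are illustrative rather than taken from the paper.

```python
# Hedged sketch of the Cluster-Margin selection step (in the spirit of Algorithm 2).
# Assumptions (the paper's code is not released):
#   - embeddings: (n, d) array for the unlabeled pool, used only for clustering
#   - probs: (n, c) array of predicted class probabilities from the current model
#   - SciPy average-linkage HAC replaces the paper's Algorithm 1
#   - margin = difference between the two largest class probabilities per example
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def cluster_margin_select(embeddings, probs, margin_batch_size,
                          target_batch_size, distance_threshold):
    """Return `target_batch_size` pool indices to send for labeling."""
    assert margin_batch_size >= target_batch_size

    # 1) HAC with average linkage, cut at a distance threshold (Algorithm 1 stand-in).
    #    Note: O(n^2) memory; the paper uses its own scalable HAC implementation.
    cluster_ids = fcluster(linkage(embeddings, method="average"),
                           t=distance_threshold, criterion="distance")

    # 2) Margin scores: smaller margin means a more uncertain example.
    top2 = np.sort(probs, axis=1)[:, -2:]
    margins = top2[:, 1] - top2[:, 0]
    candidates = np.argsort(margins)[:margin_batch_size]

    # 3) Group the low-margin candidates by cluster, smallest clusters first.
    clusters = {}
    for idx in candidates:
        clusters.setdefault(cluster_ids[idx], []).append(int(idx))
    ordered = sorted(clusters.values(), key=len)

    # 4) Round-robin over clusters until the target batch is full.
    selected = []
    while len(selected) < target_batch_size:
        for members in ordered:
            if members and len(selected) < target_batch_size:
                selected.append(members.pop())
    return selected
```

In the paper, the clustering is computed once (after training on the seed set) and reused across rounds, and the margin batch size is chosen larger than the target batch size so that the round-robin step can diversify within the most uncertain examples.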