Active Learning on a Budget: Opposite Strategies Suit High and Low Budgets

Authors: Guy Hacohen, Avihu Dekel, Daphna Weinshall

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In a comparative empirical investigation of supervised learning, using a variety of architectures and image datasets, TypiClust outperforms all other active learning strategies in the low-budget regime. Using TypiClust in the semi-supervised framework, performance gets an even more significant boost. In particular, state-of-the-art semi-supervised methods trained on CIFAR-10 with 10 labeled examples selected by TypiClust reach 93.2% accuracy, an improvement of 39.4% over random selection.
Researcher Affiliation | Academia | (1) School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel; (2) Edmond & Lily Safra Center for Brain Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel. Correspondence to: Guy Hacohen <guy.hacohen@mail.huji.ac.il>, Avihu Dekel <avihu.dekel@mail.huji.ac.il>, Daphna Weinshall <daphna@cs.huji.ac.il>.
Pseudocode | Yes | Algorithm 1, TypiClust initial pooling algorithm (a runnable Python sketch follows the table):
    Input: unlabeled pool U, budget B
    Output: B typical and diverse examples to query
    Embedding ← Representation_Learning(U)
    Clust ← Clustering_algorithm(Embedding, B)
    Queries ← {}
    for all i = 1, ..., B do
        add argmax_{x ∈ Clust[i]} Typicality(x) to Queries
    end for
    return Queries
Open Source Code | Yes | Code is available at https://github.com/avihu111/TypiClust.
Open Datasets | Yes | All strategies are evaluated on the following image classification tasks: CIFAR-10/100 (Krizhevsky et al., 2009), Tiny ImageNet (Le & Yang, 2015) and ImageNet50/100/200. The latter group includes subsets of ImageNet (Deng et al., 2009) containing 50/100/200 classes respectively, following Van Gansbeke et al. (2020).
Dataset Splits | No | The paper specifies training on labeled sets and reports test accuracy, but does not provide explicit details about a separate validation set split (e.g., specific percentages or counts for training, validation, and test sets).
Hardware Specification | No | The paper mentions fitting models into "standard GPU virtual memory" but does not specify any particular GPU model, CPU, or other hardware components used for the experiments.
Software Dependencies | No | The paper mentions various software components and libraries like SimCLR, SCAN, DINO, K-Means, scikit-learn KMeans, MiniBatchKMeans, ResNet18, SGD, FlexMatch, WideResNet-28, ResNet-50, and VGG-19, but it does not specify version numbers for any of them.
Experiment Setup | Yes | We trained SimCLR using the code provided by Van Gansbeke et al. (2020) for CIFAR-10, CIFAR-100 and Tiny ImageNet. Specifically, we used ResNet18 with an MLP projection layer to a 128-dimensional vector, trained for 500 epochs. All the training hyper-parameters were identical to those used by SCAN. After training, we used the 512-dimensional penultimate layer as the representation space. As in SCAN, we used an SGD optimizer with 0.9 momentum and an initial learning rate of 0.4 with a cosine scheduler. The batch size was 512 and the weight decay 0.0001. The augmentations were random resized crops, random horizontal flips, color jittering, and random grayscaling. (A configuration sketch follows the table.)
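
For concreteness, here is a minimal Python sketch of the Algorithm 1 pooling step, assuming a precomputed embedding matrix (e.g., the SimCLR penultimate layer from the Experiment Setup row). It uses scikit-learn's KMeans and NearestNeighbors, both named under Software Dependencies. The paper defines typicality as the inverse of the mean Euclidean distance to an example's K nearest neighbors; computing it within each cluster with K = 20 is an implementation assumption here, not a value quoted in this section.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import NearestNeighbors

def typicality(embeddings, k=20):
    # Typicality(x) = 1 / (mean distance from x to its k nearest neighbors).
    # k = 20 is an assumed default, not quoted in this section.
    k = min(k, len(embeddings) - 1)
    dists, _ = NearestNeighbors(n_neighbors=k + 1).fit(embeddings).kneighbors(embeddings)
    return 1.0 / (dists[:, 1:].mean(axis=1) + 1e-12)  # column 0 is the self-distance

def typiclust_initial_pool(embeddings, budget):
    # Cluster the representation space into `budget` clusters, then query
    # the most typical example of each cluster, per Algorithm 1.
    labels = KMeans(n_clusters=budget, n_init=10).fit_predict(embeddings)
    queries = []
    for c in range(budget):
        members = np.flatnonzero(labels == c)
        if len(members) == 1:          # singleton cluster: nothing to rank
            queries.append(int(members[0]))
            continue
        scores = typicality(embeddings[members])
        queries.append(int(members[np.argmax(scores)]))
    return queries

In use, `embeddings` would hold one row per unlabeled example from the self-supervised representation, and the returned indices form the initial labeled pool of size B.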
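
Similarly, the optimizer, scheduler, and augmentation pipeline quoted in the Experiment Setup row could be assembled roughly as follows in PyTorch/torchvision. This is a sketch under stated assumptions (the crop size and the jitter/grayscale magnitudes are common SimCLR defaults, not values confirmed by the paper), not the authors' exact training code.

import torch.nn as nn
from torch.optim import SGD
from torch.optim.lr_scheduler import CosineAnnealingLR
from torchvision import transforms
from torchvision.models import resnet18

# Augmentations named in the setup; magnitudes are assumed defaults.
train_transform = transforms.Compose([
    transforms.RandomResizedCrop(32),            # crop size assumed for CIFAR-scale images
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(0.4, 0.4, 0.4, 0.1),
    transforms.RandomGrayscale(p=0.2),
    transforms.ToTensor(),
])

# ResNet18 backbone exposing the 512-d penultimate layer, plus an MLP
# projection head to a 128-d vector, as described in the setup.
backbone = resnet18()
backbone.fc = nn.Identity()
model = nn.Sequential(
    backbone,
    nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 128)),
)

optimizer = SGD(model.parameters(), lr=0.4, momentum=0.9, weight_decay=1e-4)
scheduler = CosineAnnealingLR(optimizer, T_max=500)  # cosine decay over the 500 epochs
# Training loop (not shown): batches of 512 with the SimCLR contrastive loss,
# stepping the scheduler once per epoch.

After training, the 128-d projection head would be discarded and the 512-d backbone output used as the representation space, as the setup describes.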