Active Learning Helps Pretrained Models Learn the Intended Task
Authors: Alex Tamkin, Dat Nguyen, Salil Deshpande, Jesse Mu, Noah Goodman
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We investigate whether pretrained models are better active learners, capable of choosing examples that improve robustness to such spurious correlations and domain shifts. Intriguingly, we find that better active learning is an emergent property of the pretraining process: pretrained models require up to 5× fewer labels when using uncertainty-based active learning, while non-pretrained models see no or even negative benefit. We apply active learning (AL) to a range of real-world image and text datasets where task ambiguity arises, comparing several AL acquisition functions against a random-sampling baseline and measuring the difference in performance with and without pretrained models (see the uncertainty-sampling sketch below the table). |
| Researcher Affiliation | Academia | Stanford University |
| Pseudocode | No | The paper describes the active learning procedure in narrative text and provides a mathematical formula for the acquisition function, but it does not include a formally labeled 'Pseudocode' or 'Algorithm' block, nor structured steps formatted like code. |
| Open Source Code | Yes | Code and training scripts are available at: https://github.com/alextamkin/active-learning-pretrained-models. |
| Open Datasets | Yes | We consider a variety of datasets where task ambiguity manifests through a scarcity of particular kinds of examples. We consider two such kinds of examples: those defined by combinations of causal and spurious features (typical vs. atypical backgrounds) and those defined by unseen attributes that shift during deployment (product categories and camera-trap locations). These datasets provide an empirical testbed for the ability of pretrained models to choose disambiguating examples using active learning (AL). The datasets used are Waterbirds [49], Treeperson (created from Visual Genome [33]), iWildCam [5, 31], and Amazon-WILDS [43, 31]. |
| Dataset Splits | No | The paper discusses using a seed set and acquiring data from an unlabeled pool. It states its method aims at 'Removing the need for a separate validation set' for early stopping, but still refers to 'validation datasets' for plotting results. However, it does not provide explicit percentages or sample counts for the training, validation, and test splits needed to reproduce the initial partitioning of the datasets. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models, processor types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper refers to models like BiT and RoBERTa, and mentions using 'standard learning rates and other hyperparameters recommended by model developers' (Appendix B), but it does not specify software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | Appendix B: Experimental Details provides specific experimental setup details, including seed set sizes (e.g., 'Waterbirds: 100 examples / class'), acquisition step sizes (e.g., 'Waterbirds: 20 examples'), learning rates ('Vision: 1e-4', 'Text: 5e-5'), batch sizes ('Vision: 64', 'Text: 32'), optimizer ('Adam'), weight decay ('1e-4'), and the finetuning termination heuristic ('stop finetuning when the training loss decreases to 0.1% of the original training loss'); see the finetuning sketch below the table. |
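
To make the comparison described in the Research Type row concrete, here is a minimal sketch of uncertainty-based (max-entropy) acquisition alongside a random-sampling baseline. This is not the authors' code (their released implementation is at https://github.com/alextamkin/active-learning-pretrained-models); the function names are hypothetical, and it assumes a PyTorch classifier plus an unlabeled-pool loader that yields `(inputs, pool_indices)` batches.

```python
import torch
import torch.nn.functional as F


def max_entropy_acquire(model, pool_loader, k, device="cpu"):
    """Score each unlabeled pool example by predictive entropy and
    return the indices of the k most uncertain examples.

    Generic uncertainty-sampling sketch; assumes pool_loader yields
    (inputs, pool_indices) batches.
    """
    model.eval()
    entropies, indices = [], []
    with torch.no_grad():
        for xs, idxs in pool_loader:
            probs = F.softmax(model(xs.to(device)), dim=-1)
            ent = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
            entropies.append(ent.cpu())
            indices.append(idxs)
    entropies = torch.cat(entropies)
    indices = torch.cat(indices)
    top = torch.topk(entropies, k).indices
    return indices[top].tolist()


def random_acquire(pool_size, k, generator=None):
    """Random-sampling baseline: draw k pool indices uniformly at random."""
    return torch.randperm(pool_size, generator=generator)[:k].tolist()
```

The paper's comparison amounts to running the same acquisition loop twice per dataset, once with a pretrained backbone and once trained from scratch, and plotting accuracy against the number of acquired labels for each acquisition function.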
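
The Experiment Setup row also quotes a finetuning termination heuristic from Appendix B. The sketch below illustrates one plausible reading of that heuristic in a standard PyTorch training loop: the default learning rate, weight decay, and 0.1% threshold are the values quoted above (vision settings), while the loop structure, epoch cap, and function name are illustrative assumptions rather than the authors' implementation.

```python
import torch


def finetune_until_converged(model, train_loader, device="cpu",
                             lr=1e-4, weight_decay=1e-4, max_epochs=100):
    """Finetune with Adam and stop once the mean training loss drops to
    0.1% of its initial value, mirroring the heuristic quoted from
    Appendix B. Defaults follow the reported vision settings
    (lr 1e-4, weight decay 1e-4); the rest is illustrative.
    """
    model.to(device).train()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr,
                                 weight_decay=weight_decay)
    criterion = torch.nn.CrossEntropyLoss()
    initial_loss = None
    for _ in range(max_epochs):
        epoch_loss, num_batches = 0.0, 0
        for xs, ys in train_loader:
            xs, ys = xs.to(device), ys.to(device)
            optimizer.zero_grad()
            loss = criterion(model(xs), ys)
            loss.backward()
            optimizer.step()
            epoch_loss += loss.item()
            num_batches += 1
        epoch_loss /= max(num_batches, 1)
        if initial_loss is None:
            initial_loss = epoch_loss
        if epoch_loss <= 0.001 * initial_loss:  # 0.1% of original training loss
            break
    return model
```

As noted in the Dataset Splits row, this stopping rule is what lets the paper's procedure avoid a separate validation set for early stopping.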