Active Learning for Multiple Target Models

Authors: Ying-Peng Tang, Sheng-Jun Huang

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The experimental results on the OCR benchmarks show that the proposed method can significantly surpass the traditional active and passive learning methods under this challenging setting.
Researcher Affiliation | Academia | College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics; Collaborative Innovation Center of Novel Software Technology and Industrialization; MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing 211106, China. {tangyp,huangsj}@nuaa.edu.cn
Pseudocode | Yes | Algorithm 1: The DIAM-online Algorithm
Open Source Code | No | The paper references a GitHub link (https://github.com/mit-han-lab/once-for-all), but it points to the third-party OFA models used in the experiments, not to the authors' own implementation of the proposed method.
Open Datasets | Yes | Two commonly used handwriting character classification benchmarks are employed in the experiments, i.e., the MNIST [19] and Kuzushiji-MNIST [8] datasets. They are under the CC BY-SA 3.0 and CC BY-SA 4.0 licenses, respectively.
Dataset Splits | No | The paper states "we randomly take 3,000 training data as our initially labeled data, and the rest as the unlabeled pool," which describes the initial active learning setup but does not specify a separate validation set or its split details.
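The labeled/unlabeled split described above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the function name and seed are ours, and 60,000 is the standard training-set size of both MNIST and Kuzushiji-MNIST.

```python
import numpy as np

def split_initial_pool(n_train, n_labeled=3000, seed=0):
    """Randomly pick n_labeled indices as the initially labeled set;
    the remaining indices form the unlabeled pool (as described in the paper).
    Hypothetical helper -- the paper does not publish its splitting code."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_train)
    return perm[:n_labeled], perm[n_labeled:]

# MNIST and Kuzushiji-MNIST each provide 60,000 training images
labeled_idx, unlabeled_idx = split_initial_pool(60000)
```

Because the paper reports no fixed seed or validation split, any reproduction should expect run-to-run variation from this random partition.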
Hardware Specification | No | The paper discusses target deployment devices (e.g., Samsung S7 Edge, Note8, Note10) for the models but does not specify the hardware used to run the experiments or train the models.
Software Dependencies | No | The paper mentions software components like the SGD optimizer and models like MobileNet V3 but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | For the model training, we mainly follow the training configuration of OFA; the hyperparameters are set to the default values in the project. For example, the learning rate is 7.5e-3, the batch size is 128, and the SGD optimizer is employed with momentum 0.9. Since the initially labeled data is limited, a small number of training epochs is used to avoid over-fitting: the models are initialized with weights pretrained on the ImageNet dataset and then fine-tuned for 20 epochs on the labeled data.
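The optimizer settings reported above (SGD with momentum 0.9, learning rate 7.5e-3, batch size 128, 20 epochs) can be made concrete with a minimal sketch of the SGD-with-momentum update rule. The helper function is ours for illustration; the paper itself relies on the OFA project's training code.

```python
import numpy as np

# Hyperparameters as reported in the paper (OFA project defaults)
LR = 7.5e-3
MOMENTUM = 0.9
BATCH_SIZE = 128
EPOCHS = 20

def sgd_momentum_step(w, grad, velocity, lr=LR, momentum=MOMENTUM):
    """One SGD-with-momentum update:
        v <- momentum * v + grad
        w <- w - lr * v
    Illustrative helper, not the authors' implementation."""
    velocity = momentum * velocity + grad
    return w - lr * velocity, velocity

# Single scalar step from w=1.0 with gradient 2.0 and zero initial velocity
w, v = sgd_momentum_step(1.0, 2.0, 0.0)
```

In practice these values would be passed to the training framework's SGD optimizer rather than applied by hand; the sketch only pins down what "momentum 0.9, learning rate 7.5e-3" means for a single parameter update.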