Active Learning for Multiple Target Models
Authors: Ying-Peng Tang, Sheng-Jun Huang
NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results on the OCR benchmarks show that the proposed method can significantly surpass the traditional active and passive learning methods under this challenging setting. |
| Researcher Affiliation | Academia | College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics; Collaborative Innovation Center of Novel Software Technology and Industrialization; MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing 211106, China; {tangyp,huangsj}@nuaa.edu.cn |
| Pseudocode | Yes | Algorithm 1 The DIAM-online Algorithm |
| Open Source Code | No | The paper references a GitHub link (https://github.com/mit-han-lab/once-for-all), but this points to the third-party OFA models used in the experiments, not to the authors' own implementation of the proposed method. |
| Open Datasets | Yes | two commonly used hand-writing characters classification benchmarks are employed in our experiments, i.e., the MNIST [19] and Kuzushiji MNIST [8] datasets. They are under the CC BY-SA 3.0 and CC BY-SA 4.0 licenses, respectively. |
| Dataset Splits | No | The paper states 'we randomly take 3,000 training data as our initially labeled data, and the rest as the unlabeled pool,' which describes the initial active-learning setup (see the split sketch after the table) but does not specify a separate validation set or its split details. |
| Hardware Specification | No | The paper discusses target deployment devices (e.g., Samsung S7 Edge, Note8, Note10) for the models but does not specify the hardware used to run the experiments or train the models. |
| Software Dependencies | No | The paper mentions software components like the 'SGD optimizer' and models like 'MobileNetV3' but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | For the model training, we mainly follow the training configs of OFA. Specifically, the hyperparameters are set to the default values in the project. For example, the learning rate is set to 7.5e-3, the batch size is 128, and the SGD optimizer is employed with momentum 0.9. Since the initially labeled data is limited, a small number of training epochs is used to avoid over-fitting. Specifically, we employ the pretrained weights on the ImageNet dataset for initialization, then finetune for 20 epochs on the labeled data. A training-config sketch based on these values follows the table. |
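
For concreteness, here is a minimal sketch of the initial active-learning split described in the Dataset Splits row: 3,000 randomly selected training examples form the initially labeled set and the remaining training examples form the unlabeled pool. The use of torchvision, the fixed seed, and the variable names are illustrative assumptions, not the authors' released code.

```python
# Sketch of the initial labeled/unlabeled split (3,000 labeled, rest unlabeled pool).
# torchvision, the seed value, and variable names are assumptions for illustration.
import numpy as np
from torchvision import datasets

train_set = datasets.MNIST(root="./data", train=True, download=True)

rng = np.random.default_rng(seed=0)      # seed chosen arbitrarily for the sketch
perm = rng.permutation(len(train_set))   # shuffle all training indices
labeled_idx = perm[:3000]                # 3,000 initially labeled examples
unlabeled_idx = perm[3000:]              # remaining examples form the unlabeled pool
```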
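
The Experiment Setup row can likewise be read as the following hedged fine-tuning sketch (learning rate 7.5e-3, batch size 128, SGD with momentum 0.9, ImageNet-pretrained initialization, 20 epochs). The torchvision MobileNetV3 backbone, the data loader, and the loss function are stand-ins assumed for illustration; the paper fine-tunes OFA sub-networks instead.

```python
# Hedged sketch of the fine-tuning configuration quoted above. The backbone,
# loader, and loss are assumptions; only the hyperparameter values come from
# the paper's description (OFA defaults: lr 7.5e-3, batch 128, momentum 0.9,
# ImageNet-pretrained weights, 20 epochs).
import torch
import torch.nn as nn
from torchvision import models

model = models.mobilenet_v3_small(weights="IMAGENET1K_V1")  # ImageNet-pretrained init (assumed backbone)
model.classifier[-1] = nn.Linear(model.classifier[-1].in_features, 10)  # 10 character classes

optimizer = torch.optim.SGD(model.parameters(), lr=7.5e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()

def finetune(model, labeled_loader, epochs=20):
    """Fine-tune on the currently labeled data for a small number of epochs.

    labeled_loader is assumed to yield batches of size 128 with grayscale
    inputs already replicated to 3 channels for the ImageNet-pretrained model.
    """
    model.train()
    for _ in range(epochs):
        for images, labels in labeled_loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
```

In an active-learning loop, a call like `finetune(model, labeled_loader)` would follow each labeling round, retraining on the updated labeled set.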