Composite Active Learning: Towards Multi-Domain Active Learning with Theoretical Guarantees

Authors: Guang-Yuan Hao, Hengguan Huang, Haotian Wang, Jie Gao, Hao Wang

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our empirical results demonstrate that our approach significantly outperforms the state-of-the-art AL methods on both synthetic and real-world multi-domain datasets.
Researcher Affiliation | Collaboration | (1) Hong Kong University of Science and Technology, (2) Rutgers University, (3) National University of Singapore, (4) Mohamed bin Zayed University of Artificial Intelligence, (5) JD Logistics
Pseudocode | No | The paper does not contain any explicit pseudocode blocks or sections labeled 'Algorithm'.
Open Source Code | Yes | Code is available at https://github.com/Wang-MLLab/multi-domain-active-learning.
Open Datasets | Yes | We use three real-world datasets: Office-Home (65 classes) (Venkateswara et al. 2017), Image CLEF (12 classes) (National Bureau of Statistics 2014), and Office-Caltech (10 classes) (Fernando et al. 2014).
Dataset Splits | No | The paper mentions splitting data into 'training and test sets' for the real-world datasets, and provides specific training and test set sizes for Rotating MNIST. However, it does not explicitly mention or provide details for a separate 'validation' dataset split for any experiment.
Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments (e.g., GPU models, CPU specifications, or cloud computing instance types).
Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x).
Experiment Setup | Yes | We run a model in R = 5 rounds plus an initial round, with three different random seeds, and report the average results over three seeds (see Sec. 1 of the Supplement for more details). In each round, we allocate a labeling budget of 200, 20, and 20, respectively for the three datasets.
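
The round-and-seed protocol quoted under Experiment Setup can be sketched roughly as follows. This is only an illustrative skeleton, not the authors' code: the random query step stands in for the paper's actual selection strategy, the seed values and `pool_size` are hypothetical, and two points are assumptions not confirmed by the quoted text, namely that the initial round uses the same budget as later rounds and that the budgets 200, 20, and 20 map to the three real-world datasets in the order they are listed.

```python
import random
from statistics import mean

ROUNDS = 5          # R = 5 query rounds, as quoted in the setup
SEEDS = (0, 1, 2)   # three seeds; the concrete values are hypothetical
# Per-round labeling budgets. The dataset-to-budget mapping is an assumption
# based on the order in which the datasets are listed.
BUDGETS = {"Office-Home": 200, "Image CLEF": 20, "Office-Caltech": 20}


def run_once(pool_size, budget, seed, rounds=ROUNDS):
    """Return the labeled-set size after the initial round and each query round."""
    rng = random.Random(seed)
    unlabeled = list(range(pool_size))
    labeled = []
    sizes = []
    for _ in range(rounds + 1):  # initial round plus R query rounds
        # Placeholder query: uniform random sampling, NOT the paper's method.
        picked = rng.sample(unlabeled, min(budget, len(unlabeled)))
        picked_set = set(picked)
        unlabeled = [i for i in unlabeled if i not in picked_set]
        labeled.extend(picked)
        # A real run would retrain the model and evaluate it here.
        sizes.append(len(labeled))
    return sizes


def average_over_seeds(pool_size, budget, seeds=SEEDS):
    """Average the per-round values across seeds, as the setup describes."""
    runs = [run_once(pool_size, budget, s) for s in seeds]
    return [mean(per_round) for per_round in zip(*runs)]
```

Under these assumptions, calling `average_over_seeds(5000, BUDGETS["Office-Home"])` yields six values, one for the initial round and one for each of the five query rounds; in a real experiment the tracked quantity would be test accuracy rather than labeled-set size.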