Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Composite Active Learning: Towards Multi-Domain Active Learning with Theoretical Guarantees
Authors: Guang-Yuan Hao, Hengguan Huang, Haotian Wang, Jie Gao, Hao Wang
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical results demonstrate that our approach significantly outperforms the state-of-the-art AL methods on both synthetic and real-world multi-domain datasets. |
| Researcher Affiliation | Collaboration | 1Hong Kong University of Science and Technology 2Rutgers University 3National University of Singapore 4Mohamed bin Zayed University of Artificial Intelligence 5JD Logistics |
| Pseudocode | No | The paper does not contain any explicit pseudocode blocks or sections labeled 'Algorithm'. |
| Open Source Code | Yes | Code is available at https://github.com/Wang-MLLab/multi-domain-active-learning. |
| Open Datasets | Yes | We use three real-world datasets: Office-Home (65 classes) (Venkateswara et al. 2017), Image CLEF (12 classes) (National Bureau of Statistics 2014), and Office-Caltech (10 classes) (Fernando et al. 2014). |
| Dataset Splits | No | The paper mentions splitting data into 'training and test sets' for the real-world datasets, and provides specific training and test set sizes for Rotating MNIST. However, it does not explicitly mention or provide details for a separate 'validation' dataset split for any experiment. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments (e.g., GPU models, CPU specifications, or cloud computing instance types). |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x). |
| Experiment Setup | Yes | We run a model in R = 5 rounds plus an initial round, with three different random seeds, and report the average results over three seeds (see Sec. 1 of the Supplement for more details). In each round, we allocate a labeling budget of 200, 20, and 20, respectively for the three datasets. |
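
The protocol quoted in the Experiment Setup row (R = 5 rounds plus an initial round, three seeds, per-round budgets of 200, 20, and 20) can be sketched as simple bookkeeping. This is a minimal illustration, not the authors' code: the seed values, the size of the initial labeled set, and the helper names `cumulative_labels` and `average_over_seeds` are all assumptions, since the quoted text does not specify them.

```python
from statistics import mean

ROUNDS = 5               # R = 5 query rounds, as quoted in the table
SEEDS = [0, 1, 2]        # three seeds; the actual seed values are an assumption
BUDGETS = [200, 20, 20]  # per-round labeling budgets, in the order quoted


def cumulative_labels(budget, rounds=ROUNDS, initial=None):
    """Number of labels held after the initial round and each query round.

    Assumption: the initial round also labels `budget` examples unless
    `initial` is given; the quoted text does not state the initial size.
    """
    start = budget if initial is None else initial
    return [start + r * budget for r in range(rounds + 1)]


def average_over_seeds(run_fn, seeds=SEEDS):
    """Average per-round metrics across seeds.

    `run_fn(seed)` is a hypothetical callable that returns one metric
    value per round for a single seeded run.
    """
    runs = [run_fn(seed) for seed in seeds]
    return [mean(values) for values in zip(*runs)]


if __name__ == "__main__":
    for budget in BUDGETS:
        print(budget, cumulative_labels(budget))
```

Passing `initial=` explicitly makes the unknown initial-set size visible at the call site rather than hiding it in the default.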