Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
AutoAL: Automated Active Learning with Differentiable Query Strategy Search
Authors: Yifeng Wang, Xueying Zhan, Siyu Huang
ICML 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that AutoAL consistently achieves superior accuracy compared to all candidate AL algorithms and other selective AL approaches, showcasing its potential for adapting and integrating multiple existing AL methods across diverse tasks and domains. ... We conduct AL experiments on seven datasets: CIFAR-10 and CIFAR-100 (Krizhevsky et al., 2009), SVHN (Netzer et al., 2011), Tiny ImageNet (Le & Yang, 2015) in the natural image domain, and OrganCMNIST, PathMNIST, and TissueMNIST from the MedMNIST database (Yang et al., 2023) in the medical image domain. ... Fig. 2 shows the overall performance comparison of different AL methods, where AutoAL consistently outperforms the baselines across all datasets. ... Additionally, we conduct ablation studies to analyze the contribution of each module and examine the effects of the candidate AL strategies, including different numbers of candidates. |
| Researcher Affiliation | Academia | 1Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, USA 2Ray and Stephanie Lane Computational Biology Department, Carnegie Mellon University, Pittsburgh, USA 3Visual Computing Division, School of Computing, Clemson University, Clemson, USA. Correspondence to: Xueying Zhan <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 AutoAL: Automated Active Learning with Differentiable Query Strategy Search. Input: K candidate algorithms A = {A_κ}, κ ∈ [K]; labeled pool L = {(x_j, y_j)}_{j=1}^{M} and unlabeled pool U = {x_i}_{i=1}^{N}; total number of AL rounds R; batch size b; task model T. Output: task model T. 1: Initialize T 2: for c = 1, ..., R do 3: Optimize Ω_F, Ω_S according to Eq. 1; 4: Calculate S(x_i) by Eq. 6 for all i ≤ N; 5: Select the top-b samples with the highest score S(x_i); 6: Query label y_i for the b selected samples; update L and U; 7: Train task model T using L; 8: end for 9: return T |
| Open Source Code | Yes | Code is available at: https://github.com/haizailache999/AutoAL. |
| Open Datasets | Yes | We conduct AL experiments on seven datasets: CIFAR-10 and CIFAR-100 (Krizhevsky et al., 2009), SVHN (Netzer et al., 2011), Tiny ImageNet (Le & Yang, 2015) in the natural image domain, and OrganCMNIST, PathMNIST, and TissueMNIST from the MedMNIST database (Yang et al., 2023) in the medical image domain. |
| Dataset Splits | Yes | Appendix Table 1 summarizes the datasets used in the experiments (train size / test size / #classes / imbalance ratio). Natural images: CIFAR-10: 50,000 / 10,000 / 10 / 1.0; CIFAR-100: 50,000 / 10,000 / 100 / 1.0; SVHN: 73,257 / 26,032 / 10 / 3.0; Tiny ImageNet: 100,000 / 10,000 / 200 / 1.0. Medical images: OrganCMNIST: 12,975 / 8,216 / 11 / 5.0; PathMNIST: 89,996 / 7,180 / 9 / 1.6; TissueMNIST: 165,466 / 47,280 / 8 / 9.1. |
| Hardware Specification | Yes | In this subsection, we perform additional analysis experiments to showcase AutoAL's efficiency and generalizability. We mainly use the average running time to verify the results, and all the experiments were done on one Nvidia A100 GPU. |
| Software Dependencies | No | For Ω_F and Ω_S, we build the backbone using ResNet-18 (He et al., 2016). ... For the optimization of Ω_S and the loss prediction module, we use the SGD optimizer (Ruder, 2016). For the optimization of Ω_F, we use the Adam optimizer (Kingma, 2014). Both use 0.005 as the learning rate. While training, FitNet will first update for 200 epochs using the validation queue, then Ω_F, Ω_S and the loss prediction module will update iteratively with a total of 400 epochs. All experiments are repeated three times with different randomly selected initial labeled pools, reporting mean and standard deviation. |
| Experiment Setup | Yes | Implementation Details. For Ω_F and Ω_S, we build the backbone using ResNet-18 (He et al., 2016). We also employ ResNet-18 as the classification model on all baselines and AutoAL for fair comparison. Since AutoAL is built upon existing AL strategies and focuses on selecting the optimal strategy, we integrate seven AL methods into AutoAL: Maximum Entropy, Margin Sampling, Least Confidence, KMeans, BALD, VarRatio, and MeanSTD. For the optimization of Ω_S and the loss prediction module, we use the SGD optimizer (Ruder, 2016). For the optimization of Ω_F, we use the Adam optimizer (Kingma, 2014). Both use 0.005 as the learning rate. While training, FitNet will first update for 200 epochs using the validation queue, then Ω_F, Ω_S and the loss prediction module will update iteratively with a total of 400 epochs. All experiments are repeated three times with different randomly selected initial labeled pools, reporting mean and standard deviation. |
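The pseudocode extracted above (Algorithm 1) can be sketched as a minimal active-learning loop. This is not the authors' code: `score_fn` is a hypothetical stand-in for the learned sample score S(x_i) of Eq. 6, and the per-round optimization of Ω_F and Ω_S is reduced to a comment.

```python
# Minimal sketch of the AutoAL outer loop (Algorithm 1).
# Assumptions: `score_fn` stands in for the learned score S(x_i) from Eq. 6;
# the differentiable search over Omega_F / Omega_S and the task-model
# training (steps 3 and 7) are only noted in comments.

def run_al_rounds(unlabeled, labeled, rounds, batch_size, score_fn):
    """Per round: score unlabeled samples, query the top-b, move them to
    the labeled pool, then (re)train the task model on the labeled pool."""
    for _ in range(rounds):
        # Steps 3-4: optimize Omega_F, Omega_S (Eq. 1), then score each
        # unlabeled sample with S(x_i) (Eq. 6).
        scores = {x: score_fn(x) for x in unlabeled}
        # Step 5: select the top-b samples with the highest score.
        batch = sorted(unlabeled, key=lambda x: scores[x], reverse=True)[:batch_size]
        # Step 6: query labels for the selected batch; update L and U.
        for x in batch:
            unlabeled.remove(x)
            labeled.append(x)
        # Step 7: train the task model T on L (omitted in this sketch).
    return labeled
```

With an identity score over integer "samples", two rounds of batch size 3 would move the six highest-scoring items into the labeled pool in score order.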
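The training schedule quoted in the Experiment Setup row (200 warm-up epochs for FitNet, then 400 iterative epochs for Ω_F, Ω_S and the loss prediction module, learning rate 0.005 for both SGD and Adam) can be sketched as a phase generator. The strict per-epoch alternation between Ω_F and the Ω_S/loss-module updates is an assumption; the paper only says the modules "update iteratively".

```python
# Sketch of the reported training schedule. Epoch counts and the 0.005
# learning rate come from the paper; the even/odd alternation between
# phases is an assumed interpretation of "update iteratively".
WARMUP_EPOCHS = 200   # FitNet warm-up on the validation queue
SEARCH_EPOCHS = 400   # iterative Omega_F / Omega_S / loss-module updates
LR = 0.005            # shared learning rate for SGD (Omega_S) and Adam (Omega_F)

def schedule():
    """Yield (phase, epoch_within_phase) pairs in training order."""
    for e in range(WARMUP_EPOCHS):
        yield ("fitnet_warmup", e)
    for e in range(SEARCH_EPOCHS):
        # Assumed alternation of which module group updates this epoch.
        phase = "omega_F" if e % 2 == 0 else "omega_S_and_loss_module"
        yield (phase, e)
```

Iterating the generator yields 600 epochs total, matching the 200 + 400 split reported in the paper.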