Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Enhancing Cost Efficiency in Active Learning with Candidate Set Query

Authors: Yeho Gwon, Sehyun Hwang, Hoyoung Kim, Jungseul Ok, Suha Kwak

TMLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Empirical evaluations on CIFAR-10, CIFAR-100, and ImageNet64x64 demonstrate the effectiveness and scalability of our framework. Notably, it reduces labeling cost by 48% on ImageNet64x64. The project page can be found at https://yehogwon.github.io/csq-al. We verify the effectiveness and generalizability of CSQ through extensive experiments with varying datasets, acquisition functions, and budgets.
Researcher Affiliation	Academia	Yeho Gwon EMAIL Department of Computer Science and Engineering POSTECH; Sehyun Hwang EMAIL Department of Computer Science and Engineering POSTECH; Hoyoung Kim EMAIL Graduate School of Artificial Intelligence POSTECH; Jungseul Ok EMAIL Graduate School of Artificial Intelligence POSTECH; Suha Kwak EMAIL Graduate School of Artificial Intelligence POSTECH
Pseudocode	Yes	Algorithm 1 Cost-efficient active learning with candidate set query
Open Source Code	No	The paper states: "The project page can be found at https://yehogwon.github.io/csq-al." This is a project page, which is considered a demonstration or overview page, not a direct link to a source-code repository containing the methodology's implementation.
Open Datasets	Yes	We use three image classification datasets: CIFAR-10 (Krizhevsky et al., 2009), CIFAR-100 (Krizhevsky et al., 2009), and ImageNet64x64 (Chrabaszcz et al., 2017). The R52 dataset (Lewis, 1997) is a subset of the Reuters-21578 (Lewis, 1997) news collection.
Dataset Splits	Yes	CIFAR-10 comprises 50K training and 10K validation images across 10 classes. CIFAR-100 contains the same number of images as CIFAR-10, but is associated with 100 classes. ImageNet64x64... consists of 1.2M training and 50K validation images with 1000 classes. In the initial round, we randomly sample 1K images for CIFAR-10, 5K images for CIFAR-100, and 60K images for ImageNet64x64. We set the size of the calibration dataset ncal to 500 for CIFAR-10 and CIFAR-100, and 5K for ImageNet64x64.
Hardware Specification	Yes	We trained our classification model on CIFAR-10 and CIFAR-100 using NVIDIA RTX 3090 and on ImageNet64x64 using 4 NVIDIA A100 GPUs in parallel.
Software Dependencies	No	The paper mentions various software components and models such as ResNet18, AdamW, Mix-up, WRN-36-5, SVM classifier, TF-IDF, and RoBERTa-Large, but does not provide specific version numbers for any of these software dependencies.
Experiment Setup	Yes	For CIFAR-10 and CIFAR-100, we adopt ResNet18 (He et al., 2016) as a classification model. We train it for 200 epochs using AdamW (Loshchilov & Hutter, 2019) optimizer with an initial learning rate of 1e-3, decreasing by a factor of 0.2 at epochs 60, 120, and 160. We apply a weight decay of 5e-4 and a data augmentation consisting of random crop, random horizontal flip, and random rotation. For ImageNet64x64, we adopt WRN-36-5 (Zagoruyko, 2016), and train it for 30 epochs using AdamW optimizer with an initial learning rate of 8e-3. We apply a learning rate warm-up for 10 epochs from 2e-3. After the warm-up, we decay the learning rate by a factor of 0.2 every 10 epochs. We adopt random horizontal flip and random translation as data augmentation. For all the datasets, we use Mix-up (Zhang et al., 2018), where a mixing ratio is sampled from Beta(1, 1). We set the size of the calibration dataset ncal to 500 for CIFAR-10 and CIFAR-100, and 5K for ImageNet64x64. For all datasets and acquisition functions, hyperparameter d in Eq. (8) is set to 0.3.
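
The learning-rate schedules quoted in the Experiment Setup row can be sketched in plain Python. This is an illustrative reading of the quoted text, not the authors' code: the function names are hypothetical, the warm-up is assumed to be linear, and each decay is assumed to take effect at the start of the stated epoch (0-indexed). The mix-up helper only draws the mixing ratio from Beta(1, 1), which is the uniform distribution on [0, 1].

```python
import random


def cifar_lr(epoch, base_lr=1e-3, milestones=(60, 120, 160), gamma=0.2):
    """Step decay for CIFAR-10/100: multiply by gamma at each milestone reached.

    Assumption: the decay applies from the milestone epoch onward (0-indexed).
    """
    reached = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** reached


def imagenet_lr(epoch, peak_lr=8e-3, warmup_start=2e-3,
                warmup_epochs=10, step=10, gamma=0.2):
    """Warm-up followed by step decay for ImageNet64x64.

    Assumptions: the warm-up ramps linearly from warmup_start to peak_lr,
    and the first decay happens `step` epochs after the warm-up ends.
    """
    if epoch < warmup_epochs:
        frac = epoch / warmup_epochs
        return warmup_start + frac * (peak_lr - warmup_start)
    return peak_lr * gamma ** ((epoch - warmup_epochs) // step)


def sample_mixup_ratio(rng):
    """Draw a mix-up ratio from Beta(1, 1), i.e. uniform on [0, 1]."""
    return rng.betavariate(1.0, 1.0)


if __name__ == "__main__":
    for e in (0, 59, 60, 120, 160, 199):
        print("CIFAR epoch", e, "lr", cifar_lr(e))
    for e in (0, 5, 10, 19, 20, 29):
        print("ImageNet64x64 epoch", e, "lr", imagenet_lr(e))
    print("mix-up ratio", sample_mixup_ratio(random.Random(0)))
```

Under these assumptions the 30-epoch ImageNet64x64 run decays only once after the warm-up, at epoch 20. If the paper counts the decay interval from epoch 0 instead, the boundaries would shift accordingly.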