A Simple Yet Powerful Deep Active Learning With Snapshot Ensembles
Authors: Seohyeon Jung, Sanghyun Kim, Juho Lee
ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through an extensive empirical comparison, we demonstrate the effectiveness of snapshot ensembles for deep active learning. Our code is available at: https://github.com/nannullna/snapshot-al. In this section, through an extensive empirical comparison on three image classification benchmarks (CIFAR-10, CIFAR-100 (Krizhevsky et al., 2009), and Tiny ImageNet (Le and Yang, 2015)), we would like to demonstrate the following: We compare the AL algorithms along three variables: acquisition functions (VR, ME, BALD, MAR), algorithms to measure uncertainty (DE, SE, MCDO), and how to train the classifier in the final episode (a single model via vanilla SGD, or DE). We report the results with ResNet-18 (He et al., 2016). Please refer to Appendix B for more details, such as experimental protocols or hyperparameter settings. The test accuracy results on CIFAR-10, CIFAR-100, and Tiny ImageNet are summarized in Table 1, Table 2, and Table 3, respectively, according to the proportion of labeled examples. (A hedged sketch of the four acquisition scores appears below the table.) |
| Researcher Affiliation | Academia | Seohyeon Jung, Sanghyun Kim, Juho Lee; Kim Jaechul Graduate School of AI, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea; {heon2203,nannullna,juholee}@kaist.ac.kr |
| Pseudocode | Yes | Algorithm 1 summarizes our SE-based AL algorithm, where the final classifier is obtained with vanilla Stochastic Gradient Descent (SGD), but DE can be applied instead. Algorithm 2 summarizes the AL with fine-tuning. The parts that differ from the AL without fine-tuning are marked in blue. (A sketch of this loop, under assumed interfaces, follows the table.) |
| Open Source Code | Yes | Our code is available at: https://github.com/nannullna/snapshot-al. We will provide an open-source implementation of AL environments and our code of SE and SE + FT algorithms. |
| Open Datasets | Yes | In this section, through an extensive empirical comparison on three image classification benchmarks (CIFAR-10, CIFAR-100 (Krizhevsky et al., 2009), and Tiny ImageNet (Le and Yang, 2015)), we would like to demonstrate the following: |
| Dataset Splits | No | The paper mentions "test accuracy results" and "proportion of labeled examples" used for training, but it does not explicitly specify the size, percentage, or methodology for a distinct validation split used in their experiments. While standard datasets often have predefined validation sets, the paper itself does not detail its use or configuration. |
| Hardware Specification | Yes | Reported runtimes are based on an Ubuntu 20.04 server with an AMD Ryzen 9 5900X CPU and 64GB RAM, as well as an NVIDIA RTX-3090 GPU with 24GB VRAM. |
| Software Dependencies | No | We used the PyTorch (Paszke et al., 2019) library in our experiments and algorithms, which are described in Algorithm 1 and Algorithm 2. This statement mentions PyTorch but does not provide a specific version number for the library used in the experiments. |
| Experiment Setup | Yes | Please refer to Appendix B for more details, such as experimental protocols or hyperparameter settings. We used a standard SGD optimizer with the following hyperparameters for both the CIFAR-10 and CIFAR-100 datasets: a base learning rate of 0.001, momentum of 0.9, and weight decay of 0.01. The mini-batch size was set to 64 for CIFAR-10 and 128 for CIFAR-100. During SE, we raised the learning rate to 0.01 for CIFAR-10 or dropped it to 0.0001 for CIFAR-100. Table 8: Summary of hyperparameters (Optimizer, Base lr, Momentum, Weight decay, Scheduler, SE lr, Max epoch, SE epochs, # snapshots for various datasets and architectures). (A minimal optimizer/scheduler configuration sketch using these values follows the table.) |
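
The acquisition functions compared in the Research Type row (VR, ME, BALD, MAR) can all be computed from the stacked softmax outputs of an ensemble. Below is a minimal PyTorch sketch, not the authors' implementation: the function name `acquisition_scores` and the `[M, N, C]` layout (ensemble members or snapshots, pool examples, classes) are our own assumptions.

```python
import torch

def acquisition_scores(probs: torch.Tensor, eps: float = 1e-12) -> dict:
    """Ensemble-based uncertainty scores from per-member softmax outputs of shape [M, N, C]."""
    mean_p = probs.mean(dim=0)                                # [N, C] averaged predictive distribution

    # Variation ratio (VR): 1 - fraction of members voting for the ensemble's predicted class.
    votes = probs.argmax(dim=-1)                              # [M, N]
    predicted = mean_p.argmax(dim=-1)                         # [N]
    vr = 1.0 - (votes == predicted).float().mean(dim=0)

    # Max entropy (ME): entropy of the averaged predictive distribution.
    me = -(mean_p * (mean_p + eps).log()).sum(dim=-1)

    # BALD: predictive entropy minus the mean per-member entropy (mutual information).
    member_ent = -(probs * (probs + eps).log()).sum(dim=-1)   # [M, N]
    bald = me - member_ent.mean(dim=0)

    # Margin (MAR): gap between the top-2 averaged probabilities, negated so that
    # larger values mean more uncertain, as for the other scores.
    top2 = mean_p.topk(2, dim=-1).values                      # [N, 2]
    mar = -(top2[:, 0] - top2[:, 1])

    return {"VR": vr, "ME": me, "BALD": bald, "MAR": mar}
```

Each score ranks the unlabeled pool so that the highest-scoring examples are queried first; the paper's exact definitions should be checked against the released code.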
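
The Pseudocode row describes Algorithm 1, the SE-based AL loop. The sketch below is one possible reading of that loop under assumed interfaces: `train_one_cycle`, `oracle.label`, and the `labeled_set` / `pool_set` operations are hypothetical helpers rather than functions from the nannullna/snapshot-al repository, `acquisition_scores` is the sketch above, and the final-classifier step (vanilla SGD or DE) is omitted.

```python
import copy
import torch

def snapshot_ensemble_al(model, labeled_set, pool_set, oracle,
                         n_episodes, query_size, n_snapshots, epochs_per_cycle):
    for episode in range(n_episodes):
        # (1) Train on the current labeled set with a cyclic learning rate,
        #     keeping one snapshot at the end of each cycle.
        snapshots = []
        for cycle in range(n_snapshots):
            train_one_cycle(model, labeled_set, epochs_per_cycle)      # hypothetical helper
            snapshots.append(copy.deepcopy(model).eval())

        # (2) Score every unlabeled example with the snapshot ensemble.
        pool_inputs = pool_set.tensors()                               # hypothetical accessor
        with torch.no_grad():
            probs = torch.stack(
                [torch.softmax(m(pool_inputs), dim=-1) for m in snapshots]
            )                                                          # [M, N, C]
        scores = acquisition_scores(probs)["BALD"]                     # or "VR" / "ME" / "MAR"

        # (3) Query the top-scoring examples, label them via the oracle,
        #     and move them from the pool to the labeled set.
        query_idx = scores.topk(query_size).indices
        labels = oracle.label(query_idx)                               # hypothetical oracle call
        labeled_set.add(pool_set.take(query_idx), labels)              # hypothetical set operations

    # The paper then trains the final classifier with vanilla SGD (or DE); omitted here.
    return model
```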
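
The hyperparameters quoted in the Experiment Setup row translate into a small amount of configuration code. The sketch below uses the CIFAR-10 values (SGD, base lr 0.001, momentum 0.9, weight decay 0.01, SE lr raised to 0.01; the mini-batch size of 64 would be set in the DataLoader). The choice of `CosineAnnealingWarmRestarts` as the cyclic snapshot schedule and the 10-epoch cycle length are our assumptions; the paper's Table 8 lists the actual scheduler, SE epochs, and snapshot counts.

```python
from torch import optim
from torchvision.models import resnet18

model = resnet18(num_classes=10)

# Base training phase: plain SGD at the base learning rate (CIFAR-10 values).
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9, weight_decay=0.01)

# Snapshot-ensemble (SE) phase: raise the learning rate to 0.01 for CIFAR-10 and
# anneal it within each cycle; one snapshot is saved at the end of every cycle.
for group in optimizer.param_groups:
    group["lr"] = 0.01

epochs_per_cycle = 10   # assumed for illustration; see Table 8 in the paper's appendix
scheduler = optim.lr_scheduler.CosineAnnealingWarmRestarts(optimizer, T_0=epochs_per_cycle)

# Typical per-epoch usage: train one epoch, call scheduler.step(), and
# copy.deepcopy(model) whenever (epoch + 1) % epochs_per_cycle == 0 to collect a snapshot.
```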