TrustAL: Trustworthy Active Learning Using Knowledge Distillation

Authors: Beong-woo Kwak, Youngwook Kim, Yu Jin Kim, Seung-won Hwang, Jinyoung Yeo (pp. 7263-7271)

AAAI 2022

Reproducibility Variable Result LLM Response
Research Type Experimental Our experiments show that the TrustAL framework significantly improves performance with various data acquisition strategies while preserving valuable knowledge from the labeled dataset. We validate that the pseudo labels from the predecessor models are not just approximate/weak predictions; they can be viewed as knowledge from the previous generation and used as consistency regularization for conventional AL methods that aim solely at higher accuracy.
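The consistency-regularization idea described in the response above, distilling a predecessor model's soft predictions into its successor, is commonly realized as a combined loss: cross-entropy on the gold label plus a KL term toward the teacher's distribution. The sketch below is a generic, minimal illustration in plain Python; the function names, the mixing weight `lam`, and the `temperature` parameter are illustrative assumptions, not the paper's exact formulation.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw logits into a probability distribution."""
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kd_consistency_loss(student_logits, teacher_logits, label, lam=0.5, temperature=1.0):
    """Cross-entropy on the gold label, mixed with a KL term that pulls
    the student's distribution toward the predecessor (teacher) model's
    soft predictions; lam trades accuracy against consistency."""
    p_student = softmax(student_logits, temperature)
    p_teacher = softmax(teacher_logits, temperature)
    ce = -math.log(p_student[label])
    kl = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student))
    return (1 - lam) * ce + lam * kl
```

With `lam=0` this reduces to plain cross-entropy, i.e., a conventional AL objective with no knowledge transferred from the predecessor generation.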
Researcher Affiliation Academia 1 Department of Artificial Intelligence, Yonsei University 2 Department of Computer Science, Yonsei University 3 Department of Computer Science and Engineering, Seoul National University
Pseudocode Yes Algorithm 1: Conventional AL procedure; Algorithm 2: TrustAL
Open Source Code No The paper does not provide an explicit statement about releasing source code or a link to a repository for the described methodology.
Open Datasets Yes Dataset We use three text classification datasets, TREC (Roth et al. 2002), Movie review (Pang and Lee 2005) and SST-2 (Socher et al. 2013), which are widely used in AL (Lowell, Lipton, and Wallace 2018; Siddhant and Lipton 2018; Yuan, Lin, and Boyd-Graber 2020) and statistically diverse.
Dataset Splits No The paper mentions using a 'fixed development dataset Ddev' and reports 'Validation accuracy', but does not provide specific percentages or counts for training, validation, and test splits in the main text.
Hardware Specification No The paper does not provide specific details about the hardware used for running the experiments, such as GPU or CPU models, or memory specifications.
Software Dependencies No The paper mentions 'Bi LSTM' as the base model architecture but does not specify software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup Yes For all three datasets, we follow the commonly used default settings in AL for text classification (Liu et al. 2021; Zhou et al. 2021; Lowell, Lipton, and Wallace 2018; Siddhant and Lipton 2018): BiLSTM (Hochreiter and Schmidhuber 1997) is adopted as the base model architecture; in each iteration of AL, a classification model is trained from scratch (not incrementally) on all labeled samples gathered so far, to avoid the training issues with warm-starting (Ash and Adams 2020). Note that the development set is held out in all experiments so that it is not used for training models. We describe our implementation details in Appendix C. We report average values over five random seeds.
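The cold-start protocol quoted above (each AL iteration retrains from scratch on all labels gathered so far, rather than updating incrementally) can be sketched as a generic pool-based loop. The function names and stub strategies below are illustrative assumptions for exposition, not the authors' implementation.

```python
def active_learning_loop(pool, oracle, train_from_scratch, acquire, rounds=5, batch=10):
    """Generic pool-based AL loop. Each round: (1) pick `batch` unlabeled
    samples via the acquisition strategy, (2) query the oracle for labels,
    (3) retrain a fresh model on ALL labels gathered so far (cold start,
    avoiding warm-starting issues)."""
    labeled = []
    unlabeled = list(pool)
    model = None  # no model exists before the first round
    for _ in range(rounds):
        picked = acquire(model, unlabeled, batch)
        for x in picked:
            labeled.append((x, oracle(x)))
            unlabeled.remove(x)
        model = train_from_scratch(labeled)  # not incremental
    return model, labeled
```

Any acquisition strategy (random, uncertainty, etc.) plugs in as `acquire`; the loop itself stays unchanged, which is why the quoted setup can compare data acquisition strategies under one protocol.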