reproducibilityindex.ai

Active Learning on Pre-trained Language Model with Task-Independent Triplet Loss

Authors: Seungmin Seo, Donghyun Kim, Youbin Ahn, Kyong-Ho Lee
Pages: 11276-11284

AAAI 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	To assess the effectiveness of the proposed method, we compare the proposed method with state-of-the-art active learning methods on two tasks, relation extraction and sentence classification. Experimental results show that our method outperforms baselines on the benchmark datasets.
Researcher Affiliation	Academia	Seungmin Seo, Donghyun Kim, Youbin Ahn, and Kyong-Ho Lee Department of Computer Science, Yonsei University, Seoul, Republic of Korea
Pseudocode	Yes	Algorithm 1: Active learning with BATL
Open Source Code	No	The paper does not provide any explicit statement or link for open-source code availability.
Open Datasets	Yes	For relation extraction, we used two publicly accessible datasets, NYT-10 (Riedel, Yao, and McCallum 2010) and Wiki-KBP (Ellis et al. 2013). ... For sentence classification, we used two benchmark datasets, AG News (Zhang, Zhao, and LeCun 2015) and PubMed (Dernoncourt and Lee 2017).
Dataset Splits	No	Table 1 provides 'Train' and 'Test' splits with explicit numbers for each dataset (e.g., NYT-10: Train 522,611, Test 172,448), but it does not explicitly state a separate 'validation' dataset split.
Hardware Specification	Yes	The experiments are performed on GeForce RTX 2080 Ti and AMD Ryzen 7 3700X CPUs.
Software Dependencies	No	The paper mentions models and frameworks like GPT, BERT, SCIBERT, but does not provide specific version numbers for any software dependencies or libraries.
Experiment Setup	Yes	We evaluated sampling strategies on the relation extraction with varying batch size K = {500, 2000} for NYT-10, and K = {50, 200} for Wiki-KBP. We set the batch size K = 100 for sentence classification. The learning rate is 2e-5, and scaling parameter λ = 1.
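
The reported setup could be captured in a small configuration object along the following lines. This is only a sketch: the dictionary layout, the key names, and the `iter_runs` helper are hypothetical and do not come from the paper, which releases no code. Only the numeric values (batch sizes K, the learning rate read as 2e-5, and λ = 1) and the dataset names are taken from the table above.

```python
# Hypothetical config layout; only the values are taken from the table above.
EXPERIMENT_SETUP = {
    "relation_extraction": {
        "NYT-10": {"batch_sizes": [500, 2000]},
        "Wiki-KBP": {"batch_sizes": [50, 200]},
    },
    "sentence_classification": {
        "AG News": {"batch_sizes": [100]},
        "PubMed": {"batch_sizes": [100]},
    },
    "learning_rate": 2e-5,  # reported learning rate
    "lambda_scale": 1.0,    # reported scaling parameter (lambda)
}


def iter_runs(setup):
    """Yield (task, dataset, batch_size) for every listed configuration."""
    for task in ("relation_extraction", "sentence_classification"):
        for dataset, cfg in setup[task].items():
            for k in cfg["batch_sizes"]:
                yield task, dataset, k


runs = list(iter_runs(EXPERIMENT_SETUP))
for task, dataset, k in runs:
    print(f"{task:24s} {dataset:10s} K={k}")
```

Enumerating the runs this way makes it easy to see that the table implies six dataset and batch-size combinations. Anything not listed there (number of active learning rounds, random seeds, optimizer) would still have to be taken from the paper itself.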