Uncertainty-Aware Self-Training for Low-Resource Neural Sequence Labeling

Authors: Jianing Wang, Chengyu Wang, Jun Huang, Ming Gao, Aoying Zhou

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments over six benchmarks demonstrate that our SeqUST framework effectively improves the performance of self-training, and consistently outperforms strong baselines by a large margin in low-resource scenarios."
Researcher Affiliation | Collaboration | (1) School of Data Science and Engineering, East China Normal University, Shanghai, China; (2) Alibaba Group, Hangzhou, China; (3) KLATASDS-MOE, School of Statistics, East China Normal University, Shanghai, China
Pseudocode | Yes | Algorithm 1: Self-training Procedure of SeqUST
Open Source Code | No | The paper does not explicitly state that the source code is released or provide a link to a code repository.
Open Datasets | Yes | "We choose six widely used benchmarks to evaluate our SeqUST framework, including SNIPS (Coucke et al. 2018) and MultiWOZ (Budzianowski et al. 2018) for slot filling, MIT Movie (Liu et al. 2013b), MIT Restaurant (Liu et al. 2013a), CoNLL-03 (Sang and Meulder 2003) and OntoNotes (Weischedel et al. 2013) for NER."
Dataset Splits | Yes | "For each dataset, we use a greedy-based sampling strategy to randomly select 10-shot labeled data per class for the few-shot labeled training set and validation set, while the remaining data are viewed as unlabeled data."
Hardware Specification | Yes | "We implement our framework in PyTorch 1.8 and use NVIDIA V100 GPUs for experiments."
Software Dependencies | Yes | "We implement our framework in PyTorch 1.8 and use NVIDIA V100 GPUs for experiments."
Experiment Setup | Yes | "For each dataset, we use a greedy-based sampling strategy to randomly select 10-shot labeled data per class for the few-shot labeled training set and validation set, while the remaining data are viewed as unlabeled data. During self-training, the teacher and student model share the same model architecture. In default, we choose BERT-base-uncased (Devlin et al. 2019) from Hugging Face with a softmax layer as the base encoder. We use grid search to search the hyper-parameters. We select five different random seeds for the dataset split and training settings among {12, 21, 42, 87, 100}."
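The paper presents its self-training procedure as Algorithm 1 but does not release code. To make the teacher–student loop concrete, here is a minimal, dependency-free sketch of the general idea of uncertainty-aware pseudo-label selection for token tagging. This is an illustrative approximation, not the authors' algorithm: `predict_with_uncertainty`, the majority-vote disagreement score, and the `threshold` value are all assumptions standing in for the paper's Monte Carlo dropout estimates and Bayesian selection.

```python
def predict_with_uncertainty(model, sentence, n_passes=5):
    """Run several stochastic forward passes (standing in for MC dropout)
    and use per-token disagreement as an uncertainty estimate.
    `model` is any callable mapping a token list to a label list."""
    runs = [model(sentence) for _ in range(n_passes)]
    labels, uncertainties = [], []
    for token_preds in zip(*runs):
        # majority vote across passes gives the pseudo-label
        majority = max(set(token_preds), key=token_preds.count)
        labels.append(majority)
        # fraction of passes disagreeing with the majority = uncertainty
        uncertainties.append(1 - token_preds.count(majority) / len(token_preds))
    return labels, uncertainties

def self_training_round(teacher, unlabeled, threshold=0.2):
    """Keep only pseudo-labeled sentences whose mean token uncertainty is
    below `threshold`; the selected pairs would then train the student."""
    selected = []
    for sentence in unlabeled:
        labels, unc = predict_with_uncertainty(teacher, sentence)
        if sum(unc) / len(unc) < threshold:
            selected.append((sentence, labels))
    return selected
```

A confident (consistent) teacher passes all of its pseudo-labels through, while a teacher whose passes disagree has its sentences filtered out, which is the intuition behind rejecting noisy pseudo-labels in low-resource self-training.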
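The "greedy-based sampling strategy" quoted in the Dataset Splits and Experiment Setup rows is not spelled out in the report. One common reading of greedy K-shot sampling for sequence labeling can be sketched as follows; the function name, the sentence-acceptance rule, and the treatment of "O" tokens are assumptions, not the authors' exact procedure.

```python
from collections import Counter

def greedy_k_shot_sample(sentences, labels, k=10):
    """Greedily scan sentences and keep one only if it contributes to an
    entity class still below k instances, stopping once every class has
    at least k. `labels` holds one tag sequence per sentence; non-entity
    'O' tags do not count toward any class."""
    counts = Counter()
    classes = {tag for seq in labels for tag in seq if tag != "O"}
    chosen = []
    for sent, tags in zip(sentences, labels):
        entity_tags = [t for t in tags if t != "O"]
        if not entity_tags:
            continue
        # accept the sentence only if some contained class is still under-filled
        if any(counts[t] < k for t in entity_tags):
            chosen.append((sent, tags))
            counts.update(entity_tags)
        if all(counts[c] >= k for c in classes):
            break
    return chosen
```

Because sentences carry multiple entity mentions, a greedy pass like this can overshoot k for some classes; the paper selects "10-shot labeled data per class", so k would be 10 for both the training and validation splits.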
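The reported protocol of repeating each experiment over five fixed seeds ({12, 21, 42, 87, 100}) and aggregating results can be sketched as below. The `run_experiment` body is a placeholder, not the authors' training code; a real run would also seed NumPy and call `torch.manual_seed` before building the model.

```python
import random
import statistics

SEEDS = [12, 21, 42, 87, 100]  # the five seeds used for splits and training

def run_experiment(seed):
    """Hypothetical stand-in for one full train/evaluate cycle: seeds the
    RNG so the run is reproducible, then returns a mock F1 score."""
    random.seed(seed)
    return 60.0 + random.random() * 5.0  # placeholder metric, not real results

scores = [run_experiment(s) for s in SEEDS]
mean_f1 = statistics.mean(scores)
std_f1 = statistics.stdev(scores)
print(f"F1 = {mean_f1:.2f} +/- {std_f1:.2f} over {len(SEEDS)} seeds")
```

Fixing the seed list (rather than drawing seeds at random) is what lets the dataset splits and training runs be reproduced exactly across reruns.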