PMRC: Prompt-Based Machine Reading Comprehension for Few-Shot Named Entity Recognition
Authors: Jin Huang, Danfeng Yan, Yuanqiang Cai
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that our approach outperforms state-of-the-art models in low-resource settings, achieving an average performance improvement of +5.2% in settings where access to source domain data is limited. |
| Researcher Affiliation | Academia | Jin Huang, Danfeng Yan*, Yuanqiang Cai Beijing University of Posts and Telecommunications Xitucheng Road 10, Beijing, China jinhuang@bupt.edu.cn, yandf@bupt.edu.cn, caiyuanqiang@bupt.edu.cn |
| Pseudocode | No | The paper does not contain any clearly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present any structured steps formatted like code. |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We evaluated the performance of PMRC in the rich-resource setting on the CoNLL03 (Sang and Meulder 2003) dataset. ... Specifically, we employed the CoNLL03 dataset, a general domain dataset, as a resource-rich source domain data. We randomly sampled a subset of training instances from the MIT Movie (Liu et al. 2013), MIT Restaurant (Liu et al. 2013), and ATIS (Hakkani-Tür et al. 2016) datasets to serve as the training data for the target domain. |
| Dataset Splits | No | To ensure a truly low-resource setting, we eliminate the validation set setting used in previous works (e.g., TemplateNER (Cui et al. 2021), LightNER (Chen et al. 2022a)). This means that our model is trained on a small training set for a certain number of steps and directly tested on the test set, making our scenario more closely resemble real-world conditions. For the rich-resource setting, the paper mentions 'evaluation was conducted after 20 epochs. We selected the model with the best performance on the validation set and evaluated it on the test set,' but does not specify the validation split size or percentage. |
| Hardware Specification | Yes | The experiments were conducted using PyTorch on a single Nvidia 3090 GPU. |
| Software Dependencies | No | The paper mentions 'PyTorch' as the framework used, but does not provide specific version numbers for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | If no specific model is mentioned, we use BERT-base-uncased (Devlin et al. 2019) as the backbone model. ... All optimizations were performed using the AdamW optimizer with a linear warmup schedule in the Standard Supervised Setting and a cosine with restarts schedule in the few-shot scenarios. The focal loss function was used with α = 0.25 and γ = 2. Additionally, a weight decay of 0.01 was applied to all non-bias parameters. Rich-Resource Setting: We fixed the batch size at 16 and set the learning rate to 2e-5. The model was trained for 30 epochs, and evaluation was conducted after 20 epochs. Low-Resource Setting: For all experiments, we fixed the batch size at 4, and the model was trained for 30 epochs before being directly tested. The learning rate is set to 1e-4 for the MIT Movie (Liu et al. 2013) and ATIS (Hakkani-Tür et al. 2016) datasets, and 5e-5 for the MIT Restaurant (Liu et al. 2013) dataset. (A sketch of this optimization setup is given after the table.) |
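
For reference, below is a minimal PyTorch sketch of the optimization setup reported in the Experiment Setup row. The focal-loss parameters (α = 0.25, γ = 2), the weight decay of 0.01 on non-bias parameters, and the choice of schedules come from the paper; the warmup ratio, the number of restart cycles, the binary-target formulation of the loss, and the `build_optimizer_and_scheduler` helper are illustrative assumptions, not details confirmed by the paper.

```python
# Sketch of the reported training configuration; assumed details are marked below.
import torch
import torch.nn.functional as F
from transformers import (
    get_linear_schedule_with_warmup,
    get_cosine_with_hard_restarts_schedule_with_warmup,
)


def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """Binary focal loss with the paper's alpha = 0.25 and gamma = 2.

    The binary (span-level) target formulation is an assumption; the paper
    only states the loss function and its hyperparameters.
    """
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = torch.exp(-bce)  # probability assigned to the true class
    return (alpha * (1.0 - p_t) ** gamma * bce).mean()


def build_optimizer_and_scheduler(model, total_steps, lr=2e-5,
                                  warmup_ratio=0.1, few_shot=False):
    # Weight decay of 0.01 on all non-bias parameters, as reported.
    # Excluding LayerNorm weights from decay is a common convention (assumed).
    decay, no_decay = [], []
    for name, param in model.named_parameters():
        (no_decay if "bias" in name or "LayerNorm" in name else decay).append(param)
    optimizer = torch.optim.AdamW(
        [{"params": decay, "weight_decay": 0.01},
         {"params": no_decay, "weight_decay": 0.0}],
        lr=lr,
    )
    warmup_steps = int(warmup_ratio * total_steps)  # warmup ratio is assumed
    if few_shot:
        # Cosine-with-restarts schedule for the low-resource setting.
        scheduler = get_cosine_with_hard_restarts_schedule_with_warmup(
            optimizer, warmup_steps, total_steps, num_cycles=2)
    else:
        # Linear warmup schedule for the standard supervised setting.
        scheduler = get_linear_schedule_with_warmup(
            optimizer, warmup_steps, total_steps)
    return optimizer, scheduler
```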