Debiased and Denoised Entity Recognition from Distant Supervision

Authors: Haobo Wang, Yiwen Dong, Ruixuan Xiao, Fei Huang, Gang Chen, Junbo Zhao

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments are conducted to validate DesERT. The results show that our framework establishes a new state-of-the-art performance, achieving a +2.22% average F1 score improvement on five standardized benchmark datasets.
Researcher Affiliation | Collaboration | ¹Zhejiang University, Hangzhou, China; ²Alibaba Group, Hangzhou, China
Pseudocode | Yes | The pseudo-code of DesERT is summarized in Appendix D.
Open Source Code | No | The paper mentions using the Huggingface Transformers library but does not provide a specific link or an explicit statement about releasing its own source code for the described methodology.
Open Datasets | Yes | We evaluate our framework on five widely-used named entity recognition benchmark datasets in the English language: (1) CoNLL03 [34]... (2) OntoNotes5.0 [35]... (3) Webpage [36]... (4) Wikigold [37]... (5) Twitter [38]
Dataset Splits | Yes | Table 10: Statistics of the five datasets, showing the number of entity types and the number of sentences in the Train/Dev/Test sets. CoNLL03: Train 14,041 / Dev 3,250 / Test 3,453; OntoNotes5.0: Train 115,812 / Dev 15,680 / Test 12,217; Webpage: Train 385 / Dev 99 / Test 135; Wikigold: Train 1,142 / Dev 280 / Test 274; Twitter: Train 2,393 / Dev 999 / Test 3,844
Hardware Specification | Yes | All experiments are conducted on a workstation with 8 NVIDIA RTX A6000 GPUs.
Software Dependencies | No | The paper mentions adopting the "Huggingface Transformer library for the RoBERTa-base (125M parameters) and DistilRoBERTa-base (66M parameters) models", but it does not specify the version of the Huggingface library or other key software dependencies (a minimal loading sketch for these models follows the table).
Experiment Setup | Yes | Specifically, we train the networks for 50 epochs with a few epochs of warm-up, followed by 2 epochs of finetuning. The training batch size is set as 16 on four datasets, except 32 on OntoNotes5.0. The learning rate is fixed as {1e-5, 2e-5, 1e-5, 1e-5, 2e-5} for these datasets respectively. The confidence threshold parameter τ is set to 0.9 for Wikigold and 0.95 for the others. The co-guessing is performed from the k-th epoch, which we set to {6, 40, 35, 30, 30} respectively. For finetuning, the learning rate is one-tenth of the original one.
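
The Software Dependencies row above names the Huggingface Transformers library and the RoBERTa-base / DistilRoBERTa-base backbones but no version or loading details. The following is a minimal sketch, not the authors' code: it assumes the standard hub checkpoints "roberta-base" and "distilroberta-base", and the NUM_LABELS value is an assumption (the paper excerpt does not state the tagging scheme).

```python
# Hedged sketch: loading the backbones named in the paper via Huggingface
# Transformers. Checkpoint names and the label count are assumptions.
from transformers import AutoTokenizer, AutoModelForTokenClassification

NUM_LABELS = 9  # assumption: BIO tags over 4 entity types (CoNLL03-style)

# ~125M-parameter RoBERTa-base backbone for token classification
tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForTokenClassification.from_pretrained(
    "roberta-base", num_labels=NUM_LABELS
)

# ~66M-parameter DistilRoBERTa-base backbone, also mentioned in the paper
distil_tokenizer = AutoTokenizer.from_pretrained("distilroberta-base")
distil_model = AutoModelForTokenClassification.from_pretrained(
    "distilroberta-base", num_labels=NUM_LABELS
)
```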
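
To make the Experiment Setup row easier to scan, here is the same per-dataset hyperparameter information collected into one structure. This is purely illustrative: the dataclass and its field names are hypothetical, the values are copied from the quoted setup, and the unspecified number of warm-up epochs is omitted.

```python
# Hedged sketch of the per-dataset hyperparameters quoted in the paper;
# field names are hypothetical, values come from the Experiment Setup row.
from dataclasses import dataclass

@dataclass
class DesERTConfig:
    epochs: int = 50                # main training epochs (plus a few warm-up epochs)
    finetune_epochs: int = 2        # finetuning epochs after main training
    batch_size: int = 16            # 32 on OntoNotes5.0, 16 elsewhere
    learning_rate: float = 1e-5     # per-dataset value from {1e-5, 2e-5, 1e-5, 1e-5, 2e-5}
    tau: float = 0.95               # confidence threshold; 0.9 for Wikigold
    coguess_start_epoch: int = 30   # epoch k at which co-guessing begins

    @property
    def finetune_learning_rate(self) -> float:
        # "For finetuning, the learning rate is one-tenth of the original one."
        return self.learning_rate / 10

CONFIGS = {
    "CoNLL03":      DesERTConfig(learning_rate=1e-5, coguess_start_epoch=6),
    "OntoNotes5.0": DesERTConfig(batch_size=32, learning_rate=2e-5, coguess_start_epoch=40),
    "Webpage":      DesERTConfig(learning_rate=1e-5, coguess_start_epoch=35),
    "Wikigold":     DesERTConfig(learning_rate=1e-5, tau=0.9, coguess_start_epoch=30),
    "Twitter":      DesERTConfig(learning_rate=2e-5, coguess_start_epoch=30),
}
```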