Drop Redundant, Shrink Irrelevant: Selective Knowledge Injection for Language Pretraining

Authors: Ningyu Zhang, Shumin Deng, Xu Cheng, Xi Chen, Yichi Zhang, Wei Zhang, Huajun Chen

IJCAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on benchmark datasets demonstrate that our approach can enhance state-of-the-art knowledge injection methods.
Researcher Affiliation | Collaboration | Ningyu Zhang 1,2, Shumin Deng 1,2, Xu Cheng 3, Xi Chen 5, Yichi Zhang 4, Wei Zhang 4, Huajun Chen 1,2. 1 Zhejiang University & AZFT Joint Lab for Knowledge Engine; 2 Hangzhou Innovation Center, Zhejiang University; 3 National Engineering Laboratory for Improving the Government's Governance Capability Big Data Application Technology; 4 Alibaba Group; 5 Tencent
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not explicitly state that source code for the described methodology is being released, nor does it provide a link to a code repository.
Open Datasets | Yes | TACRED [Zhang et al., 2017] is a large-scale relation extraction dataset that covers 42 relation types and contains 106,264 sentences. Open Entity [Choi et al., 2018] is a completely manually annotated entity typing dataset. SearchQA [Dunn et al., 2017] is a large-scale question answering dataset constructed to reflect a full pipeline of general question answering. Quasar-T [Dhingra et al., 2017] is a large-scale question answering dataset consisting of 43,000 open-domain trivia questions and their answers obtained from various internet sources. GLUE [Wang et al., 2019a] is a benchmark with nine diverse NLP tasks.
Dataset Splits | No | The paper mentions several datasets (TACRED, Open Entity, SearchQA, Quasar-T, GLUE) but only explicitly refers to the 'GLUE dev set' in Table 2. It does not provide specific training/validation/test splits (e.g., percentages or sample counts) for all datasets used, nor does it cite predefined splits for all of them.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions using pre-trained models like BERT-base and RoBERTa-base, and implementing ERNIE and KnowBERT, but does not provide specific version numbers for software dependencies or libraries (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | In this case, η was set to 0.001, k was set to 1, λ was set to 0.5, γ was set to 0.0001, Khop/thresh/min/max was set to {6,100,5,20}, and the batch size was set to 32.
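The reported hyperparameters can be collected into a single configuration mapping. A minimal sketch, assuming descriptive key names for the paper's symbols (η, k, λ, γ, K_hop/thresh/min/max); the paper itself does not name a framework or config format:

```python
# Hypothetical experiment configuration reconstructed from the quoted setup.
# Key names are assumptions; only the symbols and values come from the paper.
config = {
    "eta": 0.001,        # η, presumably the learning rate
    "k": 1,              # k
    "lambda": 0.5,       # λ, e.g. a loss-weighting coefficient
    "gamma": 0.0001,     # γ
    "K": {               # Khop/thresh/min/max = {6, 100, 5, 20}
        "hop": 6,
        "thresh": 100,
        "min": 5,
        "max": 20,
    },
    "batch_size": 32,
}

if __name__ == "__main__":
    for name, value in config.items():
        print(f"{name}: {value}")
```

Such a dict could be passed to a training script or serialized to JSON; since the paper reports no hardware or software versions, exact reproduction would still require contacting the authors.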