Community Question Answering Entity Linking via Leveraging Auxiliary Data
Authors: Yuhan Li, Wei Shen, Jianbo Gao, Yadong Wang
IJCAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate the superiority of our framework through extensive experiments over a newly released CQAEL data set against state-of-the-art entity linking methods. |
| Researcher Affiliation | Academia | Yuhan Li, Wei Shen, Jianbo Gao and Yadong Wang, TMCC, TKLNDST, College of Computer Science, Nankai University, Tianjin, China |
| Pseudocode | No | The paper describes methods in text and provides mathematical equations, but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | We release the data set and codes to facilitate the research towards this new task: https://github.com/yhLeeee/CQA_Entity_Linking |
| Open Datasets | Yes | We create a new data set named QuoraEL to support the study of the CQAEL task. We release the data set and codes to facilitate the research towards this new task: https://github.com/yhLeeee/CQA_Entity_Linking |
| Dataset Splits | Yes | We use 5-fold cross-validation and split the CQA texts into training (70%), validation (10%), and testing (20%). |
| Hardware Specification | Yes | All experiments are implemented by the MindSpore framework with two NVIDIA GeForce GTX 3090 (24GB) GPUs. |
| Software Dependencies | No | The paper mentions the 'MindSpore framework', the 'xlnet-base-cased' and 'longformer-base-4096' models, and the 'AdamW optimizer', but it does not provide version numbers for MindSpore or for general software dependencies (e.g., Python) beyond the specific model variants. |
| Experiment Setup | Yes | For training, we adopt the AdamW [Loshchilov and Hutter, 2018] optimizer with a warmup rate of 0.1, an initial learning rate of 1e-5, and a mini-batch size of 2. Dropout with a probability of 0.1 is used to alleviate over-fitting. For the base module, the maximum sequence length is set to 128. For the auxiliary data module, the maximum lengths of the candidate entity description and each text are set to 128 and 64, respectively. The hyperparameter k is set to 3, whose impact on performance is studied later. |
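
The Dataset Splits row quotes a 5-fold cross-validation protocol with a 70%/10%/20% train/validation/test partition of the CQA texts. Below is a minimal sketch of one plausible reading of that protocol (one independent re-split per fold); the function name, the per-fold seeding, and the way the folds are combined with the 70/10/20 ratios are assumptions made here, not details taken from the paper or the released code.

```python
import random

def split_cqa_texts(texts, seed):
    """Shuffle and split CQA texts into train/val/test sets.

    The 70/10/20 proportions follow the split quoted in the Dataset
    Splits row; combining this with 5-fold cross-validation as one
    re-split per fold is an assumed interpretation.
    """
    rng = random.Random(seed)
    shuffled = texts[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(0.7 * n)
    n_val = int(0.1 * n)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

# One independent split per fold (placeholder data of 1000 items).
folds = [split_cqa_texts(list(range(1000)), seed=fold) for fold in range(5)]
```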
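
The Experiment Setup row gathers most of the training hyperparameters quoted from the paper. A compact sketch collecting them into a single configuration object might look like the following; the paper trains with the MindSpore framework, and the class and field names here are illustrative rather than taken from the authors' repository.

```python
from dataclasses import dataclass

@dataclass
class CQAELTrainConfig:
    """Hyperparameters quoted in the Experiment Setup row.

    Field names and grouping are assumptions for illustration; the
    released code may organise these settings differently.
    """
    optimizer: str = "AdamW"          # Loshchilov and Hutter, 2018
    warmup_rate: float = 0.1
    learning_rate: float = 1e-5
    batch_size: int = 2
    dropout: float = 0.1
    max_seq_len_base: int = 128       # base module input length
    max_len_entity_desc: int = 128    # auxiliary data module: entity description
    max_len_aux_text: int = 64        # auxiliary data module: each text
    k: int = 3                        # hyperparameter k studied in the paper

config = CQAELTrainConfig()
```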