Question-Driven Span Labeling Model for Aspect–Opinion Pair Extraction

Authors: Lei Gao, Yulong Wang, Tongcun Liu, Jingyu Wang, Lei Zhang, Jianxin Liao
Pages: 12875–12883

AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments conducted on three tasks (ATE, ASOE, and AOPE) on four benchmark datasets demonstrate that the proposed method significantly outperforms state-of-the-art approaches.
Researcher Affiliation | Collaboration | Lei Gao (1,3), Yulong Wang (1,3), Tongcun Liu (2,3), Jingyu Wang (1,3), Lei Zhang (1,3), Jianxin Liao (1,3). (1) State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications; (2) School of Information Engineering, Zhejiang A & F University; (3) EBUPT Information Technology Co., Ltd. {gaolei 1, wangyulong, liutongcun, wangjingyu, zhanglei, liaojianxin}@ebupt.com
Pseudocode | No | The paper describes the model in detail with equations but does not provide structured pseudocode or an algorithm block.
Open Source Code | No | The paper does not contain any statement about releasing source code or a link to a code repository.
Open Datasets | Yes | We evaluated the performance of our proposed QDSL model on four public datasets obtained from SemEval-2014 Task 4, SemEval-2015 Task 12, and SemEval-2016 Task 5. These datasets are widely used in ABSA tasks. We use S14l, S14r, S15r, and S16r to denote the SemEval-2014 Laptops, SemEval-2014 Restaurants, SemEval-2015 Restaurants, and SemEval-2016 Restaurants datasets, respectively. Fan et al. (2019) annotated the SemEval datasets with the corresponding opinion words for each aspect term.
Dataset Splits | Yes | For the experiment on the ATE subtask, we used the original SemEval datasets to compare our method with the other methods and kept the official division of these datasets into training, validation, and testing sets. Similar to previous works (Fan et al. 2019; Wu et al. 2020), we randomly split off 20% of the training set as the validation set (a minimal sketch of such a split follows the table).
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions BERT-BASE-UNCASED and the AdamW optimizer, but does not provide specific version numbers for software libraries or frameworks used (e.g., PyTorch, TensorFlow, or a specific BERT library version).
Experiment Setup | Yes | Based on the fine-tuning hyperparameters suggested by Devlin et al. (2019), the learning rate was set to 5e-5, the batch size to 16, and the dropout probability to 0.1. We used AdamW (Loshchilov and Hutter 2019) to optimize the model parameters. We trained for a total of 40 epochs and selected the best-performing model on the validation set for testing (see the training-loop sketch after the table).
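
Since the authors release no code, the following is a minimal sketch of the 20% random validation split described in the Dataset Splits row. The function name, the fixed seed, and the list-of-examples representation are assumptions for illustration, not the authors' implementation.

```python
import random

def split_train_validation(train_examples, val_ratio=0.2, seed=42):
    """Randomly hold out a fraction of the training set as validation,
    mirroring the 20% split described in the paper. The seed value is
    an assumption; the paper does not report one."""
    examples = list(train_examples)
    random.Random(seed).shuffle(examples)
    n_val = int(len(examples) * val_ratio)
    # First n_val shuffled examples become validation, the rest training.
    return examples[n_val:], examples[:n_val]
```

A fixed seed is used only so the split is reproducible across runs; the paper does not state how (or whether) its random split was seeded.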
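Likewise, a minimal PyTorch sketch of the reported training configuration: AdamW at a 5e-5 learning rate, batch size 16, 40 epochs, and selection of the best model on the validation set. The `evaluate_f1` callback and the assumption that the model returns an object with a `.loss` attribute (as Hugging Face models do) are ours; the paper does not describe its training-loop structure.

```python
import copy
import torch
from torch.utils.data import DataLoader

# Hyperparameters as reported in the paper.
LEARNING_RATE = 5e-5
BATCH_SIZE = 16
NUM_EPOCHS = 40  # dropout of 0.1 matches BERT's default hidden dropout

def train(model, train_dataset, val_dataset, evaluate_f1):
    """Train for 40 epochs with AdamW and keep the checkpoint that scores
    best on the validation set. `evaluate_f1` is a hypothetical callback
    returning a validation F1 score."""
    loader = DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True)
    optimizer = torch.optim.AdamW(model.parameters(), lr=LEARNING_RATE)
    best_f1, best_state = -1.0, None
    for epoch in range(NUM_EPOCHS):
        model.train()
        for batch in loader:
            optimizer.zero_grad()
            loss = model(**batch).loss  # assumes a HF-style model output
            loss.backward()
            optimizer.step()
        f1 = evaluate_f1(model, val_dataset)
        if f1 > best_f1:
            best_f1, best_state = f1, copy.deepcopy(model.state_dict())
    model.load_state_dict(best_state)  # restore the best checkpoint
    return model, best_f1
```

Keeping a deep copy of the best state dict and restoring it at the end is one straightforward way to realize "selected the best-performing model in the validation set for testing"; checkpointing to disk each epoch would work equally well.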