Context-Guided BERT for Targeted Aspect-Based Sentiment Analysis

Authors: Zhengxuan Wu, Desmond C. Ong (pp. 14094-14102)

AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We train both models with pretrained BERT on two (T)ABSA datasets: SentiHood and SemEval-2014 (Task 4). Both models achieve new state-of-the-art results, with our QACG-BERT model having the best performance. Furthermore, we provide analyses of the impact of context in our proposed models.
Researcher Affiliation | Collaboration | Zhengxuan Wu (1), Desmond C. Ong (2, 3). (1) Symbolic Systems Program, Stanford University; (2) Department of Information Systems and Analytics, National University of Singapore; (3) Institute of High Performance Computing, Agency for Science, Technology, and Research, Singapore.
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. Figure 2 provides an illustration of the proposed models, but it is a diagram, not pseudocode. (An illustrative attention sketch follows the table.)
Open Source Code | Yes | https://github.com/frankaging/Quasi-Attention-ABSA (footnote 1 in the paper)
Open Datasets | Yes | For the TABSA task, we used the SentiHood dataset (footnote 5), which was built from questions and answers from Yahoo! with location names in London, UK. (Footnote 5: https://github.com/uclnlp/jack/tree/master/data/sentihood) For the ABSA task, we used the dataset from SemEval-2014 Task 4 (footnote 6), which contains 3,044 sentences from restaurant reviews. (Footnote 6: http://alt.qcri.org/semeval2014/task4/) A hedged loading sketch follows the table.
Dataset Splits | Yes | Each dataset is partitioned into train, development, and test sets as in its original paper.
Hardware Specification | Yes | We used a single Standard NC6 instance on Microsoft Azure, which is equipped with a single NVIDIA Tesla K80 GPU with 12 GB of memory.
Software Dependencies | No | The paper mentions using 'pretrained weights from the uncased BERT-base model' (footnote 7: https://storage.googleapis.com/bert_models/2020_02_20/uncased_L-12_H-768_A-12.zip) but does not give version numbers for ancillary software dependencies such as the programming language (e.g., Python), deep learning framework (e.g., PyTorch, TensorFlow), or other implementation libraries. (See the hedged loading sketch after the table.)
Experiment Setup | Yes | Our models consist of 12 heads and 12 layers, with hidden layer size 768. ... We trained for 25 epochs with a dropout probability of 0.1. The initial learning rate is 2e-5 for all layers, with a batch size of 24. (See the hedged configuration sketch after the table.)
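Since the paper itself offers no pseudocode, the following is a minimal, illustrative PyTorch sketch of the general context-guided attention idea behind the proposed models: a context vector (e.g., an embedding of the target-aspect pair) is gated into the queries and keys of a self-attention head. The module name, gating layout, and single-head simplification are assumptions made here for illustration; the exact CG-BERT and quasi-attention formulations should be taken from the paper and the linked repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ContextGuidedSelfAttention(nn.Module):
    """Illustrative single-head self-attention in which a context vector is
    gated into the queries and keys. A simplified sketch of the general
    context-guided idea, not the paper's exact CG-BERT/QACG-BERT equations."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.query = nn.Linear(hidden_size, hidden_size)
        self.key = nn.Linear(hidden_size, hidden_size)
        self.value = nn.Linear(hidden_size, hidden_size)
        # Projections of the context vector into query/key space.
        self.context_q = nn.Linear(hidden_size, hidden_size)
        self.context_k = nn.Linear(hidden_size, hidden_size)
        # Scalar gates deciding how strongly context overrides content.
        self.gate_q = nn.Linear(2 * hidden_size, 1)
        self.gate_k = nn.Linear(2 * hidden_size, 1)
        self.scale = hidden_size ** -0.5

    def forward(self, hidden_states: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden); context: (batch, hidden)
        q, k, v = self.query(hidden_states), self.key(hidden_states), self.value(hidden_states)
        c_q = self.context_q(context).unsqueeze(1).expand_as(q)
        c_k = self.context_k(context).unsqueeze(1).expand_as(k)
        # Learned gates in [0, 1]: 0 keeps ordinary content-based attention,
        # 1 replaces queries/keys with the context projection.
        lam_q = torch.sigmoid(self.gate_q(torch.cat([q, c_q], dim=-1)))
        lam_k = torch.sigmoid(self.gate_k(torch.cat([k, c_k], dim=-1)))
        q_hat = (1 - lam_q) * q + lam_q * c_q
        k_hat = (1 - lam_k) * k + lam_k * c_k
        attn = F.softmax(q_hat @ k_hat.transpose(-1, -2) * self.scale, dim=-1)
        return attn @ v
```

The gates let each position interpolate between ordinary content-based attention (gate near 0) and attention driven by the target-aspect context (gate near 1).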
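For the dataset row, a small loading sketch may help. It assumes the commonly distributed SentiHood JSON layout with 'text' and 'opinions' fields, where each opinion holds 'target_entity', 'aspect', and 'sentiment'; the file name and field names are assumptions to verify against the files at the SentiHood link above.

```python
import json

# Hypothetical local path; the files come from the SentiHood link cited above.
PATH = "sentihood-train.json"


def load_sentihood(path: str):
    """Flatten SentiHood examples into (sentence, target, aspect, sentiment) tuples.
    Field names reflect the commonly distributed JSON layout and should be
    verified against the downloaded files."""
    with open(path, encoding="utf-8") as f:
        examples = json.load(f)
    rows = []
    for ex in examples:
        for op in ex.get("opinions", []):
            rows.append((ex["text"], op["target_entity"], op["aspect"], op["sentiment"]))
    return rows


if __name__ == "__main__":
    rows = load_sentihood(PATH)
    print(f"{len(rows)} (sentence, target, aspect, sentiment) tuples")
```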
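For the software-dependency and experiment-setup rows, the sketch below shows one way to load uncased BERT-base weights and record the quoted hyperparameters. The use of the HuggingFace Transformers library and the AdamW optimizer are assumptions; the paper only states that pretrained uncased BERT-base weights were used with the listed hyperparameters.

```python
# Assumption: HuggingFace Transformers as the framework; the paper does not name one.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")  # 12 layers, 12 heads, hidden size 768

# Hyperparameters quoted in the Experiment Setup row.
train_config = {
    "epochs": 25,
    "dropout_prob": 0.1,
    "learning_rate": 2e-5,  # same initial rate for all layers
    "batch_size": 24,
}

# Optimizer choice is an assumption; the paper does not specify one.
optimizer = torch.optim.AdamW(model.parameters(), lr=train_config["learning_rate"])
```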