Context-Guided BERT for Targeted Aspect-Based Sentiment Analysis
Authors: Zhengxuan Wu, Desmond C. Ong
AAAI 2021, pp. 14094-14102 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We train both models with pretrained BERT on two (T)ABSA datasets: SentiHood and SemEval-2014 (Task 4). Both models achieve new state-of-the-art results, with our QACG-BERT model having the best performance. Furthermore, we provide analyses of the impact of context in our proposed models. |
| Researcher Affiliation | Collaboration | Zhengxuan Wu (1), Desmond C. Ong (2, 3); (1) Symbolic Systems Program, Stanford University; (2) Department of Information Systems and Analytics, National University of Singapore; (3) Institute of High Performance Computing, Agency for Science, Technology, and Research, Singapore |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. Figure 2 provides an illustration of the proposed models, but it is a diagram, not pseudocode. |
| Open Source Code | Yes | https://github.com/frankaging/Quasi-Attention-ABSA (footnote 1) |
| Open Datasets | Yes | For the TABSA task, we used the SentiHood dataset (footnote 5), which was built from questions and answers from Yahoo! with location names of London, UK. (Footnote 5: https://github.com/uclnlp/jack/tree/master/data/sentihood) For the ABSA task, we used the dataset from SemEval-2014 Task 4 (footnote 6), which contains 3,044 sentences from restaurant reviews. (Footnote 6: http://alt.qcri.org/semeval2014/task4/) |
| Dataset Splits | Yes | Each dataset is partitioned into train, development, and test sets as in its original paper. |
| Hardware Specification | Yes | We used a single Standard NC6 instance on Microsoft Azure, which is equipped with a single NVIDIA Tesla K80 GPU with 12G Memory. |
| Software Dependencies | No | The paper mentions using 'pretrained weights from the uncased BERT-base model' (footnote 7: https://storage.googleapis.com/bert_models/2020_02_20/uncased_L-12_H-768_A-12.zip) but does not provide specific version numbers for other ancillary software dependencies like programming languages (e.g., Python), deep learning frameworks (e.g., PyTorch, TensorFlow), or other libraries used for implementation. |
| Experiment Setup | Yes | Our models consist of 12 heads and 12 layers, with hidden layer size 768. ... We trained for 25 epochs with a dropout probability of 0.1. The initial learning rate is 2e-5 for all layers, with a batch size of 24. (A hedged configuration sketch based on these values follows the table.) |
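
The paper does not state its software stack, so the following is a minimal fine-tuning sketch, assuming PyTorch and HuggingFace Transformers, that wires together the reported settings: uncased BERT-base weights (12 layers, 12 heads, hidden size 768), dropout 0.1, learning rate 2e-5, batch size 24, and 25 epochs. It uses a plain `BertForSequenceClassification` head; the QACG-BERT quasi-attention and context-guided layers described in the paper are not reproduced here, and the auxiliary-sentence pairing shown is only illustrative.

```python
# Minimal fine-tuning sketch (assumed stack: PyTorch + HuggingFace Transformers).
# Hyperparameters mirror the paper's reported setup; the QACG-BERT quasi-attention
# layers are NOT implemented here -- see https://github.com/frankaging/Quasi-Attention-ABSA.
import torch
from torch.utils.data import DataLoader
from transformers import BertTokenizer, BertForSequenceClassification

MODEL_NAME = "bert-base-uncased"  # uncased BERT-base: 12 layers, 12 heads, hidden size 768

tokenizer = BertTokenizer.from_pretrained(MODEL_NAME)
model = BertForSequenceClassification.from_pretrained(
    MODEL_NAME,
    num_labels=3,                        # e.g. positive / negative / none per aspect (assumption)
    hidden_dropout_prob=0.1,             # dropout probability reported in the paper
    attention_probs_dropout_prob=0.1,
)

# Sentence-pair style input (review sentence + auxiliary sentence naming target/aspect);
# the exact auxiliary-sentence template is an assumption, not taken from the paper.
example = tokenizer(
    "LOCATION1 is a quiet area with great restaurants",
    "location - 1 - safety",
    return_tensors="pt", padding=True, truncation=True,
)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)  # initial learning rate 2e-5


def train(train_dataset, epochs=25, batch_size=24):
    """Plain training loop with the reported epoch count and batch size."""
    loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
    model.train()
    for _ in range(epochs):
        for batch in loader:             # batch: dict with input_ids, attention_mask, labels
            optimizer.zero_grad()
            loss = model(**batch).loss
            loss.backward()
            optimizer.step()
```

This sketch only mirrors the reported hyperparameters; reproducing the paper's results would additionally require the context-guided attention modifications and the (T)ABSA sentence-pair construction from the authors' released repository.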