Towards Interpretable Natural Language Understanding with Explanations as Latent Variables
Authors: Wangchunshu Zhou, Jinyi Hu, Hanlin Zhang, Xiaodan Liang, Maosong Sun, Chenyan Xiong, Jian Tang
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on two natural language understanding tasks demonstrate that our framework can not only make effective predictions in both supervised and semi-supervised settings, but also generate good natural language explanations. |
| Researcher Affiliation | Collaboration | 1 Beihang University 2 Tsinghua University 3 South China University of Technology 4 Sun Yat-sen University 5 Microsoft Research 6 Mila-Québec AI Institute 7 HEC Montréal |
| Pseudocode | Yes | Algorithm 1: Explanation-based Self-Training (ELV-EST) (a hedged sketch of such a loop appears after the table) |
| Open Source Code | Yes | Code is available at https://github.com/JamesHujy/ELV.git |
| Open Datasets | Yes | We conduct experiments on two tasks: relation extraction (RE) and aspect-based sentiment classification (ASC). For relation extraction we choose two datasets, TACRED [23] and SemEval [21], in our experiments. We use two customer review datasets, Restaurant and Laptop, which are part of SemEval 2014 Task 4 [24], for the aspect-based sentiment classification task. |
| Dataset Splits | Yes | Table 1: Statistics of datasets. We present the size of train/dev/test sets for 4 datasets in both supervised and semi-supervised settings. Moreover, # Exp means the size of initial explanation sets. ... SemEval [21] 203 7,016 1,210 800 2,715 |
| Hardware Specification | No | The paper mentions using 'BERT-base and UniLM-base as the backbone of our prediction model and explanation generation model, respectively.' but does not specify any hardware details like GPU models, CPU types, or memory. |
| Software Dependencies | No | The paper mentions using 'BERT-base' and 'UniLM-base' as backbone models, 'Sentence-BERT [19]' for embeddings, and 'Adam optimizers'. However, it does not provide specific version numbers for these or other software libraries/frameworks (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We select batch size over {32, 64} and learning rate over {1e-5, 2e-5, 3e-5}. The number of retrieved explanations is set to 10 for all tasks. We train the prediction model for 3 epochs and the generation model for 5 epochs in each EM iteration. We use Adam optimizers and early stopping with the best validation F1-score. |
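
The hyperparameters quoted in the "Experiment Setup" row amount to a small search grid. Below is a minimal, hedged sketch of that grid in Python; the `ELVTrainConfig` class and `config_grid` helper are illustrative names of my own, not taken from the authors' released code.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class ELVTrainConfig:
    """Hypothetical container for the hyperparameters reported in the paper."""
    batch_size: int
    learning_rate: float
    num_retrieved_explanations: int = 10    # fixed to 10 for all tasks
    prediction_epochs_per_em_iter: int = 3  # prediction model: 3 epochs per EM iteration
    generation_epochs_per_em_iter: int = 5  # generation model: 5 epochs per EM iteration
    optimizer: str = "adam"
    early_stopping_metric: str = "dev_f1"   # early stopping on best validation F1


def config_grid():
    """Enumerate the reported search space: batch size {32, 64} x lr {1e-5, 2e-5, 3e-5}."""
    for batch_size, learning_rate in product([32, 64], [1e-5, 2e-5, 3e-5]):
        yield ELVTrainConfig(batch_size=batch_size, learning_rate=learning_rate)


if __name__ == "__main__":
    for cfg in config_grid():
        print(cfg)
```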
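
The "Pseudocode" row refers to Algorithm 1 (ELV-EST). From the quoted setup, each EM iteration trains both a prediction model and an explanation generation model, and the "Self-Training" part implies pseudo-labeling of unlabeled data. The following is only a rough, hedged skeleton of such an EM-style self-training loop under simplified assumptions; the callables (`generate_explanations`, `train_prediction_model`, `train_generation_model`) are hypothetical placeholders of my own, not the authors' API, and details such as explanation retrieval and confidence filtering of pseudo-labels are omitted.

```python
from typing import Callable, List, Optional, Tuple

Example = Tuple[str, str]         # (input text, label) -- simplified representation
Predictor = Callable[[str], str]  # maps an input text to a predicted label


def elv_est_sketch(
    labeled: List[Example],
    unlabeled: List[str],
    generate_explanations: Callable[[List[Example]], List[str]],
    train_prediction_model: Callable[[List[Example], List[str]], Predictor],
    train_generation_model: Callable[[List[Example], List[str]], None],
    num_em_iterations: int = 3,
) -> Optional[Predictor]:
    """Rough EM-style self-training skeleton; all callables are user-supplied stand-ins."""
    predictor: Optional[Predictor] = None
    data = list(labeled)
    for _ in range(num_em_iterations):
        # E-step (sketch): obtain explanations for the current training examples.
        explanations = generate_explanations(data)
        # M-step (sketch): update the prediction model conditioned on explanations,
        # then update the explanation generation model on the same data.
        predictor = train_prediction_model(data, explanations)
        train_generation_model(data, explanations)
        # Self-training (sketch): pseudo-label the unlabeled inputs with the current
        # predictor and rebuild the training pool (confidence filtering omitted).
        data = list(labeled) + [(x, predictor(x)) for x in unlabeled]
    return predictor
```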