reproducibilityindex.ai

Towards Interpretable Natural Language Understanding with Explanations as Latent Variables

Authors: Wangchunshu Zhou, Jinyi Hu, Hanlin Zhang, Xiaodan Liang, Maosong Sun, Chenyan Xiong, Jian Tang

NeurIPS 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments on two natural language understanding tasks demonstrate that our framework can not only make effective predictions in both supervised and semi-supervised settings, but also generate good natural language explanations.
Researcher Affiliation	Collaboration	1 Beihang University 2 Tsinghua University 3 South China University of Technology 4 Sun Yat-sen University 5 Microsoft Research 6 Mila-Québec AI Institute 7 HEC Montréal
Pseudocode	Yes	Algorithm 1: Explanation-based Self-Training (ELV-EST)
Open Source Code	Yes	Code is available at https://github.com/JamesHujy/ELV.git
Open Datasets	Yes	We conduct experiments on two tasks: relation extraction (RE) and aspect-based sentiment classification (ASC). For relation extraction we choose two datasets, TACRED [23] and SemEval [21] in our experiments. We use two customer review datasets, Restaurant and Laptop, which are part of SemEval 2014 Task 4 [24] for the aspect-based sentiment classification task.
Dataset Splits	Yes	Table 1: Statistics of datasets. We present the size of train/dev/test sets for 4 datasets in both supervised and semi-supervised settings. Moreover, # Exp means the size of initial explanation sets. ... SemEval [21] 203 7,016 1,210 800 2,715
Hardware Specification	No	The paper mentions using 'BERT-base and UniLM-base as the backbone of our prediction model and explanation generation model, respectively.' but does not specify any hardware details like GPU models, CPU types, or memory.
Software Dependencies	No	The paper mentions using 'BERT-base' and 'UniLM-base' as backbone models, 'Sentence BERT [19]' for embeddings, and 'Adam optimizers'. However, it does not provide specific version numbers for these or other software libraries/frameworks (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup	Yes	We select batch size over {32, 64} and learning rate over {1e-5, 2e-5, 3e-5}. The number of retrieved explanations is set to 10 for all tasks. We train the prediction model for 3 epochs and the generation model for 5 epochs in each EM iteration. We use Adam optimizers and early stopping with the best validation F1-score.
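
The reported experiment setup can be written down as a small search grid. The sketch below is illustrative only and is not taken from the authors' repository: the dictionary keys, `build_grid`, and `select_best` are hypothetical names, and only the numeric values (batch sizes, learning rates, number of retrieved explanations, epochs per EM iteration) come from the quoted setup. Selecting by validation F1 stands in for the early-stopping criterion the paper describes.

```python
from itertools import product

# Values quoted in the Experiment Setup row; the key names are assumptions.
BATCH_SIZES = (32, 64)
LEARNING_RATES = (1e-5, 2e-5, 3e-5)

FIXED_SETTINGS = {
    "num_retrieved_explanations": 10,  # same for all tasks
    "prediction_model_epochs": 3,      # per EM iteration
    "generation_model_epochs": 5,      # per EM iteration
    "optimizer": "adam",
    "selection_metric": "dev_f1",      # best validation F1-score
}


def build_grid():
    """Enumerate every (batch size, learning rate) pair with the fixed settings."""
    return [
        {"batch_size": batch_size, "learning_rate": lr, **FIXED_SETTINGS}
        for batch_size, lr in product(BATCH_SIZES, LEARNING_RATES)
    ]


def select_best(results):
    """Return the config with the highest validation F1.

    `results` is a list of (config, dev_f1) pairs produced by whatever
    training loop is used; that loop is outside the scope of this sketch.
    """
    return max(results, key=lambda pair: pair[1])[0]


if __name__ == "__main__":
    for config in build_grid():
        print(config)
```

The grid has six candidate configurations (two batch sizes times three learning rates); everything else is held fixed across runs.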