Building Interpretable Interaction Trees for Deep NLP Models
Authors: Die Zhang, Hao Zhang, Huilin Zhou, Xiaoyi Bao, Da Huo, Ruizhao Chen, Xu Cheng, Mengyue Wu, Quanshi Zhang
AAAI 2021, pp. 14328-14337 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results have provided a new perspective to understand these DNNs, and have demonstrated the effectiveness of our method. |
| Researcher Affiliation | Academia | Die Zhang, Hao Zhang, Huilin Zhou, Xiaoyi Bao, Da Huo, Ruizhao Chen, Xu Cheng, Mengyue Wu, Quanshi Zhang* Shanghai Jiao Tong University {zizhan52, 1603023-zh, zhouhuilin116, zjbaoxiaoyi}@sjtu.edu.cn {sjtuhuoda, stelledge, xcheng8, mengyuewu, zqs1022}@sjtu.edu.cn |
| Pseudocode | No | The paper describes its algorithms and mathematical formulations in text and equations, but it does not include a clearly labeled pseudocode or algorithm block. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code or a link to a code repository. |
| Open Datasets | Yes | We learned DNNs for binary sentiment classification based on the SST-2 dataset (Socher et al. 2013), and learned DNNs to predict whether a sentence was linguistically acceptable based on the CoLA dataset (Warstadt, Singh, and Bowman 2018). |
| Dataset Splits | No | The paper mentions using well-known datasets like SST-2 and CoLA, which typically have predefined splits, but it does not explicitly state the specific percentages or counts for training, validation, or test splits needed for reproduction. |
| Hardware Specification | No | The paper does not specify any hardware components (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions the use of various DNN architectures (BERT, ELMo, LSTM, CNN, Transformer) but does not provide specific version numbers for any software libraries, frameworks, or programming languages used. |
| Experiment Setup | Yes | For each task, we learned five DNNs, including the BERT (Devlin et al. 2018), the ELMo (Peters et al. 2018), the CNN proposed in (Kim 2014), the two-layer unidirectional LSTM (Hochreiter and Schmidhuber 1997), and the Transformer (Vaswani et al. 2017). We quantified the contribution of each word/constituent $\phi^{(N\setminus S)\cup\{[S]\}}([S])$ (i.e., $\phi_a$ in Equation (9)) to the model prediction with sampling times T = 2000 during the construction of the tree. |
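The setup row above refers to a sampling-based estimate of each word's contribution to the model prediction, computed with T = 2000 samples. The sketch below illustrates one common way such a Shapley-style contribution can be estimated by permutation sampling; it is not the paper's implementation, and the `model_score` interface and function name are hypothetical assumptions introduced only for illustration.

```python
import random

def shapley_contribution(n_words, target_idx, model_score, T=2000):
    """Monte Carlo estimate of the Shapley-style contribution of one word.

    n_words     : number of tokens in the sentence
    target_idx  : index of the word whose contribution is estimated
    model_score : callable taking a set of present word indices and returning
                  a scalar model output (hypothetical interface, not from the paper)
    T           : number of sampled permutations (the paper reports T = 2000)
    """
    indices = list(range(n_words))
    total = 0.0
    for _ in range(T):
        random.shuffle(indices)
        pos = indices.index(target_idx)
        coalition = set(indices[:pos])  # words preceding the target in this order
        # Marginal effect of adding the target word to the sampled coalition.
        marginal = model_score(coalition | {target_idx}) - model_score(coalition)
        total += marginal
    return total / T
```

Averaging marginal contributions over many sampled permutations converges to the exact Shapley value; a larger T reduces the variance of the estimate at proportionally higher cost, which is why a fixed sampling budget such as T = 2000 is reported.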