D-Separation for Causal Self-Explanation

Authors: Wei Liu, Jun Wang, Haozhao Wang, Ruixuan Li, Zhiying Deng, Yuankai Zhang, Yang Qiu

Venue: NeurIPS 2023

Reproducibility assessment (each entry gives the variable, the result, and the LLM's supporting response):

Research Type: Experimental
"Empirically, we demonstrate that MCD improves the F1 score by up to 13.7% compared to previous state-of-the-art MMI-based methods. Our code is available at: https://github.com/jugechengzi/Rationalization-MCD." (Supporting structure: Section 5 Experiments, with 5.1 Datasets and metrics, 5.2 Baselines and implementation details, and 5.3 Results.)

Researcher Affiliation: Collaboration
Wei Liu (1), Jun Wang (2), Haozhao Wang (1), Ruixuan Li (1), Zhiying Deng (1), Yuankai Zhang (1), Yang Qiu (1). (1) School of Computer Science and Technology, Huazhong University of Science and Technology; {idc_lw, hz_wang, rxli, dengzhiyingdd, yuankai_zhang, anders}@hust.edu.cn. (2) iWudao Tech; jwang@iwudao.tech.

Pseudocode: No
The paper includes architectural diagrams (Figure 3) and mentions a PyTorch implementation in the appendix, but it does not contain a dedicated pseudocode or algorithm block; a hedged sketch of one possible training objective follows.

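The snippet below is an illustrative reconstruction of what an MCD-style objective might look like, assuming the conditional-dependence criterion is approximated by a KL term that pulls the rationale-only prediction toward the full-text prediction. The function name mcd_style_loss and the two-predictor layout are assumptions made for this sketch; it is not a verified reproduction of the paper's Equation 4.

    import torch.nn.functional as F

    def mcd_style_loss(logits_full, logits_rationale, labels):
        """Hedged sketch of a minimum-conditional-dependence style objective.

        logits_full: classifier output given the full input text.
        logits_rationale: classifier output given only the selected rationale.
        The KL surrogate below is an assumption for illustration only.
        """
        # Keep the full-text predictor accurate on the task labels.
        ce = F.cross_entropy(logits_full, labels)
        # Pull the rationale-only prediction toward the full-text prediction,
        # encouraging the rationale to carry all label-relevant information.
        kl = F.kl_div(F.log_softmax(logits_rationale, dim=-1),
                      F.softmax(logits_full, dim=-1).detach(),
                      reduction="batchmean")
        return ce + kl
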
Open Source Code: Yes
"Our code is available at: https://github.com/jugechengzi/Rationalization-MCD."

Open Datasets: Yes
"Datasets 1) Beer Advocate (McAuley et al., 2012) is a multi-aspect sentiment prediction dataset widely adopted in rationalization studies. 2) Hotel Reviews (Wang et al., 2010) is another multi-aspect sentiment classification dataset containing less feature correlation"

Dataset Splits: No
The paper uses these datasets and mentions training, but it does not explicitly provide training/validation/test split details (e.g., percentages or sample counts).

Hardware Specification: Yes
"All models are trained on a RTX3090 GPU."

Software Dependencies: No
The paper mentions software components such as GloVe, GRUs, Gumbel-softmax, and Adam, but it does not provide specific version numbers for any of these or for the core programming environment (e.g., Python, PyTorch). A sketch of how these components typically fit together follows.

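Even without pinned versions, the named components suggest a standard rationale-selector stack. The module below is a minimal sketch, assuming GloVe-initialized embeddings feeding a bidirectional GRU whose per-token logits are discretized with straight-through Gumbel-softmax; the class name, hidden size, and temperature are illustrative assumptions, not the authors' configuration.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class RationaleSelector(nn.Module):
        """Hypothetical GRU + Gumbel-softmax token selector (illustration only)."""

        def __init__(self, embed: nn.Embedding, hidden: int = 200, tau: float = 1.0):
            super().__init__()
            self.embed = embed  # e.g., initialized from pretrained GloVe vectors
            self.gru = nn.GRU(embed.embedding_dim, hidden,
                              batch_first=True, bidirectional=True)
            self.to_logits = nn.Linear(2 * hidden, 2)  # per-token {drop, keep} scores
            self.tau = tau  # Gumbel-softmax temperature (assumed value)

        def forward(self, tokens: torch.Tensor) -> torch.Tensor:
            h, _ = self.gru(self.embed(tokens))  # (batch, seq, 2 * hidden)
            logits = self.to_logits(h)           # (batch, seq, 2)
            # Straight-through Gumbel-softmax yields a differentiable binary mask.
            mask = F.gumbel_softmax(logits, tau=self.tau, hard=True)[..., 1]
            return mask                          # (batch, seq), values in {0, 1}
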
Experiment Setup: Yes
"We set the sparsity to be similar to previous methods by adjusting the sparsity regularization term (i.e., s) in Equation 4."

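The quoted setup tunes rationale sparsity through a regularization term with target level s (the paper's Equation 4, which this report does not reproduce). A common form of such a term in the rationalization literature, shown below as a plausible stand-in, penalizes the gap between the mask's average selection rate and s, often together with a continuity penalty; the weights here are illustrative assumptions.

    import torch

    def sparsity_regularizer(mask: torch.Tensor, s: float = 0.2,
                             w_sparse: float = 1.0, w_cont: float = 1.0) -> torch.Tensor:
        """Hedged sketch of a sparsity + continuity penalty on a (batch, seq) mask.

        A common formulation from the rationalization literature, not a
        verified copy of the paper's Equation 4; s is the target fraction
        of tokens to keep.
        """
        # Push the average selection rate toward the target sparsity s.
        sparsity = torch.abs(mask.mean() - s)
        # Encourage contiguous rationales by penalizing on/off transitions.
        continuity = torch.abs(mask[:, 1:] - mask[:, :-1]).mean()
        return w_sparse * sparsity + w_cont * continuity

Under this reading, raising s loosens the budget toward longer rationales, while lowering it forces more aggressive selection, which is consistent with the quote's aim of matching the sparsity levels of previous methods.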