Towards Faithful Explanations: Boosting Rationalization with Shortcuts Discovery
Authors: Linan Yue, Qi Liu, Yichao Du, Li Wang, Weibo Gao, Yanqing An
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results on real-world datasets clearly validate the effectiveness of our proposed method. Code is released at https://github.com/yuelinan/codes-of-SSR. |
| Researcher Affiliation | Academia | Linan Yue1, Qi Liu1,2, Yichao Du1, Li Wang3, Weibo Gao1, Yanqing An1. 1: State Key Laboratory of Cognitive Intelligence, University of Science and Technology of China; 2: Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China; 3: ByteDance. {lnyue,duyichao,wl063,weibogao,anyq}@mail.ustc.edu.cn; qiliuql@ustc.edu.cn |
| Pseudocode | Yes | Algorithm 1 SSR_unif: Injecting Shortcuts into Prediction. ... Algorithm 2 SSR_virt: Virtual Shortcuts Representations. ... Algorithm 3 Semantic Data Augmentation |
| Open Source Code | Yes | Code is released at https://github.com/yuelinan/codes-of-SSR. |
| Open Datasets | Yes | We evaluate SSR on text classification tasks from the ERASER benchmark (DeYoung et al., 2020), including Movies (Pang & Lee, 2004) for sentiment analysis, MultiRC (Khashabi et al., 2018) for multiple-choice QA, BoolQ (Clark et al., 2019) for reading comprehension, Evidence Inference (Lehman et al., 2019) for medical interventions, and FEVER (Thorne et al., 2018) for fact verification. Each dataset contains human-annotated rationales and classification labels. |
| Dataset Splits | Yes | In the semi-supervised setting, we implement our SSR and other semi-supervised rationalization methods with 25% labeled rationales. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory specifications) used for running its experiments, only mentioning the use of BERT as encoder. |
| Software Dependencies | No | The paper mentions using BERT and the AdamW optimizer, but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | For training, we adopt the AdamW optimizer (Loshchilov & Hutter, 2019) with an initial learning rate of 2e-05; we set the batch size to 4, the maximum sequence length to 512, and the number of training epochs to 30. Besides, we set the predefined sparsity level α as {0.1, 0.2, 0.2, 0.08} for Movies, MultiRC, BoolQ, and Evidence Inference, respectively, which is slightly higher than the percentage of rationales in the input text. In the semi-supervised setting, we implement our SSR and other semi-supervised rationalization methods with 25% labeled rationales. In SSR, we set L_unif, L_virt, and λ_diff as 0.1, respectively. |
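
The Experiment Setup row lists concrete hyperparameters (AdamW, learning rate 2e-05, batch size 4, maximum sequence length 512, 30 epochs, and per-dataset sparsity levels α). The sketch below shows one way such a configuration could be wired up with a BERT encoder in PyTorch. The `SSRConfig` dataclass, the `select_rationale` helper, and the top-k sparsity mechanism are illustrative assumptions, not the authors' released implementation (see https://github.com/yuelinan/codes-of-SSR for that).

```python
# Hedged sketch of the quoted experiment setup: hyperparameter values follow the
# paper's reported settings, but the classes, function names, and the top-k
# rationale-selection mechanism are assumptions made for illustration only.
from dataclasses import dataclass, field

import torch
from torch.optim import AdamW
from transformers import BertModel


@dataclass
class SSRConfig:
    encoder_name: str = "bert-base-uncased"  # the paper reports BERT as the encoder
    learning_rate: float = 2e-5              # initial AdamW learning rate
    batch_size: int = 4
    max_seq_length: int = 512
    num_epochs: int = 30
    # Predefined rationale sparsity per dataset (fraction of tokens kept),
    # set slightly above the human-rationale percentage of the input text.
    sparsity: dict = field(default_factory=lambda: {
        "movies": 0.10,
        "multirc": 0.20,
        "boolq": 0.20,
        "evidence_inference": 0.08,
    })


def select_rationale(token_scores: torch.Tensor, alpha: float) -> torch.Tensor:
    """Keep the top-alpha fraction of tokens as a hard {0,1} rationale mask.

    token_scores: (batch, seq_len) relevance scores from a selector module.
    Top-k selection is one common way to enforce a sparsity constraint; the
    paper's exact selection mechanism may differ.
    """
    k = max(1, int(alpha * token_scores.size(1)))
    topk_indices = token_scores.topk(k, dim=1).indices
    mask = torch.zeros_like(token_scores)
    mask.scatter_(1, topk_indices, 1.0)
    return mask


if __name__ == "__main__":
    cfg = SSRConfig()
    encoder = BertModel.from_pretrained(cfg.encoder_name)
    optimizer = AdamW(encoder.parameters(), lr=cfg.learning_rate)

    # Dummy pass showing the sparsity constraint for the Movies setting (alpha = 0.1).
    scores = torch.rand(cfg.batch_size, cfg.max_seq_length)
    rationale_mask = select_rationale(scores, cfg.sparsity["movies"])
    print(rationale_mask.sum(dim=1))  # ~51 of 512 tokens kept per example
```

The per-dataset α values above are copied directly from the quoted setup; the 25% labeled-rationale semi-supervised split would be handled at the data-loading stage and is not shown here.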