Learning Robust Rationales for Model Explainability: A Guidance-Based Approach
Authors: Shuaibo Hu, Kui Yu
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on two synthetic settings prove that our method is robust to the rationalization degeneration and failure problems, while the results on two real datasets show its effectiveness in providing rationales in line with human judgments. |
| Researcher Affiliation | Academia | Shuaibo Hu, Kui Yu*, School of Computer and Information, Hefei University of Technology; shuaibohu@mail.hfut.edu.cn, yukui@hfut.edu.cn |
| Pseudocode | No | The paper describes the proposed method using textual descriptions and mathematical equations, but it does not include pseudocode or an algorithm block. |
| Open Source Code | Yes | The source code is available at https://github.com/shuaibo919/g-rat. |
| Open Datasets | Yes | Following the work of Huang et al. (2021) and Liu et al. (2022), we consider two widely used datasets for selective rationalization. 1) Beer Advocate (McAuley, Leskovec, and Jurafsky 2012) contains more than 220,000 beer reviews... 2) Hotel Review (Wang, Lu, and Zhai 2010) is another multi-aspect dataset similar to Beer Advocate. |
| Dataset Splits | No | The paper mentions following settings from previous works and that 'The Appendix has pre-processing settings details', but it does not explicitly provide specific dataset split information (percentages, sample counts, or direct links to splits) in the provided text. |
| Hardware Specification | Yes | Experiments are all conducted on a single Tesla A100 GPU. |
| Software Dependencies | No | The paper mentions using GloVe for embeddings, a GRU as the encoder, and Adam as the optimizer. However, it does not provide specific version numbers for any libraries or software dependencies such as Python or PyTorch. (A hedged sketch of such a stack follows the table.) |
| Experiment Setup | Yes | More detailed settings on training and hyperparameters can be found in Appendix. In the previous setup, we set λguide = 5.0 and λmatch = 1.0. This is an empirical choice because these two regularizers can have a similar scale as the task loss Ltask. ... we predefined the sparsity with {15%, 10%, 10%} respectively. ... We linearly decay this coefficient τ until it reaches 0 in training. |
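
The Experiment Setup row quotes λguide = 5.0, λmatch = 1.0, per-aspect sparsity targets of {15%, 10%, 10%}, and a coefficient τ that is linearly decayed to 0 during training. Below is a minimal sketch of how such a configuration and decay schedule could be wired up; the field names, initial τ value, and total-step count are assumptions for illustration, while only the quoted numbers come from the paper.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class TrainingConfig:
    lambda_guide: float = 5.0    # weight of the guidance regularizer (value quoted from the paper)
    lambda_match: float = 1.0    # weight of the matching regularizer (value quoted from the paper)
    # Predefined per-aspect sparsity targets quoted from the paper
    sparsity_targets: List[float] = field(default_factory=lambda: [0.15, 0.10, 0.10])
    tau_init: float = 1.0        # initial tau (assumed; not stated in the excerpt)
    total_steps: int = 10_000    # assumed training length for the decay schedule

def tau_at(step: int, cfg: TrainingConfig) -> float:
    """Linearly decay tau from tau_init to 0 over total_steps, then hold at 0."""
    frac = min(step / cfg.total_steps, 1.0)
    return cfg.tau_init * (1.0 - frac)

cfg = TrainingConfig()
assert abs(tau_at(cfg.total_steps // 2, cfg) - 0.5 * cfg.tau_init) < 1e-9
```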
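The Software Dependencies row names GloVe embeddings, a GRU encoder, and the Adam optimizer, but no library versions. A minimal PyTorch sketch of such a stack is shown below; module names, hidden size, and learning rate are illustrative assumptions rather than values reported by the paper.

```python
import torch
import torch.nn as nn

class GRUSelector(nn.Module):
    """Illustrative selector stack: frozen GloVe embeddings -> bidirectional GRU -> per-token logits."""

    def __init__(self, glove_weights: torch.Tensor, hidden_size: int = 200):
        super().__init__()
        # glove_weights: (vocab_size, embed_dim) matrix of pre-trained GloVe vectors
        self.embedding = nn.Embedding.from_pretrained(glove_weights, freeze=True)
        self.encoder = nn.GRU(glove_weights.size(1), hidden_size,
                              batch_first=True, bidirectional=True)
        self.scorer = nn.Linear(2 * hidden_size, 1)  # token-level selection logits

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        hidden, _ = self.encoder(self.embedding(token_ids))
        return self.scorer(hidden).squeeze(-1)

# Usage sketch (glove_weights must be loaded separately; lr is an assumed placeholder):
# model = GRUSelector(glove_weights)
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```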