Refining Language Models with Compositional Explanations
Authors: Huihan Yao, Ying Chen, Qinyuan Ye, Xisen Jin, Xiang Ren
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of the proposed approach on two text classification tasks by showing improved performance in the target domain as well as improved model fairness after refinement. We validate our approach on three pairs of datasets in hate speech classification and sentiment analysis. Compared with direct transfer (evaluating the source model on target-domain data) and other baselines (distillation and weight regularization), we observe notable performance improvements after refining the model with our proposed framework. |
| Researcher Affiliation | Academia | Huihan Yao (Peking University), Ying Chen (Tsinghua University), Qinyuan Ye, Xisen Jin, Xiang Ren (University of Southern California); yaohuihan@pku.edu.cn, chenying17@mails.tsinghua.edu.cn, {qinyuany,xisenjin,xiangren}@usc.edu |
| Pseudocode | No | The paper describes processes and modules but does not include structured pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Code and data are available at https://github.com/INK-USC/expl-refinement. |
| Open Datasets | Yes | For hate speech detection, we use Stormfront [2] and HatEval [1] as upstream datasets, and the Gab Hate Corpus (GHC) [20] as the downstream dataset. For sentiment analysis, we first train on Amazon Music [16], and apply the model to the Stanford Sentiment Treebank-2 (SST-2) dataset [40]. |
| Dataset Splits | No | The paper mentions tuning parameters 'based on the performance on dev sets' but does not explicitly provide specific percentages or sample counts for training/validation/test dataset splits in the main text. |
| Hardware Specification | No | The paper states that 'the total amount of compute and the type of resources used' is reported in Appendix A, but no hardware specification is given in the main body of the paper. |
| Software Dependencies | No | The paper mentions using 'BERT-Large', 'RoBERTa-Base', 'BERT-Base', 'BiLSTM+Attention', 'SOC', and the 'Adam optimizer'. Appendix A also states that 'Our approach is implemented in PyTorch' and 'We implemented our framework using the Transformers library'. However, specific version numbers for these software dependencies are not provided in the main text or in Appendix A. |
| Experiment Setup | No | The paper states that 'Detailed experiment settings about hyper-parameters are included in Appendix A' but does not provide these specific details in the main text. |