Refining Language Models with Compositional Explanations

Authors: Huihan Yao, Ying Chen, Qinyuan Ye, Xisen Jin, Xiang Ren

NeurIPS 2021

Reproducibility variables, results, and LLM responses:
Research Type: Experimental — We demonstrate the effectiveness of the proposed approach on two text classification tasks by showing improved performance in the target domain as well as improved model fairness after refinement. We validate our approach on three pairs of datasets in hate speech classification and sentiment analysis. Compared with direct transfer (evaluating the source model on target-domain data) and other baselines (distillation and weight regularization), we observe notable performance improvements after refining the model with our proposed framework.
Researcher Affiliation: Academia — Huihan Yao (Peking University), Ying Chen (Tsinghua University), Qinyuan Ye, Xisen Jin, and Xiang Ren (University of Southern California). Contact: yaohuihan@pku.edu.cn, chenying17@mails.tsinghua.edu.cn, {qinyuany,xisenjin,xiangren}@usc.edu
Pseudocode: No — The paper describes processes and modules but does not include any structured pseudocode or clearly labeled algorithm blocks.
Open Source Code: Yes — Code and data are available at https://github.com/INK-USC/expl-refinement.
Open Datasets: Yes — For hate speech detection, we use Stormfront [2] and HatEval [1] as upstream datasets, and the Gab Hate Corpus (GHC) [20] as the downstream dataset. For sentiment analysis, we first train on Amazon Music [16], and then apply the model to the Stanford Sentiment Treebank-2 (SST-2) dataset [40].
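The upstream-to-downstream transfer setup described in that response can be summarized as a small mapping. This is a sketch for reference only: the pairings are taken from the response above, and the structure itself is not an artifact of the paper or its released code.

```python
# Upstream (source) -> downstream (target) dataset pairs used for refinement,
# as listed in the reproducibility response above.
TRANSFER_PAIRS = {
    "hate_speech": [
        ("Stormfront", "Gab Hate Corpus (GHC)"),
        ("HatEval", "Gab Hate Corpus (GHC)"),
    ],
    "sentiment": [
        ("Amazon Music", "SST-2"),
    ],
}

# Three upstream->downstream pairs in total, matching the paper's
# "three pairs of datasets" claim.
n_pairs = sum(len(pairs) for pairs in TRANSFER_PAIRS.values())
print(n_pairs)  # 3
```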
Dataset Splits: No — The paper mentions tuning parameters "based on the performance on dev sets" but does not give specific percentages or sample counts for the train/validation/test splits in the main text.
Hardware Specification: No — The paper states that "the total amount of compute and the type of resources used" is reported in Appendix A, but this information is not provided in the main body of the paper.
Software Dependencies: No — The paper mentions using BERT-Large, RoBERTa-Base, BERT-Base, BiLSTM+Attention, SOC, and the Adam optimizer. It also states in Appendix A that "Our approach is implemented in PyTorch" and "We implemented our framework using the Transformers library". However, specific version numbers for these software dependencies are not provided in the main text or in the cited sections of Appendix A.
Experiment Setup: No — The paper states "Detailed experiment settings about hyper-parameters are included in Appendix A." but does not provide these specific details in the main text.