UniBias: Unveiling and Mitigating LLM Bias through Internal Attention and FFN Manipulation

Authors: Hanzhang Zhou, Zijian Feng, Zixiao Zhu, Junlang Qian, Kezhi Mao

NeurIPS 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Extensive experiments across 12 NLP datasets demonstrate that UniBias significantly enhances ICL performance and alleviates prompt brittleness of LLMs. |
| Researcher Affiliation | Academia | Hanzhang Zhou (1,2), Zijian Feng (1,2), Zixiao Zhu (1,2), Junlang Qian (1), Kezhi Mao (1,2); (1) Nanyang Technological University; (2) Singapore-ETH Centre; {hanzhang001, feng0119, zixiao001, junlang001}@e.ntu.edu.sg; ekzmao@ntu.edu.sg |
| Pseudocode | No | The paper does not contain a structured pseudocode or algorithm block. |
| Open Source Code | Yes | The code is available at https://github.com/hzzhou01/UniBias. |
| Open Datasets | Yes | We evaluate our UniBias method on 12 diverse natural language processing datasets across various tasks, including sentiment analysis, topic classification, natural language inference, reasoning, and word disambiguation. Statistics and details about the datasets can be found in Table 4 in the Appendix, which lists: SST2 [Socher et al., 2013], MNLI [Williams et al., 2018], WiC [Pilehvar and Camacho-Collados, 2019], COPA [Roemmele et al., 2011], CR [Hu and Liu, 2004], AGNews [Zhang et al., 2015], MR [Pang and Lee, 2005], RTE [Dagan et al., 2005], SST-5 [Socher et al., 2013], TREC [Voorhees and Tice, 2000], ARC-Challenge [Clark et al., 2018], MMLU [Hendrycks et al., 2020]. |
| Dataset Splits | Yes | In our experiments, we utilize k (where k = 0, 1, 2, 4) training samples per class as prompt examples for k-shot ICL. For testing, we randomly select 2000 samples for MMLU and 3000 samples for MNLI and MR, while employing the original test sets for the other datasets. Specifically, we utilize a small subset of training data as a support set, with 20 samples for each class. |
| Hardware Specification | Yes | Additionally, we conduct the experiment on four A5000 GPUs. |
| Software Dependencies | No | The paper mentions models like Llama-2 and GPT-J but does not provide specific version numbers for software dependencies (e.g., Python, deep learning frameworks, or other libraries). |
| Experiment Setup | Yes | For all experiments, unless stated otherwise, we use the 1-shot ICL setting, i.e., one example per class, and repeat five times under different random seeds. We use k = 20 samples per class as the support set to obtain all threshold values by grid search, as mentioned in the method section. The prompt template and further implementation details are specified in Appendix A. |
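The reported protocol (k-shot ICL with k examples per class, repeated over five random seeds) can be sketched as below. This is a minimal illustration, not the paper's code: the generic "Input/Label" template, the function name, and the toy data are all assumptions, since the actual prompt templates are specified in the paper's Appendix A.

```python
import random

def build_kshot_prompt(train_by_class, query, k=1, seed=0):
    """Sample k demonstration examples per class and prepend them to the query.

    train_by_class: dict mapping label -> list of example texts (hypothetical format).
    Returns a single prompt string using a generic "Input/Label" template,
    which is an assumption; the paper's real templates are in its Appendix A.
    """
    rng = random.Random(seed)
    demos = []
    for label, examples in sorted(train_by_class.items()):
        # One pass per class keeps the prompt balanced across labels.
        for text in rng.sample(examples, k):
            demos.append(f"Input: {text}\nLabel: {label}")
    # The query is appended last, with its label left for the model to fill in.
    demos.append(f"Input: {query}\nLabel:")
    return "\n\n".join(demos)

# Repeat over five seeds, mirroring the paper's five-run protocol (toy data).
train = {"positive": ["great movie", "loved it"],
         "negative": ["terrible", "boring plot"]}
prompts = [build_kshot_prompt(train, "a fine film", k=1, seed=s) for s in range(5)]
```

Each seed yields a different sampling of demonstrations, which is what makes the reported variance across five runs meaningful.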