Position Debiasing Fine-Tuning for Causal Perception in Long-Term Dialogue
Authors: Shixuan Fan, Wei Wei, Wendi Li, Xian-Ling Mao, Wenfeng Xie, Dangyang Chen
IJCAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on two datasets prove that our proposed method can effectively alleviate the position bias for multiple LLMs and achieve significant progress compared with existing baselines. |
| Researcher Affiliation | Collaboration | (1) Cognitive Computing and Intelligent Information Processing (CCIIP) Laboratory, School of Computer Science and Technology, Huazhong University of Science and Technology; (2) Joint Laboratory of HUST and Pingan Property & Casualty Research (HPL); (3) Department of Computer Science and Technology, Beijing Institute of Technology; (4) Ping An Property & Casualty Insurance Company of China, Ltd. Emails: {fanshixuan, weiw, wendili}@hust.edu.cn, maoxl@bit.edu.cn, julian wind@163.com, chendangyang273@pingan.com.cn |
| Pseudocode | No | The paper describes its methods using prose and mathematical equations but does not include formal pseudocode blocks or algorithms. |
| Open Source Code | No | The paper does not provide any explicit statement or link for open-source code for the described methodology. |
| Open Datasets | Yes | To evaluate the effectiveness of our proposed method, following previous works [Wang et al., 2023; Feng et al., 2023], we conduct experiments on two widely used benchmark datasets, ESConv [Liu et al., 2021] and MSC [Xu et al., 2022a], for long-term dialogue. We use the same data preprocessing and train/valid/test splitting strategy as in [Feng et al., 2023]. |
| Dataset Splits | Yes | We use the same data preprocessing and train/valid/test splitting strategy as in [Feng et al., 2023]. |
| Hardware Specification | No | The paper does not specify any hardware details such as GPU models, CPU models, or memory specifications used for the experiments. It only mentions using Llama2-7B-chat and Qwen-14B-chat, which are models, not hardware. |
| Software Dependencies | No | The paper mentions using 'Adam optimizer', 'lora', and 'bitsandbytes' but does not provide specific version numbers for these software dependencies (e.g., PyTorch 1.x, bitsandbytes 0.x.x). |
| Experiment Setup | Yes | Throughout the experiments, we use Adam optimizer [Kingma and Ba, 2015] with 3e-4 initial learning rate and the 128 batch size. All methods are trained for up to 12 epochs. To improve experimental efficiency, we use lora [Hu et al., 2021] with rank 32 to fine-tune large language models. Both training and inference use 4-bit weight quantization by bitsandbytes [Dettmers et al., 2022]. (A hedged configuration sketch of these settings follows the table.) |
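
The experiment-setup row reports concrete hyperparameters: Adam optimizer, 3e-4 initial learning rate, batch size 128, up to 12 epochs, LoRA rank 32, and 4-bit weight quantization via bitsandbytes. The sketch below shows one plausible way to wire those reported values together using Hugging Face Transformers and PEFT. The library choice itself, the backbone checkpoint name, the LoRA alpha/dropout/target modules, the 4-bit quantization subtype, and the split of the effective batch size into per-device batch and gradient accumulation are all assumptions not stated in the paper.

```python
# Hypothetical reconstruction of the reported fine-tuning setup.
# Reported values: Adam, lr 3e-4, batch size 128, <=12 epochs, LoRA rank 32,
# 4-bit quantization via bitsandbytes. Everything else below is assumed.
import torch
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          BitsAndBytesConfig, TrainingArguments)
from peft import LoraConfig, get_peft_model

model_name = "meta-llama/Llama-2-7b-chat-hf"  # one of the two backbones used; exact checkpoint assumed

# 4-bit weight quantization via bitsandbytes, as stated in the paper.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # quantization subtype not reported; assumed
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute dtype not reported; assumed
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA with rank 32, as reported; alpha, dropout, and target modules are assumptions.
lora_config = LoraConfig(
    r=32,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Reported optimization settings: Adam, 3e-4 initial LR, batch size 128, up to 12 epochs.
training_args = TrainingArguments(
    output_dir="./pos-debias-ft",
    learning_rate=3e-4,
    num_train_epochs=12,
    per_device_train_batch_size=8,    # 8 x 16 accumulation = effective batch 128 (split assumed)
    gradient_accumulation_steps=16,
    optim="adamw_torch",              # paper says Adam; the AdamW variant here is an assumption
    logging_steps=50,
)
# A Trainer and the ESConv/MSC training data would still need to be attached;
# that wiring is omitted because the paper does not describe it.
```

How the effective batch size of 128 is realized (per-device batch vs. gradient accumulation) would depend on the hardware, which the paper does not report, so the split above is purely illustrative.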