Unified Detoxifying and Debiasing in Language Generation via Inference-time Adaptive Optimization

Authors: Zonghan Yang, Xiaoyuan Yi, Peng Li, Yang Liu, Xing Xie

ICLR 2023

Reproducibility assessment. Each entry gives the variable, the result, and the LLM's supporting response:
Research Type: Experimental. Evidence: "Experimental results demonstrate that compared to several strong baselines, UDDIA achieves debiasing and detoxifying simultaneously and better balances efficiency and effectiveness, taking a further step towards practical ethical NLG."
Researcher Affiliation: Collaboration. Evidence: Dept. of Comp. Sci. & Tech., Institute for AI, Tsinghua University, Beijing, China; Microsoft Research Asia.
Pseudocode: Yes. Evidence: Appendix B ("Algorithm Pseudo-code"), Algorithm 2: "Our UDDIA framework during generation process".
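
Algorithm 2 interleaves a small parameter update with each decoding step. Below is a minimal sketch of that pattern, assuming (per the paper's bias-tuning setup) that only the bias terms in the upper half of the layers are updated; `attribute_loss` is a hypothetical stand-in for the paper's detoxifying/debiasing objective, and none of this is the authors' released code:

```python
# Minimal sketch of inference-time adaptive optimization during decoding.
# Assumptions: only bias terms in GPT-2's upper layers are tuned, and
# `attribute_loss` is a hypothetical placeholder for the real objective.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

# Freeze everything, then unfreeze bias terms in the top half of the
# layers (the paper tunes layers above T0 = L/2).
for p in model.parameters():
    p.requires_grad_(False)
tuned = []
for i, block in enumerate(model.transformer.h):
    if i >= len(model.transformer.h) // 2:
        for name, p in block.named_parameters():
            if name.endswith("bias"):
                p.requires_grad_(True)
                tuned.append(p)

opt = torch.optim.AdamW(tuned, lr=3e-3)

def attribute_loss(logits):
    # Placeholder: the real loss scores toxicity/bias of the next-token
    # distribution; here we just penalize over-confident predictions.
    return logits.softmax(-1).max()

ids = tok("The scientist said that", return_tensors="pt").input_ids
for _ in range(30):  # maximum sequence length of 30 tokens
    loss = attribute_loss(model(ids).logits[:, -1, :])
    opt.zero_grad()
    loss.backward()
    opt.step()  # adapt the bias terms before sampling this token
    with torch.no_grad():
        probs = (model(ids).logits[:, -1, :] / 0.7).softmax(-1)  # temperature
        ids = torch.cat([ids, torch.multinomial(probs, 1)], dim=-1)
print(tok.decode(ids[0]))
```
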
Open Source Code: No. The paper states "Our implementation is based on the code base of DExperts with Apache-2.0 license" but provides no link to, or release statement for, the authors' own modified UDDIA code.
Open Datasets: Yes. Evidence: "Data. Following (Sheng et al., 2019; 2020; Liang et al., 2021), we take the prompts mentioning specified demographic groups as input and evaluate the biases of the generated texts. ... we use two sets of prompts, simple set ... and diverse set ... For example, in our experiments, we follow the settings used in previous works to use contexts from Real Toxicity Prompts and set generation length = 20 for detoxifying experiments. ... We use the prompts in BOLD (Dhamala et al., 2021)."
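
For reference, both prompt sources are mirrored on the Hugging Face Hub; a loading sketch, with the caveat that the hub IDs below are assumed community mirrors, not paths given in the paper:

```python
# Sketch of loading the two evaluation prompt sources from the Hugging
# Face Hub. The dataset IDs are assumed hub mirrors, not paper citations.
from datasets import load_dataset

rtp = load_dataset("allenai/real-toxicity-prompts", split="train")
print(rtp[0]["prompt"]["text"])   # prompt text with per-prompt toxicity scores

bold = load_dataset("AlexaAI/bold", split="train")
print(bold[0]["prompts"][0])      # prompts grouped by demographic domain
```
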
Dataset Splits: No. The paper does not provide training/validation/test splits: its approach is inference-time adaptive optimization on pre-trained models, and the datasets are used primarily for evaluation or hyperparameter tuning.
Hardware Specification: Yes. Evidence: "Efficiency Metrics. To verify the effectiveness of our model, we compare generation speed (seconds per 100 tokens) and GPU memory usage of different methods on one single Tesla P100 GPU." ... "All experiments are conducted on a single NVIDIA 3090 GPU."
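
Both reported efficiency metrics (seconds per 100 tokens and GPU memory usage) can be reproduced with standard PyTorch utilities; a minimal sketch, where `generate_100_tokens` is a hypothetical stand-in for the decoding routine being profiled:

```python
# Sketch of the two efficiency metrics: generation speed (seconds per
# 100 tokens) and peak GPU memory usage on a single GPU.
import time
import torch

def profile(generate_100_tokens):
    torch.cuda.reset_peak_memory_stats()
    torch.cuda.synchronize()      # exclude queued work from the timing
    start = time.perf_counter()
    generate_100_tokens()         # hypothetical decoding routine
    torch.cuda.synchronize()      # wait for the GPU to finish
    seconds_per_100 = time.perf_counter() - start
    peak_gib = torch.cuda.max_memory_allocated() / 1024**3
    return seconds_per_100, peak_gib
```
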
Software Dependencies: No. The paper mentions a Huggingface-based implementation but does not provide version numbers for any software libraries or dependencies, such as PyTorch, TensorFlow, or scikit-learn.
Experiment Setup: Yes. Evidence: "We use AdamW ... with learning rate 3e-3 (for gender) and 3.5e-3 (for race) for optimization. τ in Sec. 3.3 is 0.12 for gender and 0.015 for race, and batch size is 1. ... k = 40, p = 0.9, temperature = 0.7, and the maximum sequence length is 30 tokens. ... The learning rate is set as 6e-2. ... We set the batch size as 1. ... We set T0 = L/2 = 18 and T = 3 in the redo mechanism."
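
The sampling hyperparameters map directly onto standard Hugging Face decoding arguments; a sketch of that configuration, assuming a GPT-2 Large backbone and omitting UDDIA's per-step adaptive update and redo mechanism:

```python
# Sketch of the reported decoding configuration: k = 40, p = 0.9,
# temperature = 0.7, and 30-token generations. The gpt2-large backbone
# is an assumption; UDDIA's per-step optimization is not shown here.
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2-large")
model = AutoModelForCausalLM.from_pretrained("gpt2-large")

inputs = tok("The scientist said that", return_tensors="pt")
out = model.generate(
    **inputs,
    do_sample=True,
    top_k=40,
    top_p=0.9,
    temperature=0.7,
    max_new_tokens=30,
    pad_token_id=tok.eos_token_id,  # silence the missing-pad warning
)
print(tok.decode(out[0], skip_special_tokens=True))
```
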