Interpretable Debiasing of Vectorized Language Representations with Iterative Orthogonalization

Authors: Prince Osei Aboagye, Yan Zheng, Jack Shunn, Chin-Chia Michael Yeh, Junpeng Wang, Zhongfang Zhuang, Huiyuan Chen, Liang Wang, Wei Zhang, Jeff Phillips

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We first evaluate the effectiveness of ISR in two ways: how well it actually rectifies or orthogonalizes concepts and how well it reduces bias. The WEAT scores are in Table 1, evaluated on the same words used as input to the algorithms.
Researcher Affiliation | Collaboration | Prince Osei Aboagye¹, Yan Zheng², Jack Shunn¹, Chin-Chia Michael Yeh², Junpeng Wang², Zhongfang Zhuang², Huiyuan Chen², Liang Wang², Wei Zhang², Jeff M. Phillips¹; ¹University of Utah, ²Visa Research
Pseudocode | Yes | Algorithm 1: 3-ISR(D, (A, B), (X, Y), (R, S))
Open Source Code | Yes | We provide code at https://github.com/poaboagye/ISR-IterativeSubspaceRectification.
Open Datasets | Yes | Word lists. Our methods and evaluation methods rely on word lists (and their vectorized forms, unless stated otherwise 300-dimensional GloVe on English Wikipedia (Pennington et al., 2014)). The details and word lists are in Appendix F.
Dataset Splits | No | In the following experiments, we perform a 50/50 test/train split on each word list. (No explicit mention of a validation split.)
Hardware Specification | Yes | Hardware specifications are NVIDIA GeForce GTX Titan XP 12GB, AMD Ryzen 7 1700 eight-core processor, and 62.8GB RAM.
Software Dependencies | No | The paper mentions that debiasing models run on a CPU and refers to publicly available codes for baselines, but it does not specify any software names with version numbers for its own implementation or general libraries.
Experiment Setup | Yes | We use 10 iterations of subspace rectification; typically, 2-4 is fine. It learns concepts from each pair using their means µ(A), µ(B), µ(X), and µ(Y) and then the vectors between them v1 = µ(A) - µ(B) and v2 = µ(X) - µ(Y).
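
The setup quoted above (a 50/50 split of each word list, then concept directions built from the list means) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation and it does not perform the rectification itself. The helper names split_half, concept_direction, and cosine, the seed, and the toy 2-dimensional embeddings are all assumptions made for illustration. It only shows how v1 = µ(A) - µ(B) and v2 = µ(X) - µ(Y) might be computed, and how their cosine could serve as a rough check of how close two concepts are to orthogonal.

```python
import numpy as np


def split_half(words, seed=0):
    """Shuffle a word list and split it 50/50 into (train, test).

    The seed and the shuffling scheme are assumptions; the source only
    states that a 50/50 test/train split is used.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(words))
    shuffled = [words[i] for i in order]
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def concept_direction(emb, first, second):
    """Return mu(first) - mu(second) for two word lists.

    `emb` is a hypothetical dict mapping each word to its vector.
    """
    mu_first = np.mean([emb[w] for w in first], axis=0)
    mu_second = np.mean([emb[w] for w in second], axis=0)
    return mu_first - mu_second


def cosine(u, v):
    """Cosine similarity; a value near 0 means near-orthogonal directions."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


# Toy embeddings (made up), standing in for 300-dimensional GloVe vectors.
toy_emb = {
    "a1": np.array([1.0, 0.2]), "a2": np.array([0.8, 0.1]),
    "b1": np.array([-1.0, 0.0]), "b2": np.array([-0.9, 0.3]),
    "x1": np.array([0.3, 1.0]), "x2": np.array([0.1, 0.9]),
    "y1": np.array([0.2, -1.0]), "y2": np.array([0.0, -0.8]),
}

v1 = concept_direction(toy_emb, ["a1", "a2"], ["b1", "b2"])
v2 = concept_direction(toy_emb, ["x1", "x2"], ["y1", "y2"])
print("cosine(v1, v2) =", cosine(v1, v2))
```

In an actual run, the directions would presumably be learned on the train half of each list and the orthogonality and bias measures reported on the held-out half; the sketch leaves that wiring to the reader.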