Interpretable Debiasing of Vectorized Language Representations with Iterative Orthogonalization
Authors: Prince Osei Aboagye, Yan Zheng, Jack Shunn, Chin-Chia Michael Yeh, Junpeng Wang, Zhongfang Zhuang, Huiyuan Chen, Liang Wang, Wei Zhang, Jeff Phillips
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We first evaluate the effectiveness of ISR in two ways: how well it actually rectifies or orthogonalizes concepts and how well it reduces bias. The WEAT scores are in Table 1, evaluated on the same words used as input to the algorithms. |
| Researcher Affiliation | Collaboration | Prince Osei Aboagye1, Yan Zheng2, Jack Shunn1, Chin-Chia Michael Yeh2, Junpeng Wang2, Zhongfang Zhuang2, Huiyuan Chen2, Liang Wang2, Wei Zhang2, Jeff M. Phillips1 1University of Utah, 2Visa Research |
| Pseudocode | Yes | Algorithm 1: 3-ISR(D, (A, B), (X, Y), (R, S)) |
| Open Source Code | Yes | We provide code at https://github.com/poaboagye/ISR (Iterative Subspace Rectification). |
| Open Datasets | Yes | Word lists. Our methods and evaluation methods rely on word lists (and their vectorized forms; unless stated otherwise, 300-dimensional GloVe trained on English Wikipedia (Pennington et al., 2014)). The details and word lists are in Appendix F. |
| Dataset Splits | No | In the following experiments, we perform a 50/50 test/train split on each word list. (No explicit mention of validation split) |
| Hardware Specification | Yes | Hardware specifications are NVIDIA GeForce GTX Titan XP 12GB, AMD Ryzen 7 1700 eight-core processor, and 62.8GB RAM. |
| Software Dependencies | No | The paper mentions that debiasing models run on a CPU and refers to publicly available codes for baselines, but it does not specify any software names with version numbers for its own implementation or general libraries. |
| Experiment Setup | Yes | We use 10 iterations of subspace rectification; typically, 2-4 is fine. It learns concepts from each pair using their means µ(A), µ(B), µ(X), and µ(Y), and then the vectors between them v1 = µ(A) − µ(B) and v2 = µ(X) − µ(Y). |
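The difference-of-means concept directions quoted in the Experiment Setup row can be sketched in NumPy. The group means and vectors v1, v2 follow the quoted formulas; the final projection step shown here is a plain Gram–Schmidt orthogonalization for illustration only, not the paper's iterative subspace rectification (Algorithm 1).

```python
import numpy as np

def concept_directions(A, B, X, Y):
    """Difference-of-means directions v1 = mu(A) - mu(B), v2 = mu(X) - mu(Y).

    A, B, X, Y are (n_words, dim) arrays of word vectors, one row per word
    in the corresponding word list (e.g. male/female, pleasant/unpleasant).
    """
    v1 = A.mean(axis=0) - B.mean(axis=0)
    v2 = X.mean(axis=0) - Y.mean(axis=0)
    return v1, v2

def gram_schmidt_step(v1, v2):
    # Illustrative only: remove the v1 component from v2 so the two
    # concept directions become exactly orthogonal after one projection.
    # The paper instead iteratively rectifies the two concept subspaces.
    return v2 - (v1 @ v2) / (v1 @ v1) * v1
```

After `gram_schmidt_step`, the returned direction has zero dot product with v1, which is the orthogonality property the WEAT evaluation in Table 1 measures progress toward.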
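The WEAT scores referenced in the Research Type row use the standard effect-size formula of Caliskan et al. (2017). A minimal NumPy version, assuming target sets X, Y and attribute sets A, B are given as lists of word vectors (the variable names mirror the word-list notation above):

```python
import numpy as np

def cos(u, v):
    # Cosine similarity between two word vectors.
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def assoc(w, A, B):
    # s(w, A, B): mean similarity to attribute set A minus to set B.
    return np.mean([cos(w, a) for a in A]) - np.mean([cos(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    # Effect size d: difference of mean associations of the two target
    # sets, normalized by the sample std of associations over X union Y.
    sx = [assoc(x, A, B) for x in X]
    sy = [assoc(y, A, B) for y in Y]
    return (np.mean(sx) - np.mean(sy)) / np.std(sx + sy, ddof=1)
```

A score near 0 indicates the target sets are equally associated with both attribute sets, which is the debiasing goal; the report's Table 1 evaluates this on the same words fed to the algorithms.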