Towards Poisoning Fair Representations
Authors: Tianci Liu, Haoyu Wang, Feijie Wu, Hengtong Zhang, Pan Li, Lu Su, Jing Gao
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on benchmark fairness datasets and state-of-the-art fair representation learning models demonstrate the superiority of our attack. Extensive experimental results in Section 4 demonstrate the high effectiveness of our attacks on four representative FRL methods using as few as 5% of training data for poisoning. |
| Researcher Affiliation | Collaboration | Tianci Liu¹, Haoyu Wang¹, Feijie Wu¹, Hengtong Zhang², Pan Li³, Lu Su¹, Jing Gao¹ (¹Purdue University, ²Tencent AI Lab, ³Georgia Institute of Technology). Emails: ¹{liu3351,wang5346,wu1977,lusu,jinggao}@purdue.edu, ²htzhang.work@gmail.com, ³panli@gatech.edu |
| Pseudocode | Yes | Algorithm 1 Craft Poisoning Samples with ENG Attack |
| Open Source Code | No | The paper does not provide a specific link or an explicit statement about the availability of the source code for its methodology. |
| Open Datasets | Yes | We train victims on two benchmark datasets from the UCI repository that are extensively studied in fair machine learning, which are pre-processed following Zhao et al. (2019); Reddy et al. (2021). Adult (Kohavi, 1996) contains 48,842 samples of US census data with 112 features... German (Dua & Graff, 2017) consists of 1,000 samples of personal financial data with 62 features... |
| Dataset Splits | No | The paper mentions leaving 20% of samples as D_ta (target data) for evaluation, but it does not specify explicit train/validation/test splits, in particular whether a separate validation set was used during model training or hyperparameter tuning. (A sketch of this hold-out follows the table.) |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU models, CPU specifications, or cloud computing instance types. |
| Software Dependencies | No | The paper mentions optimizers (e.g., AdaDelta, Adam) and activation functions (e.g., ReLU, Tanh) but does not provide specific version numbers for software libraries or dependencies used in the experiments. |
| Experiment Setup | Yes | Encoder: linear, representation z ∈ R^60. Discriminators: one hidden layer with width 50, using ReLU activation. Classifier: linear. Training: AdaDelta optimizer with learning rate 0.1, batch size 512, 50 epochs. During training, we followed gradient matching (Geiping et al., 2020) and did not shuffle training data after each epoch. For better comparison, victims were always initialized with random seed 1 to remove randomness during the pre-training procedure. In different replications, we selected different poisoning samples with different random seeds. Experiments that consist of 5 replications used seeds 1 to 5, respectively. |
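
For concreteness, the 20% target hold-out noted under Dataset Splits could be reproduced as below. This is a minimal sketch only: `load_adult` is a hypothetical placeholder for the preprocessed Adult data, and the paper does not state which splitting tool or seed produced the hold-out.

```python
# Minimal sketch of the 20% target hold-out (D_ta) described above.
# `load_adult` is a hypothetical loader for the preprocessed Adult data
# (112 features, binary label, binary sensitive attribute).
from sklearn.model_selection import train_test_split

X, y, s = load_adult()  # features, labels, sensitive attribute (assumed loader)

# Hold out 20% of samples as the target set D_ta; the rest trains the victim.
X_tr, X_ta, y_tr, y_ta, s_tr, s_ta = train_test_split(
    X, y, s, test_size=0.2, random_state=1  # seed choice is an assumption
)
```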
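The Experiment Setup row translates into a small PyTorch training loop. The sketch below assumes a generic adversarial FRL objective (a classifier predicts the label, a discriminator predicts the sensitive attribute from the representation); it mirrors the quoted sizes and hyperparameters but is not the exact objective of any of the four victim methods, and the synthetic tensors stand in for the preprocessed training split.

```python
# Sketch of the quoted victim setup: linear encoder to z in R^60,
# one-hidden-layer (width 50, ReLU) discriminator, linear classifier,
# AdaDelta with lr 0.1, batch size 512, 50 epochs, no reshuffling.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(1)  # victims were always initialized with seed 1

IN_DIM, Z_DIM = 112, 60  # Adult has 112 features; representation z in R^60

encoder = nn.Linear(IN_DIM, Z_DIM)
discriminator = nn.Sequential(nn.Linear(Z_DIM, 50), nn.ReLU(), nn.Linear(50, 1))
classifier = nn.Linear(Z_DIM, 1)

opt_adv = torch.optim.Adadelta(discriminator.parameters(), lr=0.1)
opt_enc = torch.optim.Adadelta(
    list(encoder.parameters()) + list(classifier.parameters()), lr=0.1)

# Synthetic stand-in for the preprocessed training split (X_tr, y_tr, s_tr).
N = 2048
X_tr = torch.randn(N, IN_DIM)
y_tr = torch.randint(0, 2, (N,)).float()  # task label
s_tr = torch.randint(0, 2, (N,)).float()  # sensitive attribute

# shuffle=False: training data is not reshuffled after each epoch,
# matching the gradient-matching protocol cited in the table.
loader = DataLoader(TensorDataset(X_tr, y_tr, s_tr), batch_size=512, shuffle=False)

bce = nn.BCEWithLogitsLoss()
for _ in range(50):  # 50 epochs
    for x, y, s in loader:
        z = encoder(x)
        # (1) adversary learns to predict s from a detached representation
        adv_loss = bce(discriminator(z.detach()).squeeze(1), s)
        opt_adv.zero_grad(); adv_loss.backward(); opt_adv.step()
        # (2) encoder + classifier predict y while fooling the adversary
        enc_loss = (bce(classifier(z).squeeze(1), y)
                    - bce(discriminator(z).squeeze(1), s))
        opt_enc.zero_grad(); enc_loss.backward(); opt_enc.step()
```

Alternating updates are used here in place of a gradient-reversal layer; either realizes the same min-max objective.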