Towards Poisoning Fair Representations
Authors: Tianci Liu, Haoyu Wang, Feijie Wu, Hengtong Zhang, Pan Li, Lu Su, Jing Gao
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on benchmark fairness datasets and state-of-the-art fair representation learning models demonstrate the superiority of our attack. Extensive experimental results in Section 4 demonstrate the high effectiveness of our attacks on four representative FRL methods using as few as 5% of training data for poisoning. |
| Researcher Affiliation | Collaboration | Tianci Liu¹, Haoyu Wang¹, Feijie Wu¹, Hengtong Zhang², Pan Li³, Lu Su¹, Jing Gao¹ (¹Purdue University, ²Tencent AI Lab, ³Georgia Institute of Technology). Emails: ¹{liu3351,wang5346,wu1977,lusu,jinggao}@purdue.edu, ²htzhang.work@gmail.com, ³panli@gatech.edu |
| Pseudocode | Yes | Algorithm 1 Craft Poisoning Samples with ENG Attack |
| Open Source Code | No | The paper does not provide a specific link or an explicit statement about the availability of the source code for its methodology. |
| Open Datasets | Yes | We train victims on two benchmark datasets from the UCI repository that are extensively studied in fair machine learning, which are pre-processed following Zhao et al. (2019); Reddy et al. (2021). Adult (Kohavi, 1996) contains 48,842 samples of US census data with 112 features... German (Dua & Graff, 2017) consists of 1,000 samples of personal financial data with 62 features... |
| Dataset Splits | No | The paper mentions leaving 20% of samples as D_ta (target data) for evaluation, but it does not specify explicit train/validation/test splits, in particular whether a separate validation set was used during model training or hyperparameter tuning. (A sketch of this hold-out follows the table.) |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU models, CPU specifications, or cloud computing instance types. |
| Software Dependencies | No | The paper mentions optimizers (e.g., AdaDelta, Adam) and activation functions (e.g., ReLU, Tanh) but does not provide specific version numbers for software libraries or dependencies used in the experiments. |
| Experiment Setup | Yes | Encoder: linear, representation z ∈ R^60. Discriminators: one hidden layer with width 50, using ReLU activation. Classifier: linear. Training: AdaDelta optimizer with learning rate 0.1, batch size 512, 50 epochs. During training, we followed gradient matching (Geiping et al., 2020) and did not shuffle training data after each epoch. For better comparison, victims were always initialized with random seed 1 to remove randomness during the pre-training procedure. In different replications, we selected different poisoning samples with different random seeds. Experiments that consist of 5 replications used seeds 1 to 5, respectively. |
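
For concreteness, the 20% target hold-out noted under Dataset Splits could be reproduced as below. This is a minimal sketch only: `load_adult` is a hypothetical placeholder for the preprocessed Adult data, and the paper does not state which splitting tool or seed produced the hold-out.

```python
# Minimal sketch of the 20% target hold-out (D_ta) described above.
# `load_adult` is a hypothetical loader for the preprocessed Adult data
# (112 features, binary label, binary sensitive attribute).
from sklearn.model_selection import train_test_split

X, y, s = load_adult()  # features, labels, sensitive attribute (assumed loader)

# Hold out 20% of samples as the target set D_ta; the rest trains the victim.
X_tr, X_ta, y_tr, y_ta, s_tr, s_ta = train_test_split(
    X, y, s, test_size=0.2, random_state=1  # seed choice is an assumption
)
```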
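The Experiment Setup row translates into a small PyTorch training loop. The sketch below assumes a generic adversarial FRL objective (a classifier predicts the label, a discriminator predicts the sensitive attribute from the representation); it mirrors the quoted sizes and hyperparameters but is not the exact objective of any of the four victim methods, and the synthetic tensors stand in for the preprocessed training split.

```python
# Sketch of the quoted victim setup: linear encoder to z in R^60,
# one-hidden-layer (width 50, ReLU) discriminator, linear classifier,
# AdaDelta with lr 0.1, batch size 512, 50 epochs, no reshuffling.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(1)  # victims were always initialized with seed 1

IN_DIM, Z_DIM = 112, 60  # Adult has 112 features; representation z in R^60

encoder = nn.Linear(IN_DIM, Z_DIM)
discriminator = nn.Sequential(nn.Linear(Z_DIM, 50), nn.ReLU(), nn.Linear(50, 1))
classifier = nn.Linear(Z_DIM, 1)

opt_adv = torch.optim.Adadelta(discriminator.parameters(), lr=0.1)
opt_enc = torch.optim.Adadelta(
    list(encoder.parameters()) + list(classifier.parameters()), lr=0.1)

# Synthetic stand-in for the preprocessed training split (X_tr, y_tr, s_tr).
N = 2048
X_tr = torch.randn(N, IN_DIM)
y_tr = torch.randint(0, 2, (N,)).float()  # task label
s_tr = torch.randint(0, 2, (N,)).float()  # sensitive attribute

# shuffle=False: training data is not reshuffled after each epoch,
# matching the gradient-matching protocol cited in the table.
loader = DataLoader(TensorDataset(X_tr, y_tr, s_tr), batch_size=512, shuffle=False)

bce = nn.BCEWithLogitsLoss()
for _ in range(50):  # 50 epochs
    for x, y, s in loader:
        z = encoder(x)
        # (1) adversary learns to predict s from a detached representation
        adv_loss = bce(discriminator(z.detach()).squeeze(1), s)
        opt_adv.zero_grad(); adv_loss.backward(); opt_adv.step()
        # (2) encoder + classifier predict y while fooling the adversary
        enc_loss = (bce(classifier(z).squeeze(1), y)
                    - bce(discriminator(z).squeeze(1), s))
        opt_enc.zero_grad(); enc_loss.backward(); opt_enc.step()
```

Alternating updates are used here in place of a gradient-reversal layer; either realizes the same min-max objective.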