Feature-Wise Bias Amplification
Authors: Klas Leino, Emily Black, Matt Fredrikson, Shayak Sen, Anupam Datta
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments on synthetic and real data demonstrate that these algorithms consistently lead to reduced bias without harming accuracy, in some cases eliminating predictive bias altogether while providing modest gains in accuracy. |
| Researcher Affiliation | Academia | Klas Leino, Matt Fredrikson, Emily Black, Shayak Sen, & Anupam Datta (Carnegie Mellon University) |
| Pseudocode | No | The paper describes algorithms (Feature parity, Experts) through textual explanation and mathematical equations (Equation 6, Equation 7) but does not provide structured pseudocode blocks. |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | "For example, for a VGG16 network trained on CelebA (Liu et al., 2015) to predict the attractive label, our approach removed 95% of the bias in predictions." and "We created a binary classification problem from CIFAR10 (Krizhevsky & Hinton, 2009) from the bird and frog classes." |
| Dataset Splits | No | "Logistic regression measurements were obtained by averaging over 20 pseudorandom training runs on a randomly-selected stratified train/test split. Experiments on deep networks use the training/test split provided by the respective dataset authors." (Explanation: The paper mentions train/test splits, but does not detail a separate validation split or explain how models were tuned beyond the implicit use of repeated training runs and the datasets' existing splits.) |
| Hardware Specification | No | The paper discusses software used for training (e.g., Keras 2 with Theano backend, scikit-learn's SGDClassifier estimator) but does not provide any specific details about the hardware specifications (e.g., GPU models, CPU types) used for running the experiments. |
| Software Dependencies | No | "For the logistic regression experiments, we used scikit-learn's SGDClassifier estimator to train each model using the logistic loss function. Logistic regression measurements were obtained by averaging over 20 pseudorandom training runs on a randomly-selected stratified train/test split. Experiments involving experts selected α, β using grid search over the possible values that minimize bias subject to not harming accuracy as described in Section 4. Similarly, experiments involving ℓ1 regularization use a grid search to select the regularization parameter, optimizing for the same criteria used to select α, β. Experiments on deep networks use the training/test split provided by the respective dataset authors. Models were trained until convergence using Keras 2 with the Theano backend." (Explanation: The paper names "scikit-learn" and "Keras 2 with the Theano backend", but gives no version numbers for scikit-learn or Theano, which are required for full reproducibility.) |
| Experiment Setup | Yes | "Experiments involving experts selected α, β using grid search over the possible values that minimize bias subject to not harming accuracy as described in Section 4. Similarly, experiments involving ℓ1 regularization use a grid search to select the regularization parameter, optimizing for the same criteria used to select α, β." |