Intra-Processing Methods for Debiasing Neural Networks
Authors: Yash Savani, Colin White, Naveen Sundar Govindarajulu
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate these methods across three popular datasets from the AIF360 toolkit, as well as on the CelebA faces dataset. In this section, we experimentally evaluate the techniques laid out in Section 4 compared to baselines, on three datasets and with multiple fairness measures. |
| Researcher Affiliation | Collaboration | Yash Savani Abacus.AI San Francisco, CA 94103 yash@abacus.ai Colin White Abacus.AI San Francisco, CA 94103 colin@abacus.ai Naveen Sundar Govindarajulu RAIR Lab RPI Troy, NY 12180 naveensundarg@gmail.com |
| Pseudocode | Yes | Algorithm 1: Random Perturbation; Algorithm 2: Layer-wise Optimization; Algorithm 3: Adversarial Fine-Tuning |
| Open Source Code | Yes | Our code is available at https://github.com/abacusai/intraprocessing_debiasing. To promote reproducibility, we release our code at https://github.com/abacusai/intraprocessing_debiasing and we use datasets from the AIF360 toolkit [5] and a popular image dataset. |
| Open Datasets | Yes | We evaluate these methods across three popular datasets from the AIF360 toolkit, as well as on the CelebA faces dataset. We run experiments with three fairness datasets from AIF360 [5], as well as the CelebA dataset [39]. |
| Dataset Splits | Yes | Given a dataset split into three parts, D_train, D_valid, D_test. An in-processing debiasing algorithm takes as input the training and validation datasets and outputs a model f which seeks to maximize φ_{µ,ρ,ϵ}. An intra-processing algorithm takes in the validation dataset and a trained model f with weights θ. |
| Hardware Specification | No | The paper mentions models trained for 'dozens of GPU hours' but does not specify the hardware used for their own experiments (e.g., specific GPU models, CPUs, or cloud instances). |
| Software Dependencies | No | The paper mentions software such as the AIF360 toolkit, but it does not specify version numbers or provide a complete list of software dependencies. |
| Experiment Setup | Yes | Our initial model consists of a feed-forward neural network with 10 fully-connected layers of size 32, with a Batch Norm layer between each fully-connected layer, and a dropout fraction of 0.2. The model is trained with the Adam optimizer and an early-stopping patience of 100 epochs. The loss function is the binary cross-entropy loss. We use the validation data as the input for the intra-processing methods, with the objective function set to Equation 1 with ϵ = 0.05. We modified the hyperparameters so that each method took roughly 30 minutes. (An illustrative sketch of this setup follows the table.) |
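
The "Experiment Setup" row describes the base classifier precisely enough to reconstruct it in code. Below is a minimal PyTorch sketch of that architecture and training loop; the input dimension, ReLU activation, learning rate, and data-loader plumbing are assumptions not taken from the paper, while the 10 fully-connected layers of width 32, the Batch Norm layers, the 0.2 dropout, the Adam optimizer, the binary cross-entropy loss, and the early-stopping patience of 100 epochs follow the quoted setup.

```python
# Sketch of the base classifier from the "Experiment Setup" row (assumed PyTorch).
# Architecture, optimizer, loss, and patience follow the paper's description;
# input_dim, ReLU, learning rate, and the training-loop plumbing are assumptions.
import torch
import torch.nn as nn


def build_model(input_dim: int, width: int = 32, depth: int = 10, dropout: float = 0.2) -> nn.Sequential:
    # 10 fully-connected layers of size 32, Batch Norm between them, dropout 0.2.
    layers = [nn.Linear(input_dim, width), nn.BatchNorm1d(width), nn.ReLU(), nn.Dropout(dropout)]
    for _ in range(depth - 1):
        layers += [nn.Linear(width, width), nn.BatchNorm1d(width), nn.ReLU(), nn.Dropout(dropout)]
    layers += [nn.Linear(width, 1)]  # single logit for binary classification
    return nn.Sequential(*layers)


def train(model, train_loader, valid_loader, patience: int = 100, lr: float = 1e-3):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)  # lr is an assumed value
    criterion = nn.BCEWithLogitsLoss()  # binary cross-entropy applied to the logit
    best_valid, epochs_since_best = float("inf"), 0
    while epochs_since_best < patience:  # early stopping with a patience of 100 epochs
        model.train()
        for x, y in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(x).squeeze(-1), y.float())
            loss.backward()
            optimizer.step()
        model.eval()
        with torch.no_grad():
            valid_loss = sum(
                criterion(model(x).squeeze(-1), y.float()).item() for x, y in valid_loader
            ) / len(valid_loader)
        if valid_loss < best_valid:
            best_valid, epochs_since_best = valid_loss, 0
        else:
            epochs_since_best += 1
```

The trained model and the validation split would then be handed to the intra-processing methods (random perturbation, layer-wise optimization, adversarial fine-tuning), which optimize the fairness-constrained objective of Equation 1 with ϵ = 0.05 as described in the paper.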