Intra-Processing Methods for Debiasing Neural Networks

Authors: Yash Savani, Colin White, Naveen Sundar Govindarajulu

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate these methods across three popular datasets from the AIF360 toolkit, as well as on the CelebA faces dataset." "In this section, we experimentally evaluate the techniques laid out in Section 4 compared to baselines, on three datasets and with multiple fairness measures."
Researcher Affiliation | Collaboration | Yash Savani, Abacus.AI, San Francisco, CA 94103, yash@abacus.ai; Colin White, Abacus.AI, San Francisco, CA 94103, colin@abacus.ai; Naveen Sundar Govindarajulu, RAIR Lab, RPI, Troy, NY 12180, naveensundarg@gmail.com
Pseudocode | Yes | Algorithm 1: Random Perturbation; Algorithm 2: Layer-wise Optimization; Algorithm 3: Adversarial Fine-Tuning (a hedged sketch of the random-perturbation loop appears after the table).
Open Source Code | Yes | "Our code is available at https://github.com/abacusai/intraprocessing_debiasing." "To promote reproducibility, we release our code at https://github.com/abacusai/intraprocessing_debiasing and we use datasets from the AIF360 toolkit [5] and a popular image dataset."
Open Datasets | Yes | "We evaluate these methods across three popular datasets from the AIF360 toolkit, as well as on the CelebA faces dataset." "We run experiments with three fairness datasets from AIF360 [5], as well as the CelebA dataset [39]."
Dataset Splits | Yes | "Given a dataset split into three parts, D_train, D_valid, D_test ..." "An in-processing debiasing algorithm takes as input the training and validation datasets and outputs a model f which seeks to maximize φ_{µ,ρ,ε}." "An intra-processing algorithm takes in the validation dataset and a trained model f with weights θ ..." (the two interfaces are sketched after the table).
Hardware Specification | No | The paper mentions models trained for "dozens of GPU hours" but does not specify the hardware used for the authors' own experiments (e.g., specific GPU models, CPUs, or cloud instances).
Software Dependencies | No | The paper mentions software like
Experiment Setup | Yes | "Our initial model consists of a feed-forward neural network with 10 fully-connected layers of size 32, with a Batch Norm layer between each fully-connected layer, and a dropout fraction of 0.2. The model is trained with the Adam optimizer and an early-stopping patience of 100 epochs. The loss function is the binary cross-entropy loss." "We use the validation data as the input for the intra-processing methods, with the objective function set to Equation 1 with ε = 0.05. We modified the hyperparameters so that each method took roughly 30 minutes." (a PyTorch-style sketch of this setup follows the table).
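For the dataset-splits row above, a minimal sketch of the two interfaces the quoted text implies is given below. The function names, type hints, and argument order are illustrative assumptions, not the authors' API; only the inputs and outputs follow the quoted description.

```python
from typing import Callable
import torch.nn as nn
from torch.utils.data import Dataset

def in_processing_debias(d_train: Dataset, d_valid: Dataset,
                         objective: Callable) -> nn.Module:
    """Train a model from scratch on D_train, tuning against the
    bias-constrained objective phi_{mu,rho,eps} measured on D_valid."""
    ...

def intra_processing_debias(f: nn.Module, d_valid: Dataset,
                            objective: Callable) -> nn.Module:
    """Start from an already-trained model f (weights theta) and debias it
    using only D_valid; D_test is held out for the final evaluation."""
    ...
```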
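The pseudocode row names the paper's three intra-processing procedures. As an illustration of the simplest one, here is a hedged sketch of a random-perturbation loop, under the assumption that the objective φ_{µ,ρ,ε} rewards validation accuracy only while the bias measure stays below ε. The names (debias_objective, random_perturbation, bias_fn), the multiplicative Gaussian noise model, and the data-loader format are assumptions, not the authors' exact Algorithm 1.

```python
# Hypothetical sketch of a random-perturbation intra-processing loop.
import copy
import torch

def debias_objective(model, valid_loader, bias_fn, eps=0.05):
    """Assumed proxy for phi_{mu,rho,eps}: validation accuracy if the
    bias measure is within eps, otherwise the (negative) bias itself."""
    preds, labels, groups = [], [], []
    correct, total = 0, 0
    model.eval()
    with torch.no_grad():
        for x, y, g in valid_loader:                 # g: protected attribute
            p = (model(x).squeeze(-1) > 0).float()   # threshold the single logit
            correct += (p == y).sum().item()
            total += y.numel()
            preds.append(p); labels.append(y); groups.append(g)
    accuracy = correct / total
    bias = float(bias_fn(torch.cat(preds), torch.cat(labels), torch.cat(groups)))
    return accuracy if abs(bias) <= eps else -abs(bias)

def random_perturbation(model, valid_loader, bias_fn, iters=100, sigma=0.1, eps=0.05):
    """Try multiplicative Gaussian perturbations of the trained weights and
    keep the copy that scores best on the validation objective."""
    best = copy.deepcopy(model)
    best_score = debias_objective(best, valid_loader, bias_fn, eps)
    for _ in range(iters):
        candidate = copy.deepcopy(model)
        with torch.no_grad():
            for param in candidate.parameters():
                param.mul_(1.0 + sigma * torch.randn_like(param))
        score = debias_objective(candidate, valid_loader, bias_fn, eps)
        if score > best_score:
            best, best_score = candidate, score
    return best
```

Here bias_fn is any scalar bias measure (e.g., a demographic-parity or equal-odds gap) computed from predictions, labels, and group membership on the validation split.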
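For the experiment-setup row, a PyTorch-style reading of the described initial model might look like the following. The ReLU activations, exact layer ordering, and the num_features value are assumptions not stated in the quoted text.

```python
import torch
import torch.nn as nn

def build_initial_model(num_features, width=32, depth=10, p_drop=0.2):
    """Feed-forward net per the quoted setup: 10 fully-connected layers of
    size 32, a Batch Norm layer between fully-connected layers, dropout 0.2.
    Activation choice and layer ordering are assumptions."""
    layers, in_dim = [], num_features
    for _ in range(depth - 1):
        layers += [nn.Linear(in_dim, width), nn.BatchNorm1d(width),
                   nn.ReLU(), nn.Dropout(p_drop)]
        in_dim = width
    layers += [nn.Linear(in_dim, 1)]   # single logit for binary classification
    return nn.Sequential(*layers)

model = build_initial_model(num_features=18)        # 18 is a placeholder feature count
optimizer = torch.optim.Adam(model.parameters())    # Adam; default hyperparameters assumed
criterion = nn.BCEWithLogitsLoss()                  # binary cross-entropy on the logit
# Training would add early stopping with a patience of 100 epochs, per the quoted setup.
```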