Confidential-PROFITT: Confidential PROof of FaIr Training of Trees
Authors: Ali Shahin Shamsabadi, Sierra Calanda Wyllie, Nicholas Franzese, Natalie Dullerud, Sébastien Gambs, Nicolas Papernot, Xiao Wang, Adrian Weller
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show empirically that bounding the information gain of each node with respect to the sensitive attributes reduces the unfairness of the final tree. In extensive experiments on the COMPAS, Communities and Crime, Default Credit, and Adult datasets, we demonstrate that a company can use Confidential-PROFITT to certify the fairness of their decision tree to an auditor in less than 2 minutes, thus indicating the applicability of our approach. |
| Researcher Affiliation | Collaboration | Ali Shahin Shamsabadi (1), Sierra Wyllie (2,3), Nicholas Franzese (4), Natalie Dullerud (2,3), Sébastien Gambs* (5), Nicolas Papernot* (2,3), Xiao Wang* (4), Adrian Weller* (1,6). Affiliations: (1) The Alan Turing Institute, (2) University of Toronto, (3) Vector Institute, (4) Northwestern University, (5) Université du Québec à Montréal, (6) University of Cambridge |
| Pseudocode | Yes | Algorithm 1: Finding the best split for each node using our fair learning algorithm. Algorithm 2: ZK proof of demographic parity fair tree training. For equalized odds fair tree training see Appendix F. Algorithm 3: Recursively building decision tree. Algorithm 4: Finding the best (fairness-oblivious) split for each node. Algorithm 5: Zero-knowledge proof of equalized odds-aware tree training. Algorithm 6: ZK proof of fair training of a random forest. |
| Open Source Code | Yes | The code is available at https://github.com/cleverhans-lab/Confidential-PROFITT. |
| Open Datasets | Yes | We assess the performance of Confidential-PROFITT using four common datasets for fairness benchmarking: COMPAS (Angwin et al., 2016), Communities and Crime (Redmond, 2009), Adult Income (Adu, 1996), and Default Credit (Def, 2016). |
| Dataset Splits | Yes | We evaluate fairness and accuracy using Fairlearn (Bird et al., 2020) and SciPy (Virtanen et al., 2020) over a testing set using a train-test split of 75% : 25%. |
| Hardware Specification | Yes | We use EMP-toolkit (Wang et al., 2016) to implement our ZK protocol. EMP is written in C++ and offers efficient implementations of ZK protocols. This code base is used for timing results (benchmarking the efficiency of our ZK protocol) and conducted using two Amazon EC2 c6a.2xlarge machines to represent the prover and verifier. |
| Software Dependencies | No | The paper mentions "EMP-toolkit (Wang et al., 2016)", "JSAT (Raff, 2017)", "Fairlearn (Bird et al., 2020)", and "SciPy (Virtanen et al., 2020)" but only specifies the Java version for JSAT ("Java (v. 14.0.2)"). It does not provide specific versions for other libraries. |
| Experiment Setup | Yes | To evaluate Confidential-PROFITT, we train decision trees and random forests for 250 values of τ with 10 random seeds each. For each dataset, we set the height of the tree by observing test and training set results in a decision tree trained without fairness. We choose the smallest height that maintains accuracy without overfitting. These heights are reported for each dataset in Table 4. |
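The core idea quoted above, bounding the information gain of each split with respect to the sensitive attribute by a threshold τ, can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation (which runs inside a ZK proof via EMP-toolkit); the function names (`best_fair_split`, `information_gain`) and the exhaustive threshold scan are assumptions for clarity.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (bits) of a discrete label vector."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(values, mask):
    """Entropy reduction in `values` from splitting by boolean `mask`."""
    n = len(values)
    left, right = values[mask], values[~mask]
    if len(left) == 0 or len(right) == 0:
        return 0.0  # degenerate split: no information gained
    return entropy(values) - (len(left) / n * entropy(left)
                              + len(right) / n * entropy(right))

def best_fair_split(X, y, s, tau):
    """Return (gain, feature, threshold) maximizing gain on labels `y`,
    restricted to splits whose gain w.r.t. the sensitive attribute `s`
    is at most `tau` (the fairness bound)."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            mask = X[:, j] <= t
            if information_gain(s, mask) > tau:
                continue  # split reveals too much about the sensitive attribute
            gain = information_gain(y, mask)
            if best is None or gain > best[0]:
                best = (gain, j, t)
    return best
```

With τ = 0 only splits carrying no information about the sensitive attribute are admissible; raising τ trades fairness for accuracy, which matches the paper's sweep over 250 values of τ.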