Equal Opportunity in Online Classification with Partial Feedback
Authors: Yahav Bechavod, Katrina Ligett, Aaron Roth, Bo Waggoner, Zhiwei Steven Wu
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We study an online classification problem with partial feedback in which individuals arrive one at a time from a fixed but unknown distribution, and must be classified as positive or negative. Our algorithm only observes the true label of an individual if they are given a positive classification. This setting captures many classification problems for which fairness is a concern: for example, in criminal recidivism prediction, recidivism is only observed if the inmate is released; in lending applications, loan repayment is only observed if the loan is granted. We require that our algorithms satisfy common statistical fairness constraints (such as equalizing false positive or negative rates introduced as equal opportunity in [18]) at every round, with respect to the underlying distribution. We give upper and lower bounds characterizing the cost of this constraint in terms of the regret rate (and show that it is mild), and give an oracle efficient algorithm that achieves the upper bound. |
| Researcher Affiliation | Academia | Yahav Bechavod Hebrew University yahav.bechavod@cs.huji.ac.il Katrina Ligett Hebrew University katrina@cs.huji.ac.il Aaron Roth University of Pennsylvania aaroth@cis.upenn.edu Bo Waggoner University of Colorado bwag@colorado.edu Zhiwei Steven Wu University of Minnesota zsw@umn.edu |
| Pseudocode | Yes | We describe the feasibility problem solved at each step. The approach and analysis directly follow and extend those of [2]. In that work, the first step at each round is to compute the best policy so far, which lets us compute $\widehat{\mathrm{Reg}}_t(\pi)$ and $b_t(\pi)$ for any policy $\pi$. Here, our Fair CSC oracle only computes approximate solutions, and so we can only compute regret relative to the approximately best policy so far, which leads to corresponding approximations $\widetilde{\mathrm{Reg}}_t(\pi)$ and $\widetilde{b}_t(\pi)$. Then, our algorithm solves the same feasibility program (although a few more technicalities must be handled): given history $H_t$ (in the second phase) and minimum probability $\mu_t$, find a probability distribution $Q$ over $\Pi$ such that $\int_{\Pi} Q(\pi)\,\widetilde{b}_{t-1}(\pi)\,d\pi \le 4$ (Low regret) and $\forall \pi \in \Pi:\ \mathbb{E}_{x \sim H_{t-1}}\big[1 / Q^{\mu_t}(\pi(x) \mid x)\big] \le 4 + \widetilde{b}_{t-1}(\pi)$ (Low variance). Intuitively, the first constraint ensures that the estimated regret (based on historical data) of the solution is at most $O(1/t)$. The second constraint bounds the variance of the resulting IPS loss estimator for policies in $\Pi$, which in turn allows us to bound the deviation between the empirical regret and the true regret for each policy over time. Importantly, we impose a tighter variance constraint on policies that have lower empirical regret so far, which prioritizes their regret estimation. To solve the feasibility program using our Fair CSC oracle, we run a coordinate descent algorithm, similar to [2] (full description in Section B.3 as Algorithm 1). |
| Open Source Code | No | The full version of this paper is available at https://arxiv.org/abs/1902.02242. This link points to the paper itself on arXiv, not a code repository. The paper does not contain any other statements or links regarding the availability of open-source code. |
| Open Datasets | No | The paper is theoretical and does not specify the use of any publicly available or open datasets for empirical training or evaluation. It refers to an "unknown distribution D" and conceptual "collected data" during an exploration phase for theoretical analysis. |
| Dataset Splits | No | The paper is theoretical and does not describe any empirical evaluation or dataset splits (training, validation, test) for reproduction. It focuses on theoretical bounds and algorithm properties. |
| Hardware Specification | No | The paper is theoretical and does not describe any experiments that would require specific hardware. Therefore, no hardware specifications are mentioned. |
| Software Dependencies | No | The paper is theoretical and does not describe specific software dependencies with version numbers for implementation or experimentation. |
| Experiment Setup | No | The paper is theoretical and does not describe any experimental setup details such as hyperparameters, training configurations, or system-level settings, as it does not conduct empirical experiments. |
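The feasibility program quoted in the Pseudocode row can be made concrete for a finite policy class with binary actions. The sketch below is illustrative only: the paper releases no code, and the function names, the finite policy set, and the smoothing form $Q^{\mu}(a \mid x) = (1 - 2\mu)\,Q(a \mid x) + \mu$ (the two-action case of the smoothing in [2]) are assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the (Low regret) / (Low variance) feasibility check,
# specialized to a finite policy class with binary actions {0, 1}.
# The paper provides no code; names and the smoothing form are assumptions.

def smoothed_prob(Q, policies, x, action, mu):
    """Q^{mu}(action | x): probability that the mu-smoothed mixture Q
    plays `action` on context x. With two actions, mixing in mu of
    uniform exploration per action gives (1 - 2*mu) * Q(action | x) + mu."""
    p = sum(q for q, pi in zip(Q, policies) if pi(x) == action)
    return (1 - 2 * mu) * p + mu

def check_feasibility(Q, policies, b, history, mu):
    """Check both constraints of the feasibility program.

    Q        -- weights of a candidate distribution over `policies`
    b        -- per-policy rescaled empirical-regret bounds b_{t-1}(pi)
    history  -- contexts x observed so far (H_{t-1})
    mu       -- minimum probability mu_t
    """
    # (Low regret): sum_pi Q(pi) * b_{t-1}(pi) <= 4
    low_regret = sum(q * b_pi for q, b_pi in zip(Q, b)) <= 4
    # (Low variance): for every pi,
    # E_{x ~ history}[ 1 / Q^{mu}(pi(x) | x) ] <= 4 + b_{t-1}(pi)
    low_variance = all(
        sum(1.0 / smoothed_prob(Q, policies, x, pi(x), mu) for x in history)
        / len(history)
        <= 4 + b_pi
        for pi, b_pi in zip(policies, b)
    )
    return low_regret and low_variance

# Tiny usage example: two constant policies over dummy contexts.
always_pos = lambda x: 1
always_neg = lambda x: 0
policies = [always_pos, always_neg]
history = [0, 1, 2]

print(check_feasibility([0.5, 0.5], policies, [0.0, 0.0], history, mu=0.1))  # True
print(check_feasibility([1.0, 0.0], policies, [0.0, 5.0], history, mu=0.1))  # False
```

In the second call, `always_neg` receives zero weight, so its smoothed play probability is only $\mu = 0.1$ and the inverse-propensity term $1/0.1 = 10$ exceeds $4 + b = 9$: this is exactly the variance blow-up the second constraint rules out. The coordinate descent of Algorithm 1 (Section B.3) searches for a $Q$ satisfying both constraints rather than merely checking one.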