Metric-Fair Active Learning

Authors: Jie Shen, Nan Cui, Jing Wang

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this paper, we henceforth study metric-fair active learning of homogeneous halfspaces, and show that under the distribution-dependent PAC learning model, fairness and label efficiency can be achieved simultaneously. We further propose two extensions of our main results: 1) we show that it is possible to make the algorithm robust to the adversarial noise, one of the most challenging noise models in learning theory; and 2) it is possible to significantly improve the label complexity when the underlying halfspace is sparse.
Researcher Affiliation | Collaboration | (1) Department of Computer Science, Stevens Institute of Technology, Hoboken, New Jersey, USA; (2) Amazon, New York City, New York, USA.
Pseudocode | Yes | Algorithm 1: Active Learning of Halfspaces with Approximate Metric-Fairness ... Algorithm 2: Active Learning of Sparse Halfspaces with Approximate Metric-Fairness
Open Source Code | No | The paper does not include any explicit statements about releasing source code or provide links to a code repository.
Open Datasets | No | The paper is theoretical and focuses on a 'distribution-dependent PAC learning model' and properties of the 'marginal distribution D_X' or 'distribution D', rather than using a specific publicly available dataset for empirical training.
Dataset Splits | No | The paper is theoretical and does not describe empirical experiments with training, validation, or test dataset splits.
Hardware Specification | No | The paper is theoretical and focuses on algorithmic design and proofs, without performing empirical experiments that would require specific hardware. No hardware specifications are mentioned.
Software Dependencies | No | The paper is theoretical and does not describe computational experiments that would require specific software dependencies with version numbers.
Experiment Setup | Yes | Our algorithms proceed in phases. In each phase k, we set r_k = 2^(k−3), b_k = c·r_k, ρ_k = 2t·r_k, τ_k = κ·b_k, ... We will draw a set of instances T_k by calling EX_D for n_k times, where n_k = (K_1/α²)·t·log⁴(d/δ_k) ... We will then query the labels of some instances in T_k by calling EX^y_D for m_k times, where m_k = K_2·t·log³(d/(α·δ_k))·log d. ... Algorithm 1 ... Require: target classification error rate ϵ, failure probability δ, fairness metric function ζ(·,·), fairness error rate α. (A code sketch of this per-phase schedule follows the table.)
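
To make the quoted schedule concrete, here is a minimal sketch in Python of how the per-phase parameters might be computed. The function name `phase_schedule`, the default constants, and the exact parenthesization of n_k and m_k are assumptions recovered from the extracted text, not the paper's verbatim definitions; c, κ, K_1, K_2 stand in for unspecified absolute constants.

```python
import math

def phase_schedule(k, d, t, alpha, delta_k, c=1.0, kappa=1.0, K1=1.0, K2=1.0):
    """Hypothetical per-phase parameters for the phased active-learning loop.

    d is the ambient dimension, t a sparsity/complexity parameter, alpha
    the fairness error rate, and delta_k the per-phase failure budget.
    """
    r_k = 2.0 ** (k - 3)   # as quoted; a shrinking 2^-(k-3) may be intended
    b_k = c * r_k          # band width for localized label queries
    rho_k = 2 * t * r_k
    tau_k = kappa * b_k    # thresholding parameter
    # Unlabeled draws from EX_D and label queries to EX^y_D per phase.
    n_k = math.ceil(K1 / alpha**2 * t * math.log(d / delta_k) ** 4)
    m_k = math.ceil(K2 * t * math.log(d / (alpha * delta_k)) ** 3 * math.log(d))
    return r_k, b_k, rho_k, tau_k, n_k, m_k
```

In margin-based active learning of this kind, each phase typically restricts label queries to a band of width b_k around the current halfspace, which is why the label budget m_k can stay polylogarithmic in d while the heavier 1/α² dependence is carried by the unlabeled sample size n_k.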
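Algorithm 1 also takes a fairness metric ζ(·,·) and a fairness error rate α as inputs. As a rough illustration of the approximate metric-fairness notion these parameters govern, the sketch below audits a real-valued score on sampled pairs under the standard "similar individuals are treated similarly" reading; the helper `is_approx_metric_fair` and its exact inequality are assumptions for illustration, not the paper's formal definition.

```python
import numpy as np

def is_approx_metric_fair(f, zeta, X, alpha, gamma=0.0):
    """Hypothetical audit: f is judged approximately metric-fair if, on all
    but an alpha fraction of instance pairs, the disparity |f(x) - f(x')|
    exceeds zeta(x, x') by at most gamma."""
    n = len(X)
    violations, pairs = 0, 0
    for i in range(n):
        for j in range(i + 1, n):
            pairs += 1
            if abs(f(X[i]) - f(X[j])) > zeta(X[i], X[j]) + gamma:
                violations += 1
    return violations / max(pairs, 1) <= alpha

# Example: a homogeneous halfspace scored by its signed margin w . x,
# audited against Euclidean distance as an arbitrary choice of zeta.
rng = np.random.default_rng(0)
w = rng.standard_normal(5)
w /= np.linalg.norm(w)                     # unit-norm weight vector
X = rng.standard_normal((50, 5))
f = lambda x: float(np.dot(w, x))          # real-valued score; sign(f) is the label
zeta = lambda x, xp: float(np.linalg.norm(x - xp))
print(is_approx_metric_fair(f, zeta, X, alpha=0.05))  # True by Cauchy-Schwarz
```

The example passes because a unit-norm linear score is 1-Lipschitz with respect to Euclidean distance, so it never violates a Euclidean ζ; the regime the paper studies is learning such a halfspace from few labels while keeping the fairness violation rate within α.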