Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
# Fairness Transferability Subject to Bounded Distribution Shift

Authors: Yatong Chen, Reilly Raab, Jialu Wang, Yang Liu

NeurIPS 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we compare our theoretical bounds to deterministic models of distribution shift and against real-world data, finding that we are able to estimate fairness violation bounds in practice, even when simplifying assumptions are only approximately satisfied. |
| Researcher Affiliation | Academia | University of California, Santa Cruz EMAIL |
| Pseudocode | No | The paper presents mathematical formulations and theorems but does not include any pseudocode or algorithm blocks. |
| Open Source Code | Yes | The details for reproducing our experimental results can be found at https://github.com/UCSC-REAL/Fairness_Transferability. |
| Open Datasets | Yes | We use American Community Survey (ACS) data provided by the US Census Bureau [16]. We adopt the sampling and pre-processing approaches following the Folktables package provided by Ding et al. [13] to obtain 1,599,229 data points. |
| Dataset Splits | No | The paper states that models are trained on a "source distribution" and evaluated on a "target distribution" defined by geographic and temporal shifts (Section 7). However, it does not report specific percentages or counts describing how the 1,599,229 data points (or any subset thereof) were split into training, validation, and test sets for model building. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, or cloud computing instances with specifications) used to run the experiments. |
| Software Dependencies | No | The paper mentions using the "Folktables package" (Footnote 6), but it does not specify version numbers for this or any other software dependencies, libraries, or programming languages used in the experiments. |
| Experiment Setup | No | The paper mentions training "group-dependent, linear threshold classifiers" with "a range of thresholds τg and τh" (Section 7). However, it does not provide specific hyperparameter values such as learning rate, batch size, number of epochs, or optimizer settings necessary to reproduce the training process. |
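The Experiment Setup row notes that the paper trains "group-dependent, linear threshold classifiers" over a range of thresholds τg and τh without reporting training hyperparameters. The following is a minimal sketch of what such a classifier looks like; the weights, threshold values, and helper names are illustrative assumptions, not values taken from the paper.

```python
# Hedged sketch of a group-dependent linear threshold classifier:
# a shared linear score is compared against a per-group threshold
# (tau_g for group "g", tau_h for group "h"). All numeric values
# below are illustrative, not from the paper.

def linear_score(x, w, b):
    """Compute the linear score w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(x, group, w, b, thresholds):
    """Positive decision iff the score clears the group's threshold."""
    return int(linear_score(x, w, b) >= thresholds[group])

# Illustrative fixed weights and group-dependent thresholds.
w, b = [1.0, -0.5], 0.0
thresholds = {"g": 0.2, "h": 0.8}

x = [0.9, 0.4]                                  # score = 0.9 - 0.2 = 0.7
yhat_g = predict(x, "g", w, b, thresholds)      # 0.7 >= 0.2 -> 1
yhat_h = predict(x, "h", w, b, thresholds)      # 0.7 >= 0.8 -> 0
```

Sweeping τg and τh over a grid, as the paper describes, would amount to re-evaluating `predict` under different entries of `thresholds` and measuring the resulting fairness violations on source and target distributions.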