Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Fair Classifiers that Abstain without Harm
Authors: Tongxin Yin, Jean-Francois Ton, Ruocheng Guo, Yuanshun Yao, Mingyan Liu, Yang Liu
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We have carried out extensive experiments to demonstrate the benefits of our solution compared to strong existing baselines. |
| Researcher Affiliation | Collaboration | 1 University of Michigan 2 ByteDance Research EMAIL, EMAIL, EMAIL, EMAIL, EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1 Prediction Adjustment |
| Open Source Code | Yes | Code: https://github.com/tsy19/FAN |
| Open Datasets | Yes | We adopt three real-world datasets: Adult (Dua & Graff, 2017), Compas (Bellamy et al., 2018), and Law (Bellamy et al., 2018). |
| Dataset Splits | Yes | Table 5: Size of train, val, test data of each dataset. |
| Hardware Specification | Yes | We run the experiments on a single T100 GPU. |
| Software Dependencies | No | The paper mentions 'PyTorch' but does not specify its version or any other software dependencies with version numbers. |
| Experiment Setup | Yes | The architecture configuration for the Adult dataset consists of two layers, each with a dimension of 300. For the Compas and Law datasets, we employed two layers, each with a dimension of 100. A dropout layer with a dropout probability of 0.5 was applied between the two hidden layers. The Rectified Linear Unit (ReLU) function was used as the activation function. |
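
The architecture quoted in the Experiment Setup row can be written out as a layer plan. The following is a minimal sketch in plain Python, not the authors' code: the names `HIDDEN_DIM`, `Layer`, `layer_plan`, and `count_parameters` are hypothetical, the input and output dimensions are assumptions (the row does not state them), and placing ReLU before dropout is also an assumption. Only the hidden widths (300 for Adult, 100 for Compas and Law), the dropout probability of 0.5 between the two hidden layers, and the ReLU activation come from the quoted text.

```python
from dataclasses import dataclass
from typing import List

# Hidden widths quoted in the Experiment Setup row.
HIDDEN_DIM = {"adult": 300, "compas": 100, "law": 100}
# Dropout probability quoted in the Experiment Setup row.
DROPOUT_P = 0.5


@dataclass(frozen=True)
class Layer:
    """One entry in the layer plan (hypothetical helper type)."""
    kind: str          # "linear", "relu", or "dropout"
    in_dim: int = 0    # used by "linear" only
    out_dim: int = 0   # used by "linear" only
    p: float = 0.0     # used by "dropout" only


def layer_plan(dataset: str, in_dim: int, out_dim: int = 1) -> List[Layer]:
    """Return the layer sequence for a dataset.

    in_dim and out_dim are assumptions; the quoted row gives neither.
    """
    h = HIDDEN_DIM[dataset.lower()]
    return [
        Layer("linear", in_dim, h),      # first hidden layer
        Layer("relu"),
        Layer("dropout", p=DROPOUT_P),   # between the two hidden layers
        Layer("linear", h, h),           # second hidden layer
        Layer("relu"),
        Layer("linear", h, out_dim),     # output head (assumed)
    ]


def count_parameters(plan: List[Layer]) -> int:
    """Count weights plus biases over the linear layers."""
    return sum(
        layer.in_dim * layer.out_dim + layer.out_dim
        for layer in plan
        if layer.kind == "linear"
    )
```

Calling `layer_plan("adult", in_dim=10)` returns six entries with a first hidden width of 300; the value `in_dim=10` is a placeholder, since the real feature count depends on how each dataset is preprocessed.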