Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Costs and Benefits of Fair Regression

Author: Han Zhao

TMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "To verify these implications, we conduct experiments on a real-world benchmark dataset, the Law School dataset (Wightman, 1998), to present empirical results with various metrics."
Researcher Affiliation | Academia | "Han Zhao, Department of Computer Science, University of Illinois Urbana-Champaign"
Pseudocode | No | The paper describes the algorithm in prose (Section 3.4) and provides mathematical formulations, but it does not include a clearly labeled pseudocode block or algorithm figure.
Open Source Code | No | The paper contains no explicit statement about releasing source code for the described methodology and no link to a code repository; the only URL it provides points to a dataset, not code.
Open Datasets | Yes | "We conduct experiments on a real-world benchmark dataset, the Law School dataset (Wightman, 1998), to present empirical results with various metrics. We refer readers to Appendix B for further details about the dataset, our pre-processing pipeline and the models used in the experiments. The Law School dataset contains 1,823 records for law students who took the bar passage study for Law School Admission. ... We use the edited public version of the dataset which can be downloaded here: https://github.com/algowatchpenn/GerryFair/blob/master/dataset/lawschool.csv"
Dataset Splits | Yes | "We use 80 percent of the data as our training set and the rest 20 percent as the test set."
Hardware Specification | Yes | "All the experiments are performed on a Titan 1080 GPU."
Software Dependencies | No | The paper mentions software components such as a "three hidden-layer feed-forward network with ReLU activations", "MLP", "W-MLP", a "gradient descent-ascent algorithm", and "weight clipping", but it does not provide version numbers for these or any other key software dependencies.
Experiment Setup | Yes | "We fix the baseline model to be a three hidden-layer feed-forward network with ReLU activations. The number of units in each hidden layer is 50 and 20, respectively. The output layer corresponds to a linear regression model. ... We vary the coefficient τ for the adversarial loss between 0.1, 1.0, 5.0 and 10.0. ... we fix the coefficient τ = 0.01 and vary the value of ρ by changing the weight clipping value of the model parameters of both the adversary as well as the target predictor. More specifically, we vary the clipping value for the parameters between 0.01, 0.1, 1.0 and 10.0. ... Throughout the experiments, we fix the learning rate to be 1.0 and use the same networks as well as random seeds."
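The quoted split protocol (80 percent training, 20 percent test) on the 1,823-record Law School dataset can be sketched as below. This is a minimal illustration, not the authors' code: the feature matrix is a synthetic stand-in (the real CSV is linked in the Open Datasets row), and the feature width and random seed are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)  # assumed seed; the paper's seeds are not given

# Hypothetical stand-in for the 1,823-record Law School dataset.
n_records = 1823
X = rng.normal(size=(n_records, 10))  # 10 features is an assumption
y = rng.normal(size=n_records)

# 80/20 train/test split, as described in the paper.
perm = rng.permutation(n_records)
n_train = int(0.8 * n_records)
train_idx, test_idx = perm[:n_train], perm[n_train:]
X_train, X_test = X[train_idx], X[test_idx]
y_train, y_test = y[train_idx], y[test_idx]

print(len(train_idx), len(test_idx))  # 1458 365
```

With 1,823 records an 80 percent split leaves 1,458 training and 365 test examples.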
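The baseline architecture quoted in the Experiment Setup row (a feed-forward network with ReLU activations, hidden widths 50 and 20, and a linear regression output, with weight clipping applied to control ρ) can be sketched as follows. This is a hedged sketch, not the authors' implementation: the input width, initialization scale, and helper names (`forward`, `clip_weights`) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hidden widths 50 and 20 follow the quote; input width 10 and the
# initialization scale are assumptions for illustration.
sizes = [10, 50, 20, 1]
weights = [rng.normal(scale=0.1, size=(a, b)) for a, b in zip(sizes, sizes[1:])]
biases = [np.zeros(b) for b in sizes[1:]]

def forward(x):
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(h @ W + b, 0.0)  # ReLU hidden layers
    return h @ weights[-1] + biases[-1]  # linear output layer (regression)

def clip_weights(c):
    # Weight clipping as quoted: clamp every parameter into [-c, c];
    # the paper varies c over {0.01, 0.1, 1.0, 10.0} to control rho.
    for W in weights:
        np.clip(W, -c, c, out=W)

x = rng.normal(size=(4, 10))
print(forward(x).shape)  # (4, 1)
```

In the paper this predictor (and an adversary of the same family) would be trained with a gradient descent-ascent procedure, calling the clipping step after each update; that training loop is omitted here.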