Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
A Generic Framework for Conformal Fairness
Authors: Aditya Vadlamani, Anutam Srinivasan, Pranav Maneriker, Ali Payani, Srinivasan Parthasarathy
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments were conducted on graph and tabular datasets to demonstrate that the algorithm can control fairness-related gaps in addition to coverage, in line with theoretical expectations. |
| Researcher Affiliation | Collaboration | Aditya T. Vadlamani 1,*, Anutam Srinivasan 1,*, Pranav Maneriker 2, Ali Payani 3, Srinivasan Parthasarathy 1 — 1 The Ohio State University, 2 Dolby Laboratories, 3 Cisco Systems |
| Pseudocode | Yes | Algorithm 1 Conformal Fairness Framework |
| Open Source Code | Yes | In the interest of reproducibility, the source code for the CF Framework is provided in the supplementary material. The source code is available at https://github.com/AdityaVadlamani/conformal-fairness. |
| Open Datasets | Yes | Datasets: To evaluate the CF Framework, we used five multi-class datasets: Pokec-n (Takac & Zabovsky, 2012), Pokec-z (Takac & Zabovsky, 2012), Credit (Agarwal et al., 2021), ACSIncome (Ding et al., 2021), and ACSEducation (Ding et al., 2021) (see Table 2 for details). |
| Dataset Splits | Yes | For each dataset, we use a 30%/20%/25%/25% stratified split of the labeled points for Dtrain/Dvalid/Dcalib/Dtest. |
| Hardware Specification | Yes | All experiments were run on a single P100 GPU. |
| Software Dependencies | No | Hyperparameter tuning was done using Ray Tune (Liaw et al., 2018). The paper names a software tool (Ray Tune) but provides no version numbers for it or for any other libraries/frameworks used in the implementation. |
| Experiment Setup | Yes | Hyperparameter tuning was done using Ray Tune (Liaw et al., 2018). For the Pokec-n and Pokec-z datasets, hyperparameters for the base GNN models were tuned via random search using Table D1 for each model type (i.e., GCN, GAT, and GraphSAGE)... For the Credit, ACSIncome, and ACSEducation datasets, the base XGBoost models were tuned via random search using Table D2... For Credit, Pokec-n, and Pokec-z, we tune the hyperparameters for the CFGNN model via random search using Table D3... |
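The 30%/20%/25%/25% stratified split reported above (train/valid/calib/test) can be sketched as follows. This is an illustrative, self-contained implementation of a label-stratified split, not the authors' code; the function name `stratified_split` and the seeding scheme are assumptions for the example.

```python
import random
from collections import defaultdict

def stratified_split(labels, fractions=(0.30, 0.20, 0.25, 0.25), seed=0):
    """Partition indices into train/valid/calib/test, preserving per-label
    proportions. Hypothetical sketch of the 30%/20%/25%/25% stratified
    split described in the report; the paper's implementation may differ.
    """
    rng = random.Random(seed)

    # Group example indices by their class label.
    by_label = defaultdict(list)
    for idx, y in enumerate(labels):
        by_label[y].append(idx)

    # Within each label group, shuffle and cut at cumulative fractions.
    splits = [[] for _ in fractions]
    for idxs in by_label.values():
        rng.shuffle(idxs)
        start, cum = 0, 0.0
        for k, frac in enumerate(fractions):
            cum += frac
            end = round(cum * len(idxs))
            splits[k].extend(idxs[start:end])
            start = end
    return splits  # [train, valid, calib, test]

labels = [0] * 60 + [1] * 40
train, valid, calib, test = stratified_split(labels)
print(len(train), len(valid), len(calib), len(test))  # → 30 20 25 25
```

Because the cut points are computed per label group, each of the four partitions inherits the overall class proportions (here 60/40), which is what "stratified" guarantees.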